Namespace(genes='/oak/stanford/groups/akundaje/soumyak/refs/gencode/hg38/hg38.gencode.protein_coding.tss.bed', list='/oak/stanford/groups/akundaje/projects/neuro-variants/variant_summary/common/encode_2024/K562_bias/encode_2024.LV.mast_cell.H.mean.variant_scores.tsv', out_prefix='/oak/stanford/groups/akundaje/projects/neuro-variants/variant_annotations/common/encode_2024/K562_bias/encode_2024.LV.mast_cell.H', peaks='/oak/stanford/groups/akundaje/projects/neuro-variants/data/processed/encode_2024/peaks/overlap/encode_2024.LV.mast_cell.H.overlap.peaks.bed.gz', schema='chrombpnet')

          chr    pos    end allele1 allele2      variant_id
2339836  chr1  10176  10177       A       C  chr1:10177:A:C
2339837  chr1  10246  10247       T       C  chr1:10247:T:C
2339838  chr1  10247  10248       A       T  chr1:10248:A:T
2339839  chr1  10249  10250       A       C  chr1:10250:A:C
2339840  chr1  10256  10257       A       C  chr1:10257:A:C
Variants table shape: (6565133, 6)

annotating with closest genes

     0      1      2  3  4               5     6       7       8       9   10  \
0  chr1  10176  10177  A  C  chr1:10177:A:C  chr1   65419   65420   OR4F5   0   
1  chr1  10176  10177  A  C  chr1:10177:A:C  chr1  451677  451678  OR4F29   0   
2  chr1  10176  10177  A  C  chr1:10177:A:C  chr1  686653  686654  OR4F16   0   
3  chr1  10246  10247  T  C  chr1:10247:T:C  chr1   65419   65420   OR4F5   0   
4  chr1  10246  10247  T  C  chr1:10247:T:C  chr1  451677  451678  OR4F29   0   

  11                 12              13      14  
0  +  ENSG00000186092.7  protein_coding   55243  
1  -  ENSG00000284733.2  protein_coding  441501  
2  -  ENSG00000284662.2  protein_coding  676477  
3  +  ENSG00000186092.7  protein_coding   55173  
4  -  ENSG00000284733.2  protein_coding  441431  
Closest genes table shape: (19695399, 15)

annotating with peak overlap

    chr     pos     end allele1 allele2       variant_id
0  chr1   10621   10622       T       G   chr1:10622:T:G
1  chr1   10622   10623       T       C   chr1:10623:T:C
2  chr1  779046  779047       G       A  chr1:779047:G:A
3  chr1  817340  817341       A       G  chr1:817341:A:G
4  chr1  817513  817514       T       C  chr1:817514:T:C
Peak overlap table shape: (41700, 6)


     chr    pos allele1 allele2       variant_id  logfc.mean  logfc.mean.pval  \
0  chr10  10425       A       C  chr10:10425:A:C    0.003491         0.790202   
1  chr10  10559       G       A  chr10:10559:G:A   -0.035676         0.224042   
2  chr10  10904       C       G  chr10:10904:C:G   -0.052010         0.123764   
3  chr10  10980       G       C  chr10:10980:G:C    0.047413         0.133611   
4  chr10  10982       G       A  chr10:10982:G:A   -0.004187         0.566921   

   abs_logfc.mean  abs_logfc.mean.pval  jsd.mean  ...  \
0        0.004646             0.799752  0.007193  ...   
1        0.035676             0.216485  0.013143  ...   
2        0.052010             0.118854  0.010475  ...   
3        0.047413             0.139063  0.009636  ...   
4        0.011423             0.562682  0.008463  ...   

   logfc_x_jsd_x_active_allele_quantile.mean.pval  \
0                                        0.595177   
1                                        0.111813   
2                                        0.073039   
3                                        0.078887   
4                                        0.298199   

   abs_logfc_x_jsd_x_active_allele_quantile.mean  \
0                                       0.000020   
1                                       0.000370   
2                                       0.000455   
3                                       0.000391   
4                                       0.000081   

   abs_logfc_x_jsd_x_active_allele_quantile.mean.pval  closest_gene_1  \
0                                           0.604049            TUBB8   
1                                           0.108003            TUBB8   
2                                           0.070375            TUBB8   
3                                           0.081971            TUBB8   
4                                           0.295758            TUBB8   

   gene_distance_1  closest_gene_2  gene_distance_2  closest_gene_3  \
0            63738         ZMYND11           119664           DIP2C   
1            63604         ZMYND11           119530           DIP2C   
2            63259         ZMYND11           119185           DIP2C   
3            63183         ZMYND11           119109           DIP2C   
4            63181         ZMYND11           119107           DIP2C   

   gene_distance_3  peak_overlap  
0           679243         False  
1           679109         False  
2           678764         False  
3           678688         False  
4           678686         False  

[5 rows x 34 columns]
Annotation table shape: (6565133, 34)

DONE

Done Annotating

