Namespace(genes='/oak/stanford/groups/akundaje/soumyak/refs/gencode/hg38/hg38.gencode.protein_coding.tss.bed', list='/oak/stanford/groups/akundaje/projects/neuro-variants/variant_summary/common/encode_2024/K562_bias/encode_2024.LV.adipocyte.H.mean.variant_scores.tsv', out_prefix='/oak/stanford/groups/akundaje/projects/neuro-variants/variant_annotations/common/encode_2024/K562_bias/encode_2024.LV.adipocyte.H', peaks='/oak/stanford/groups/akundaje/projects/neuro-variants/data/processed/encode_2024/peaks/overlap/encode_2024.LV.adipocyte.H.overlap.peaks.bed.gz', schema='chrombpnet')

          chr    pos    end allele1 allele2      variant_id
2339836  chr1  10176  10177       A       C  chr1:10177:A:C
2339837  chr1  10246  10247       T       C  chr1:10247:T:C
2339838  chr1  10247  10248       A       T  chr1:10248:A:T
2339839  chr1  10249  10250       A       C  chr1:10250:A:C
2339840  chr1  10256  10257       A       C  chr1:10257:A:C
Variants table shape: (6565133, 6)

annotating with closest genes

     0      1      2  3  4               5     6       7       8       9   10  \
0  chr1  10176  10177  A  C  chr1:10177:A:C  chr1   65419   65420   OR4F5   0   
1  chr1  10176  10177  A  C  chr1:10177:A:C  chr1  451677  451678  OR4F29   0   
2  chr1  10176  10177  A  C  chr1:10177:A:C  chr1  686653  686654  OR4F16   0   
3  chr1  10246  10247  T  C  chr1:10247:T:C  chr1   65419   65420   OR4F5   0   
4  chr1  10246  10247  T  C  chr1:10247:T:C  chr1  451677  451678  OR4F29   0   

  11                 12              13      14  
0  +  ENSG00000186092.7  protein_coding   55243  
1  -  ENSG00000284733.2  protein_coding  441501  
2  -  ENSG00000284662.2  protein_coding  676477  
3  +  ENSG00000186092.7  protein_coding   55173  
4  -  ENSG00000284733.2  protein_coding  441431  
Closest genes table shape: (19695399, 15)

annotating with peak overlap

    chr      pos      end allele1 allele2        variant_id
0  chr1   817006   817007       A       C   chr1:817007:A:C
1  chr1   817185   817186       G       A   chr1:817186:G:A
2  chr1   817340   817341       A       G   chr1:817341:A:G
3  chr1   817513   817514       T       C   chr1:817514:T:C
4  chr1  1000078  1000079       A       G  chr1:1000079:A:G
Peak overlap table shape: (13824, 6)


     chr    pos allele1 allele2       variant_id  logfc.mean  logfc.mean.pval  \
0  chr10  10425       A       C  chr10:10425:A:C    0.003013         0.531010   
1  chr10  10559       G       A  chr10:10559:G:A   -0.007400         0.173187   
2  chr10  10904       C       G  chr10:10904:C:G   -0.008333         0.147322   
3  chr10  10980       G       C  chr10:10980:G:C   -0.003346         0.489069   
4  chr10  10982       G       A  chr10:10982:G:A   -0.000129         0.664426   

   abs_logfc.mean  abs_logfc.mean.pval  jsd.mean  ...  \
0        0.003013             0.548033  0.005727  ...   
1        0.007400             0.166490  0.014610  ...   
2        0.008333             0.141404  0.016449  ...   
3        0.003362             0.477695  0.009726  ...   
4        0.001810             0.661127  0.007433  ...   

   logfc_x_jsd_x_active_allele_quantile.mean.pval  \
0                                        0.522545   
1                                        0.107217   
2                                        0.047391   
3                                        0.249319   
4                                        0.372551   

   abs_logfc_x_jsd_x_active_allele_quantile.mean  \
0                                       0.000005   
1                                       0.000050   
2                                       0.000102   
3                                       0.000025   
4                                       0.000011   

   abs_logfc_x_jsd_x_active_allele_quantile.mean.pval  closest_gene_1  \
0                                           0.541026            TUBB8   
1                                           0.101830            TUBB8   
2                                           0.044877            TUBB8   
3                                           0.241403            TUBB8   
4                                           0.370679            TUBB8   

   gene_distance_1  closest_gene_2  gene_distance_2  closest_gene_3  \
0            63738         ZMYND11           119664           DIP2C   
1            63604         ZMYND11           119530           DIP2C   
2            63259         ZMYND11           119185           DIP2C   
3            63183         ZMYND11           119109           DIP2C   
4            63181         ZMYND11           119107           DIP2C   

   gene_distance_3  peak_overlap  
0           679243         False  
1           679109         False  
2           678764         False  
3           678688         False  
4           678686         False  

[5 rows x 34 columns]
Annotation table shape: (6565133, 34)

DONE

Done Annotating

