Namespace(genes='/oak/stanford/groups/akundaje/soumyak/refs/gencode/hg38/hg38.gencode.protein_coding.tss.bed', list='/oak/stanford/groups/akundaje/projects/neuro-variants/variant_summary/common/encode_2024/K562_bias/encode_2024.LV.lymphocyte.H.mean.variant_scores.tsv', out_prefix='/oak/stanford/groups/akundaje/projects/neuro-variants/variant_annotations/common/encode_2024/K562_bias/encode_2024.LV.lymphocyte.H', peaks='/oak/stanford/groups/akundaje/projects/neuro-variants/data/processed/encode_2024/peaks/overlap/encode_2024.LV.lymphocyte.H.overlap.peaks.bed.gz', schema='chrombpnet')

          chr    pos    end allele1 allele2      variant_id
2339836  chr1  10176  10177       A       C  chr1:10177:A:C
2339837  chr1  10246  10247       T       C  chr1:10247:T:C
2339838  chr1  10247  10248       A       T  chr1:10248:A:T
2339839  chr1  10249  10250       A       C  chr1:10250:A:C
2339840  chr1  10256  10257       A       C  chr1:10257:A:C
Variants table shape: (6565133, 6)

annotating with closest genes

     0      1      2  3  4               5     6       7       8       9   10  \
0  chr1  10176  10177  A  C  chr1:10177:A:C  chr1   65419   65420   OR4F5   0   
1  chr1  10176  10177  A  C  chr1:10177:A:C  chr1  451677  451678  OR4F29   0   
2  chr1  10176  10177  A  C  chr1:10177:A:C  chr1  686653  686654  OR4F16   0   
3  chr1  10246  10247  T  C  chr1:10247:T:C  chr1   65419   65420   OR4F5   0   
4  chr1  10246  10247  T  C  chr1:10247:T:C  chr1  451677  451678  OR4F29   0   

  11                 12              13      14  
0  +  ENSG00000186092.7  protein_coding   55243  
1  -  ENSG00000284733.2  protein_coding  441501  
2  -  ENSG00000284662.2  protein_coding  676477  
3  +  ENSG00000186092.7  protein_coding   55173  
4  -  ENSG00000284733.2  protein_coding  441431  
Closest genes table shape: (19695399, 15)

annotating with peak overlap

    chr     pos     end allele1 allele2       variant_id
0  chr1  181021  181022       C       T  chr1:181022:C:T
1  chr1  181112  181113       A       G  chr1:181113:A:G
2  chr1  779046  779047       G       A  chr1:779047:G:A
3  chr1  794298  794299       C       G  chr1:794299:C:G
4  chr1  827251  827252       T       A  chr1:827252:T:A
Peak overlap table shape: (31395, 6)


     chr    pos allele1 allele2       variant_id  logfc.mean  logfc.mean.pval  \
0  chr10  10425       A       C  chr10:10425:A:C   -0.004096         0.691101   
1  chr10  10559       G       A  chr10:10559:G:A   -0.029807         0.165440   
2  chr10  10904       C       G  chr10:10904:C:G   -0.041647         0.097548   
3  chr10  10980       G       C  chr10:10980:G:C    0.045037         0.074225   
4  chr10  10982       G       A  chr10:10982:G:A   -0.044970         0.085617   

   abs_logfc.mean  abs_logfc.mean.pval  jsd.mean  ...  \
0        0.005783             0.679980  0.005112  ...   
1        0.029807             0.158048  0.011798  ...   
2        0.041647             0.092995  0.012619  ...   
3        0.045037             0.078008  0.009986  ...   
4        0.044970             0.081527  0.013551  ...   

   logfc_x_jsd_x_active_allele_quantile.mean.pval  \
0                                        0.519076   
1                                        0.083025   
2                                        0.048038   
3                                        0.049621   
4                                        0.043279   

   abs_logfc_x_jsd_x_active_allele_quantile.mean  \
0                                       0.000018   
1                                       0.000253   
2                                       0.000457   
3                                       0.000399   
4                                       0.000551   

   abs_logfc_x_jsd_x_active_allele_quantile.mean.pval  closest_gene_1  \
0                                           0.509564            TUBB8   
1                                           0.078996            TUBB8   
2                                           0.045586            TUBB8   
3                                           0.052259            TUBB8   
4                                           0.041107            TUBB8   

   gene_distance_1  closest_gene_2  gene_distance_2  closest_gene_3  \
0            63738         ZMYND11           119664           DIP2C   
1            63604         ZMYND11           119530           DIP2C   
2            63259         ZMYND11           119185           DIP2C   
3            63183         ZMYND11           119109           DIP2C   
4            63181         ZMYND11           119107           DIP2C   

   gene_distance_3  peak_overlap  
0           679243         False  
1           679109         False  
2           678764         False  
3           678688         False  
4           678686         False  

[5 rows x 34 columns]
Annotation table shape: (6565133, 34)

DONE

Done Annotating

