Namespace(genes='/oak/stanford/groups/akundaje/soumyak/refs/gencode/hg38/hg38.gencode.protein_coding.tss.bed', list='/oak/stanford/groups/akundaje/projects/neuro-variants/variant_summary/rare/encode_2024/K562_bias/encode_2024.LV.endothelial_cell.H.mean.variant_scores.tsv', out_prefix='/oak/stanford/groups/akundaje/projects/neuro-variants/variant_annotations/rare/encode_2024/K562_bias/encode_2024.LV.endothelial_cell.H', peaks='/oak/stanford/groups/akundaje/projects/neuro-variants/data/processed/encode_2024/peaks/overlap/encode_2024.LV.endothelial_cell.H.overlap.peaks.bed.gz', schema='chrombpnet')

          chr    pos    end allele1 allele2      variant_id
3230544  chr1  10126  10127       A       T  chr1:10127:A:T
3230545  chr1  10127  10128       A       G  chr1:10128:A:G
3230546  chr1  10133  10134       A       G  chr1:10134:A:G
3230547  chr1  10451  10452       A       G  chr1:10452:A:G
3230548  chr1  10452  10453       C       A  chr1:10453:C:A
Variants table shape: (9126367, 6)

annotating with closest genes

     0      1      2  3  4               5     6       7       8       9   10  \
0  chr1  10126  10127  A  T  chr1:10127:A:T  chr1   65419   65420   OR4F5   0   
1  chr1  10126  10127  A  T  chr1:10127:A:T  chr1  451677  451678  OR4F29   0   
2  chr1  10126  10127  A  T  chr1:10127:A:T  chr1  686653  686654  OR4F16   0   
3  chr1  10127  10128  A  G  chr1:10128:A:G  chr1   65419   65420   OR4F5   0   
4  chr1  10127  10128  A  G  chr1:10128:A:G  chr1  451677  451678  OR4F29   0   

  11                 12              13      14  
0  +  ENSG00000186092.7  protein_coding   55293  
1  -  ENSG00000284733.2  protein_coding  441551  
2  -  ENSG00000284662.2  protein_coding  676527  
3  +  ENSG00000186092.7  protein_coding   55292  
4  -  ENSG00000284733.2  protein_coding  441550  
Closest genes table shape: (27379101, 15)

annotating with peak overlap

    chr    pos    end allele1 allele2      variant_id
0  chr1  10451  10452       A       G  chr1:10452:A:G
1  chr1  10452  10453       C       A  chr1:10453:C:A
2  chr1  10456  10457       A       G  chr1:10457:A:G
3  chr1  10461  10462       T       G  chr1:10462:T:G
4  chr1  10484  10485       G       A  chr1:10485:G:A
Peak overlap table shape: (169543, 6)


     chr    pos allele1 allele2       variant_id  logfc.mean  logfc.mean.pval  \
0  chr10  10394       A       T  chr10:10394:A:T    0.011130         0.448530   
1  chr10  10418       A       C  chr10:10418:A:C   -0.011252         0.490913   
2  chr10  10532       G       A  chr10:10532:G:A    0.013238         0.626457   
3  chr10  10573       G       A  chr10:10573:G:A   -0.016929         0.594241   
4  chr10  10663       G       C  chr10:10663:G:C    0.123258         0.069486   

   abs_logfc.mean  abs_logfc.mean.pval  jsd.mean  ...  \
0        0.023667             0.458913  0.009125  ...   
1        0.023866             0.484475  0.009785  ...   
2        0.013238             0.647218  0.008381  ...   
3        0.016929             0.581884  0.010019  ...   
4        0.123258             0.073474  0.026175  ...   

   logfc_x_jsd_x_active_allele_quantile.mean.pval  \
0                                        0.123885   
1                                        0.138863   
2                                        0.162066   
3                                        0.183019   
4                                        0.011900   

   abs_logfc_x_jsd_x_active_allele_quantile.mean  \
0                                       0.000135   
1                                       0.000124   
2                                       0.000062   
3                                       0.000097   
4                                       0.002362   

   abs_logfc_x_jsd_x_active_allele_quantile.mean.pval  closest_gene_1  \
0                                           0.128215            TUBB8   
1                                           0.136541            TUBB8   
2                                           0.171475            TUBB8   
3                                           0.174280            TUBB8   
4                                           0.012745            TUBB8   

   gene_distance_1  closest_gene_2  gene_distance_2  closest_gene_3  \
0            63769         ZMYND11           119695           DIP2C   
1            63745         ZMYND11           119671           DIP2C   
2            63631         ZMYND11           119557           DIP2C   
3            63590         ZMYND11           119516           DIP2C   
4            63500         ZMYND11           119426           DIP2C   

   gene_distance_3  peak_overlap  
0           679274         False  
1           679250         False  
2           679136         False  
3           679095         False  
4           679005         False  

[5 rows x 34 columns]
Annotation table shape: (9126367, 34)

DONE

Done Annotating

