Namespace(genes='/oak/stanford/groups/akundaje/soumyak/refs/gencode/hg38/hg38.gencode.protein_coding.tss.bed', list='/oak/stanford/groups/akundaje/projects/neuro-variants/variant_summary/common/encode_2024/K562_bias/encode_2024.LV.endothelial_cell.H.mean.variant_scores.tsv', out_prefix='/oak/stanford/groups/akundaje/projects/neuro-variants/variant_annotations/common/encode_2024/K562_bias/encode_2024.LV.endothelial_cell.H', peaks='/oak/stanford/groups/akundaje/projects/neuro-variants/data/processed/encode_2024/peaks/overlap/encode_2024.LV.endothelial_cell.H.overlap.peaks.bed.gz', schema='chrombpnet')

          chr    pos    end allele1 allele2      variant_id
2339836  chr1  10176  10177       A       C  chr1:10177:A:C
2339837  chr1  10246  10247       T       C  chr1:10247:T:C
2339838  chr1  10247  10248       A       T  chr1:10248:A:T
2339839  chr1  10249  10250       A       C  chr1:10250:A:C
2339840  chr1  10256  10257       A       C  chr1:10257:A:C
Variants table shape: (6565133, 6)

annotating with closest genes

     0      1      2  3  4               5     6       7       8       9   10  \
0  chr1  10176  10177  A  C  chr1:10177:A:C  chr1   65419   65420   OR4F5   0   
1  chr1  10176  10177  A  C  chr1:10177:A:C  chr1  451677  451678  OR4F29   0   
2  chr1  10176  10177  A  C  chr1:10177:A:C  chr1  686653  686654  OR4F16   0   
3  chr1  10246  10247  T  C  chr1:10247:T:C  chr1   65419   65420   OR4F5   0   
4  chr1  10246  10247  T  C  chr1:10247:T:C  chr1  451677  451678  OR4F29   0   

  11                 12              13      14  
0  +  ENSG00000186092.7  protein_coding   55243  
1  -  ENSG00000284733.2  protein_coding  441501  
2  -  ENSG00000284662.2  protein_coding  676477  
3  +  ENSG00000186092.7  protein_coding   55173  
4  -  ENSG00000284733.2  protein_coding  441431  
Closest genes table shape: (19695399, 15)

annotating with peak overlap

    chr     pos     end allele1 allele2       variant_id
0  chr1   10491   10492       C       T   chr1:10492:C:T
1  chr1   10621   10622       T       G   chr1:10622:T:G
2  chr1   10622   10623       T       C   chr1:10623:T:C
3  chr1  181021  181022       C       T  chr1:181022:C:T
4  chr1  181112  181113       A       G  chr1:181113:A:G
Peak overlap table shape: (100162, 6)


     chr    pos allele1 allele2       variant_id  logfc.mean  logfc.mean.pval  \
0  chr10  10425       A       C  chr10:10425:A:C   -0.008337         0.575678   
1  chr10  10559       G       A  chr10:10559:G:A   -0.002142         0.550232   
2  chr10  10904       C       G  chr10:10904:C:G   -0.020726         0.508342   
3  chr10  10980       G       C  chr10:10980:G:C    0.041367         0.286329   
4  chr10  10982       G       A  chr10:10982:G:A    0.003445         0.454177   

   abs_logfc.mean  abs_logfc.mean.pval  jsd.mean  ...  \
0        0.017229             0.572053  0.012037  ...   
1        0.016713             0.548432  0.011517  ...   
2        0.020726             0.501633  0.012018  ...   
3        0.041367             0.291265  0.012274  ...   
4        0.023642             0.454320  0.007906  ...   

   logfc_x_jsd_x_active_allele_quantile.mean.pval  \
0                                        0.156452   
1                                        0.130315   
2                                        0.108448   
3                                        0.053184   
4                                        0.109024   

   abs_logfc_x_jsd_x_active_allele_quantile.mean  \
0                                       0.000125   
1                                       0.000118   
2                                       0.000200   
3                                       0.000439   
4                                       0.000141   

   abs_logfc_x_jsd_x_active_allele_quantile.mean.pval  closest_gene_1  \
0                                           0.154898            TUBB8   
1                                           0.129843            TUBB8   
2                                           0.106303            TUBB8   
3                                           0.054481            TUBB8   
4                                           0.109097            TUBB8   

   gene_distance_1  closest_gene_2  gene_distance_2  closest_gene_3  \
0            63738         ZMYND11           119664           DIP2C   
1            63604         ZMYND11           119530           DIP2C   
2            63259         ZMYND11           119185           DIP2C   
3            63183         ZMYND11           119109           DIP2C   
4            63181         ZMYND11           119107           DIP2C   

   gene_distance_3  peak_overlap  
0           679243         False  
1           679109         False  
2           678764         False  
3           678688         False  
4           678686         False  

[5 rows x 34 columns]
Annotation table shape: (6565133, 34)

DONE

Done Annotating

