In [1]:
# Parameters
data_name = "immatureB_SPLND;CD43-_CD11b-;D;En"
modisco_root = "/srv/scratch/msharmin/mouse_hem/with_tfd/full_mouse50/Naive_modisco2019"
tomtom_report_root = "http://mitra.stanford.edu/kundaje/msharmin/report/tomtom_outs/cells"
task_dir = "task_242-naivegw"
perf_file = "/srv/scratch/msharmin/mouse_hem/with_tfd/full_mouse50/fineFactorized/task_242-naivegw/NaiveauPRC.txt"
homer_root = "/srv/scratch/msharmin/mouse_hem/with_tfd/full_mouse50/Naive_scans"
reportfile = "/mnt/lab_data/kundaje/msharmin/annotations/filtering samples_MS2.xlsx"
sheetname = "filter23"
In [2]:
from matlas.reports import display_metadata
load data from labcluster
Using TensorFlow backend.
2019-08-30 19:34:47,506 [WARNING] git-lfs not installed
In [3]:
display_metadata(data_name, perf_file, reportfile, sheetname)
    Sample Information
    MetaData Name     Description
    Cell type         mouse spleen B cells, CD43-,CD11b- (immature B cells)
    Cell Group        B cells
    Experiment Name   DHS
    Experiment Group  ENCODE
    Pipeline Output
    Metric                                                             rep1       rep2      rep3      rep4
    Naïve overlap peaks                                                128596     128596    128596    128596
    IDR peaks                                                          95629      95629     95629     95629
    TSS enrichment (< 8 is very poor, < 10 is low)                     7.0455     17.3411   13.7848   18.6816
    Final number of unique mapping, dup-filtered, chrM filtered reads  180923266  22166102  26076648  24991902
    Number of reads in called peak regions                             NA         4075310   4512348   6075672
    Fraction of reads in called peak regions                           NA         0.184     0.1732    0.2433
    Number of reads in promoter regions                                NA         3084550   3563439   4463929
    Fraction of reads in promoter regions                              NA         0.1393    0.1367    0.1787
    Number of reads in enhancer regions                                NA         7952677   9460004   9141605
    Fraction of reads in enhancer regions                              NA         0.3591    0.363     0.366
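The TSS enrichment column header gives two cut-offs (below 8 is very poor, below 10 is low). A minimal sketch of applying them to the four replicates follows. The helper `tss_quality` and the label "acceptable" are hypothetical and are not part of `matlas`; the scores are the values as read from the exported table.

```python
# Cut-offs come from the column header: "< 8 is very poor, < 10 is low".
def tss_quality(score):
    """Hypothetical helper: map a TSS enrichment score to a quality label."""
    if score < 8:
        return "very poor"
    if score < 10:
        return "low"
    # Assumption: the header names no label for scores of 10 or more.
    return "acceptable"

# TSS enrichment per replicate, as read from the Pipeline Output table.
tss_enrichment = {"rep1": 7.0455, "rep2": 17.3411, "rep3": 13.7848, "rep4": 18.6816}

labels = {rep: tss_quality(score) for rep, score in tss_enrichment.items()}
for rep, label in labels.items():
    print(rep, label)
```

Under these cut-offs only rep1 falls in the "very poor" band.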
    Modelling Metadata
    Metric                                    Value
    auPRC                                     0.6259
    Calibrated Recall at 50% FDR              0.198
    Number of Positive Examples in Test Data  107961
    Number of Negative Examples in Test Data  7962890
    Imbalance Ratio in Test Data              0.0134
    Test Chromosomes                          chr2, chr3, chr19
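The report does not state how the imbalance ratio is defined. As a quick check, the sketch below computes two common candidates from the reported example counts; the variable names are illustrative only. The reported value of 0.0134 appears consistent with positives divided by all test examples rather than positives divided by negatives.

```python
# Example counts copied from the Modelling Metadata table.
n_pos = 107961
n_neg = 7962890

# Candidate 1: fraction of positives among all test examples.
ratio_total = n_pos / (n_pos + n_neg)

# Candidate 2: positives per negative example.
ratio_neg = n_pos / n_neg

print(round(ratio_total, 4))  # 0.0134
print(round(ratio_neg, 4))    # 0.0136
```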
In [4]:
from matlas.reports import display_paiwise_pattern_comparison
from matlas.reports import display_denovo_patterns
In [5]:
display_denovo_patterns(data_name, modisco_root=modisco_root)
TF-MoDISco is using the TensorFlow backend.
The two sections below list the de novo patterns and the corresponding motifs discovered by TF-MoDISco.
Denovo Patterns by TF-MoDISco: #6
(Logo plots shown per pattern in the notebook: Sequence, Contrib Scores, Hyp_Contrib Scores)
Pattern Name             # seqlets  TF Name(s)
metacluster_1/pattern_0  14582      Ctcf, Ctcfl
metacluster_1/pattern_1  2916       Spib, Irf8, Irf4, Spi1, Irf2, Irf1, Prdm1, Irf3, Elf5, Elf3,
                                    Spic, Stat1, Ets1, Ehf, Stat2, Zscan10, Etv2, Etv4, Etv6, Gabpa, Irf7,
                                    Erg, Elf1, Elk1, Elf4, Zkscan5, Elf2, Etv3, Fli1, XP_911724.4, Bcl6
metacluster_1/pattern_2  1801       (none listed)
metacluster_1/pattern_3  536        Zfp143, Thap11, Tbx2
metacluster_1/pattern_4  95         Nrf1, Sp2, Sp3, Zfx, Sp1, Klf3
metacluster_1/pattern_5  70         Rest
Motifs by TF-MoDISco: #43
(The Modisco logo column is an image in the notebook and is omitted here.)
TF Name      Pattern Name             Significance
Ctcf         metacluster_1/pattern_0  9.67664e-15
             metacluster_1/pattern_2  0.00868994
Ctcfl        metacluster_1/pattern_0  4.01851e-07
Spib         metacluster_1/pattern_1  2.55566e-11
Irf8         metacluster_1/pattern_1  4.82828e-09
Irf4         metacluster_1/pattern_1  7.080079999999999e-09
Spi1         metacluster_1/pattern_1  0.00143075
Irf2         metacluster_1/pattern_1  5.41265e-05
Irf1         metacluster_1/pattern_1  5.81146e-05
Prdm1        metacluster_1/pattern_1  0.000587692
Irf3         metacluster_1/pattern_1  0.0006764869999999999
Elf5         metacluster_1/pattern_1  0.00143075
Elf3         metacluster_1/pattern_1  0.0373221
Spic         metacluster_1/pattern_1  0.00604664
Stat1        metacluster_1/pattern_1  0.00311991
Ets1         metacluster_1/pattern_1  0.03548219999999999
Ehf          metacluster_1/pattern_1  0.00658403
Stat2        metacluster_1/pattern_1  0.00687338
Zscan10      metacluster_1/pattern_1  0.00838074
Etv2         metacluster_1/pattern_1  0.00852348
Etv4         metacluster_1/pattern_1  0.013817
Etv6         metacluster_1/pattern_1  0.0224686
Gabpa        metacluster_1/pattern_1  0.0393289
Irf7         metacluster_1/pattern_1  0.0224686
Erg          metacluster_1/pattern_1  0.039539099999999994
Elf1         metacluster_1/pattern_1  0.0230226
Elk1         metacluster_1/pattern_1  0.039539099999999994
Elf4         metacluster_1/pattern_1  0.031221199999999998
Zkscan5      metacluster_1/pattern_1  0.03548219999999999
Elf2         metacluster_1/pattern_1  0.03548219999999999
Etv3         metacluster_1/pattern_1  0.036382
Fli1         metacluster_1/pattern_1  0.039539099999999994
XP_911724.4  metacluster_1/pattern_1  0.039539099999999994
Bcl6         metacluster_1/pattern_1  0.0516013
Zfp143       metacluster_1/pattern_3  7.514889999999999e-20
Thap11       metacluster_1/pattern_3  1.64968e-18
Tbx2         metacluster_1/pattern_3  2.1343400000000002e-13
Nrf1         metacluster_1/pattern_4  6.23887e-07
Sp2          metacluster_1/pattern_4  0.000233996
Sp3          metacluster_1/pattern_4  0.00289895
Zfx          metacluster_1/pattern_4  0.0293327
Sp1          metacluster_1/pattern_4  0.010895
Klf3         metacluster_1/pattern_4  0.027145299999999997
Rest         metacluster_1/pattern_5  8.12203e-17
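One way to summarise the motif table is to keep, for each pattern, the TF with the smallest significance value. The sketch below does this for a subset of the rows above. It assumes that a smaller significance value means a stronger match, which the report does not state explicitly, and it is not how `matlas` itself ranks matches.

```python
# (TF name, pattern name, significance) rows copied from the report (subset).
matches = [
    ("Ctcf", "metacluster_1/pattern_0", 9.67664e-15),
    ("Ctcfl", "metacluster_1/pattern_0", 4.01851e-07),
    ("Spib", "metacluster_1/pattern_1", 2.55566e-11),
    ("Irf8", "metacluster_1/pattern_1", 4.82828e-09),
    ("Zfp143", "metacluster_1/pattern_3", 7.514889999999999e-20),
    ("Thap11", "metacluster_1/pattern_3", 1.64968e-18),
    ("Nrf1", "metacluster_1/pattern_4", 6.23887e-07),
    ("Sp2", "metacluster_1/pattern_4", 0.000233996),
    ("Rest", "metacluster_1/pattern_5", 8.12203e-17),
]

# Keep the row with the smallest significance value for each pattern.
# Assumption: smaller value = stronger match.
best = {}
for tf, pattern, sig in matches:
    if pattern not in best or sig < best[pattern][1]:
        best[pattern] = (tf, sig)

for pattern in sorted(best):
    tf, sig = best[pattern]
    print(pattern, tf, sig)
```

On this subset the top matches are Ctcf, Spib, Zfp143, Nrf1 and Rest for patterns 0, 1, 3, 4 and 5 respectively.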
In [6]:
display_paiwise_pattern_comparison(data_name, modisco_root, homer_root)
Number of CISBP TFs obtained by TF-MoDISco and Homer
Shared TFs between TF-MoDISco and Homer: #37
(Logo columns are images in the notebook and are omitted here.)
TF Name      Source   Pattern Name             Significance
Prdm1        Modisco  metacluster_1/pattern_1  0.000587692
             Homer    motif2.motif             0.00822922
             Homer    motif11.motif            0.00620464
Ctcf         Modisco  metacluster_1/pattern_0  9.67664e-15
             Modisco  metacluster_1/pattern_2  0.00868994
             Homer    motif1.motif             2.50558e-06
Stat2        Modisco  metacluster_1/pattern_1  0.00687338
             Homer    motif11.motif            0.055240599999999994
Erg          Modisco  metacluster_1/pattern_1  0.039539099999999994
             Homer    motif2.motif             9.689969999999999e-05
Etv3         Modisco  metacluster_1/pattern_1  0.036382
             Homer    motif2.motif             0.00011507600000000001
Zfp143       Modisco  metacluster_1/pattern_3  7.514889999999999e-20
             Homer    motif4.motif             3.07517e-07
Etv2         Modisco  metacluster_1/pattern_1  0.00852348
             Homer    motif2.motif             0.00011507600000000001
Elf3         Modisco  metacluster_1/pattern_1  0.0373221
             Homer    motif2.motif             0.00607041
Elf5         Modisco  metacluster_1/pattern_1  0.00143075
             Homer    motif2.motif             0.025990199999999998
Klf3         Modisco  metacluster_1/pattern_4  0.027145299999999997
             Homer    motif3.motif             0.000584509
Spi1         Modisco  metacluster_1/pattern_1  0.00143075
             Homer    motif2.motif             0.00035435800000000004
XP_911724.4  Modisco  metacluster_1/pattern_1  0.039539099999999994
             Homer    motif2.motif             0.000138799
Nrf1         Modisco  metacluster_1/pattern_4  6.23887e-07
             Homer    motif5.motif             7.76569e-06
Spic         Modisco  metacluster_1/pattern_1  0.00604664
             Homer    motif2.motif             0.000244864
Ctcfl        Modisco  metacluster_1/pattern_0  4.01851e-07
             Homer    motif1.motif             0.00014188
Irf1         Modisco  metacluster_1/pattern_1  5.81146e-05
             Homer    motif2.motif             0.013435599999999999
             Homer    motif11.motif            0.00113946
Irf8         Modisco  metacluster_1/pattern_1  4.82828e-09
             Homer    motif2.motif             0.000138799
             Homer    motif11.motif            0.000890062
Irf7         Modisco  metacluster_1/pattern_1  0.0224686
             Homer    motif11.motif            0.0144977
Elf4         Modisco  metacluster_1/pattern_1  0.031221199999999998
             Homer    motif2.motif             0.000304878
Sp3          Modisco  metacluster_1/pattern_4  0.00289895
             Homer    motif3.motif             0.000584509
Tbx2         Modisco  metacluster_1/pattern_3  2.1343400000000002e-13
             Homer    motif4.motif             2.73732e-06
Sp1          Modisco  metacluster_1/pattern_4  0.010895
             Homer    motif3.motif             0.00373183
Elk1         Modisco  metacluster_1/pattern_1  0.039539099999999994
             Homer    motif2.motif             0.000244864
Etv6         Modisco  metacluster_1/pattern_1  0.0224686
             Homer    motif2.motif             0.00358039
Gabpa        Modisco  metacluster_1/pattern_1  0.0393289
             Homer    motif2.motif             0.00276605
Thap11       Modisco  metacluster_1/pattern_3  1.64968e-18
             Homer    motif4.motif             3.07517e-07
Spib         Modisco  metacluster_1/pattern_1  2.55566e-11
             Homer    motif2.motif             5.85385e-05
             Homer    motif11.motif            0.0116335
Sp2          Modisco  metacluster_1/pattern_4  0.000233996
             Homer    motif3.motif             0.000584509
Elf1         Modisco  metacluster_1/pattern_1  0.0230226
             Homer    motif2.motif             0.000120535
Irf2         Modisco  metacluster_1/pattern_1  5.41265e-05
             Homer    motif2.motif             0.00707709
             Homer    motif11.motif            0.000946883
Elf2         Modisco  metacluster_1/pattern_1  0.03548219999999999
             Homer    motif2.motif             0.0230769
Etv4         Modisco  metacluster_1/pattern_1  0.013817
             Homer    motif2.motif             0.000244864
Fli1         Modisco  metacluster_1/pattern_1  0.039539099999999994
             Homer    motif2.motif             0.000399691
Irf4         Modisco  metacluster_1/pattern_1  7.080079999999999e-09
             Homer    motif2.motif             0.000138799
             Homer    motif11.motif            0.0116335
Ets1         Modisco  metacluster_1/pattern_1  0.03548219999999999
             Homer    motif2.motif             9.36374e-05
Irf3         Modisco  metacluster_1/pattern_1  0.0006764869999999999
             Homer    motif2.motif             0.013435599999999999
Ehf          Modisco  metacluster_1/pattern_1  0.00658403
             Homer    motif2.motif             0.00162256
Unique TF-MoDISco TFs: #6
TF Name  Modisco Pattern Name     Modisco Significance  Homer
Zkscan5  metacluster_1/pattern_1  0.03548219999999999   Absent
Zfx      metacluster_1/pattern_4  0.0293327             Absent
Stat1    metacluster_1/pattern_1  0.00311991            Absent
Zscan10  metacluster_1/pattern_1  0.00838074            Absent
Bcl6     metacluster_1/pattern_1  0.0516013             Absent
Rest     metacluster_1/pattern_5  8.12203e-17           Absent
Unique Homer TFs: #40
TF Name  Modisco  Homer Pattern Name  Homer Significance
Klf6     Absent   motif3.motif        0.00593003
Etv1     Absent   motif2.motif        0.000328897
Nfkb2    Absent   motif17.motif       0.00243385
Rela     Absent   motif17.motif       0.00243385
E2f1     Absent   motif3.motif        0.016267
Zbtb1    Absent   motif3.motif        0.0120853
Zfp637   Absent   motif22.motif       0.013197200000000001
Klf12    Absent   motif3.motif        0.022462299999999998
Klf7     Absent   motif3.motif        0.000584509
Elk3     Absent   motif2.motif        0.000328897
Klf5     Absent   motif3.motif        0.0128643
Nfya     Absent   motif7.motif        1.38076e-06
Egr1     Absent   motif3.motif        0.022406400000000003
Etv5     Absent   motif2.motif        0.00022091599999999998
Nfyb     Absent   motif7.motif        8.673689999999999e-08
Zbtb33   Absent   motif13.motif       0.001478
Sp4      Absent   motif3.motif        0.00639753
Zfp281   Absent   motif3.motif        0.0546092
Nfkb1    Absent   motif17.motif       0.0006308780000000001
Klf1     Absent   motif3.motif        0.054239699999999995
Klf8     Absent   motif3.motif        0.000584509
Zbtb17   Absent   motif3.motif        0.0123068
Ebf1     Absent   motif14.motif       0.0290212
E2f6     Absent   motif3.motif        0.0430585
Klf4     Absent   motif3.motif        0.0430585
Wt1      Absent   motif3.motif        0.0569784
Elk4     Absent   motif2.motif        0.00019988900000000002
E2f4     Absent   motif3.motif        0.00176366
Egr2     Absent   motif3.motif        0.031983199999999996
Sp5      Absent   motif3.motif        0.0210233
Nfyc     Absent   motif7.motif        4.76814e-06
Rel      Absent   motif17.motif       0.0443746
E2f7     Absent   motif3.motif        0.022462299999999998
Relb     Absent   motif17.motif       0.00243385
Pbx3     Absent   motif7.motif        0.000580592
Fev      Absent   motif2.motif        0.00204442
Spdef    Absent   motif2.motif        0.00763578
Klf15    Absent   motif3.motif        0.022462299999999998
Foxi1    Absent   motif7.motif        0.000707805
Irf9     Absent   motif2.motif        0.059575
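The three groups in this comparison (shared, unique to TF-MoDISco, unique to Homer) amount to a set partition of the TF names each tool matched. The sketch below illustrates that partition on a small hand-picked subset of names from the report. It is an illustration only, not the implementation inside `display_paiwise_pattern_comparison`, and the variable names are hypothetical.

```python
# Small subset of TF names taken from the report, for illustration only.
modisco_tfs = {"Ctcf", "Spi1", "Rest", "Stat1"}
homer_tfs = {"Ctcf", "Spi1", "Nfya", "Ebf1"}

shared = modisco_tfs & homer_tfs        # matched by both tools
unique_modisco = modisco_tfs - homer_tfs  # matched by TF-MoDISco only
unique_homer = homer_tfs - modisco_tfs    # matched by Homer only

print(sorted(shared))
print(sorted(unique_modisco))
print(sorted(unique_homer))

# Consistency check on the counts reported above:
# 37 shared + 6 unique TF-MoDISco TFs gives the 43 TF-MoDISco motifs.
n_shared, n_unique_modisco, n_modisco_total = 37, 6, 43
print(n_shared + n_unique_modisco == n_modisco_total)
```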