In [1]:
# Parameters
data_name = "apThEff;CD4+;8wks;D;En."
modisco_root = "/srv/scratch/msharmin/mouse_hem/with_tfd/full_mouse50/Naive_modisco2019"
tomtom_report_root = "http://mitra.stanford.edu/kundaje/msharmin/report/tomtom_outs/cells"
task_dir = "task_237-naivegw"
perf_file = "/srv/scratch/msharmin/mouse_hem/with_tfd/full_mouse50/fineFactorized/task_237-naivegw/NaiveauPRC.txt"
homer_root = "/srv/scratch/msharmin/mouse_hem/with_tfd/full_mouse50/Naive_scans"
reportfile = "/mnt/lab_data/kundaje/msharmin/annotations/filtering samples_MS2.xlsx"
sheetname = "filter23"
In [2]:
from matlas.reports import display_metadata
load data from labcluster
Using TensorFlow backend.
2019-08-30 19:33:49,544 [WARNING] git-lfs not installed
In [3]:
display_metadata(data_name, perf_file, reportfile, sheetname)
    Sample Information
    MetaData Name | Description
    Cell type | CD4-positive helper (adult-8wks, Activated primary CD4 effector cells)
    Cell Group | T cells
    Experiment Name | DHS
    Experiment Group | ENCODE
    Pipeline Output
    Metric | rep1 | rep2
    Naïve overlap peaks | 104595 | 104595
    IDR peaks | 82411 | 82411
    TSS enrichment (< 8 is very poor, < 10 is low) | 10.5663 | 12.4838
    Final number of unique mapping, dup-filtered, chrM filtered reads | 89786849 | 19636196
    Number of reads in called peak regions | 19240560 | 3346504
    Fraction of reads in called peak regions | 0.2143 | 0.1705
    Number of reads in promoter regions | 12925499 | 2965295
    Fraction of reads in promoter regions | 0.144 | 0.1511
    Number of reads in enhancer regions | 32428526 | 6792227
    Fraction of reads in enhancer regions | 0.3613 | 0.3462
    Modelling Metadata
    Metric | Value
    auPRC | 0.6235
    Calibrated Recall at 50% FDR | 0.184
    Number of Positive Examples in Test Data | 104435
    Number of Negative Examples in Test Data | 7966416
    Imbalance Ratio in Test Data | 0.0129
    Test Chromosomes | chr2, chr3, chr19
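The imbalance ratio reported above can be checked against the two example counts. The sketch below assumes the ratio is defined as positives divided by all test examples (the report does not state the formula, but this definition reproduces 0.0129); the function name `imbalance_ratio` is hypothetical and not part of `matlas`.

```python
# Counts copied from the Modelling Metadata output above.
n_positive = 104435
n_negative = 7966416


def imbalance_ratio(n_pos: int, n_neg: int) -> float:
    """Fraction of positives among all test examples (assumed definition)."""
    return n_pos / (n_pos + n_neg)


ratio = imbalance_ratio(n_positive, n_negative)
print(f"Imbalance ratio: {ratio:.4f}")
```

Because the expected auPRC of a random classifier is approximately the positive fraction, this value gives a rough baseline against which the reported auPRC of 0.6235 can be read.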
In [4]:
from matlas.reports import display_paiwise_pattern_comparison
from matlas.reports import display_denovo_patterns
In [5]:
display_denovo_patterns(data_name, modisco_root=modisco_root)
TF-MoDISco is using the TensorFlow backend.
The following two lists show the de novo patterns and the corresponding motifs discovered by TF-MoDISco.
De novo patterns by TF-MoDISco: 8
Pattern Name | # seqlets | TF Name(s)
metacluster_1/pattern_0 | 13074 | Ctcf, Ctcfl
metacluster_1/pattern_1 | 1480 | (none listed)
metacluster_1/pattern_2 | 767 | Irf1, Irf2, Irf8, Stat1, Irf7, Prdm1, Irf4, Stat2, Spib, Irf3, Irf9, Batf3
metacluster_1/pattern_3 | 560 | Ets1, Etv2, Spi1, Elk1, Erg, Etv4, Gabpa, Ehf, Elk4, Elf1, Etv6, Elf3, Fli1, Elk3, Etv3, Etv1, XP_911724.4, Etv5, Elf4, Spic, Elf2, Fev, Elf5, Spdef
metacluster_1/pattern_4 | 436 | Zfp143, Thap11, Tbx2, Stat3
metacluster_1/pattern_5 | 314 | Sp2, Sp1, Sp3, Sp4, Klf3, Klf6, E2f4, Wt1, Klf1, Klf7, Zbtb17, Egr1, Klf4, Klf8, Klf12, Sp5, Zfp281, Klf5, Klf15, Zfx, Maz, Egr2
metacluster_1/pattern_6 | 41 | Rest
metacluster_1/pattern_7 | 39 | Nrf1
Motifs by TF-MoDISco: 66
TF Name | Pattern Name | Significance
Ctcf | metacluster_1/pattern_0 | 2.1718700000000002e-14
Ctcf | metacluster_1/pattern_1 | 1.58529e-05
Ctcf | metacluster_1/pattern_5 | 0.0407911
Ctcfl | metacluster_1/pattern_0 | 4.80623e-07
Ctcfl | metacluster_1/pattern_1 | 0.00145013
Ctcfl | metacluster_1/pattern_5 | 0.00253012
Irf1 | metacluster_1/pattern_2 | 2.48585e-13
Irf2 | metacluster_1/pattern_2 | 0.00491771
Irf8 | metacluster_1/pattern_2 | 9.92307e-06
Irf8 | metacluster_1/pattern_3 | 0.023633900000000003
Stat1 | metacluster_1/pattern_2 | 1.07376e-05
Irf7 | metacluster_1/pattern_2 | 0.000142976
Prdm1 | metacluster_1/pattern_2 | 0.000205209
Prdm1 | metacluster_1/pattern_3 | 0.0220757
Irf4 | metacluster_1/pattern_2 | 0.0551518
Irf4 | metacluster_1/pattern_3 | 0.0244081
Stat2 | metacluster_1/pattern_2 | 0.000305533
Spib | metacluster_1/pattern_2 | 0.000406615
Spib | metacluster_1/pattern_3 | 0.00288229
Irf3 | metacluster_1/pattern_2 | 0.000460745
Irf3 | metacluster_1/pattern_3 | 0.0421401
Irf9 | metacluster_1/pattern_2 | 0.00222114
Batf3 | metacluster_1/pattern_2 | 0.0138506
Ets1 | metacluster_1/pattern_3 | 0.00168087
Etv2 | metacluster_1/pattern_3 | 0.00033729699999999997
Spi1 | metacluster_1/pattern_3 | 0.04609
Elk1 | metacluster_1/pattern_3 | 0.0020074999999999997
Erg | metacluster_1/pattern_3 | 0.00168087
Etv4 | metacluster_1/pattern_3 | 0.00232394
Gabpa | metacluster_1/pattern_3 | 0.00338603
Ehf | metacluster_1/pattern_3 | 0.0244081
Elk4 | metacluster_1/pattern_3 | 0.00232394
Elf1 | metacluster_1/pattern_3 | 0.00120169
Etv6 | metacluster_1/pattern_3 | 0.00315535
Elf3 | metacluster_1/pattern_3 | 0.04877430000000001
Fli1 | metacluster_1/pattern_3 | 0.00212721
Elk3 | metacluster_1/pattern_3 | 0.00303999
Etv3 | metacluster_1/pattern_3 | 0.00232394
Etv1 | metacluster_1/pattern_3 | 0.00239281
XP_911724.4 | metacluster_1/pattern_3 | 0.00239281
Etv5 | metacluster_1/pattern_3 | 0.003164
Elf4 | metacluster_1/pattern_3 | 0.00635401
Spic | metacluster_1/pattern_3 | 0.010715299999999999
Elf2 | metacluster_1/pattern_3 | 0.016562999999999998
Fev | metacluster_1/pattern_3 | 0.0120766
Elf5 | metacluster_1/pattern_3 | 0.0411795
Spdef | metacluster_1/pattern_3 | 0.0411795
Zfp143 | metacluster_1/pattern_4 | 9.620549999999999e-20
Thap11 | metacluster_1/pattern_4 | 2.95517e-18
Tbx2 | metacluster_1/pattern_4 | 7.455800000000001e-14
Stat3 | metacluster_1/pattern_4 | 0.0588467
Sp2 | metacluster_1/pattern_5 | 6.17989e-10
Sp2 | metacluster_1/pattern_7 | 0.00143435
Sp1 | metacluster_1/pattern_5 | 0.000676622
Sp1 | metacluster_1/pattern_7 | 0.00375012
Sp3 | metacluster_1/pattern_5 | 5.746650000000001e-07
Sp3 | metacluster_1/pattern_7 | 0.0120878
Sp4 | metacluster_1/pattern_5 | 0.00252998
Klf3 | metacluster_1/pattern_5 | 2.19335e-06
Klf6 | metacluster_1/pattern_5 | 2.61041e-05
E2f4 | metacluster_1/pattern_5 | 0.000130148
Wt1 | metacluster_1/pattern_5 | 0.000374853
Klf1 | metacluster_1/pattern_5 | 0.0309822
Klf7 | metacluster_1/pattern_5 | 0.000805065
Zbtb17 | metacluster_1/pattern_5 | 0.000926653
Egr1 | metacluster_1/pattern_5 | 0.0346306
Klf4 | metacluster_1/pattern_5 | 0.00161844
Klf8 | metacluster_1/pattern_5 | 0.00161844
Klf12 | metacluster_1/pattern_5 | 0.00161844
Sp5 | metacluster_1/pattern_5 | 0.00485331
Zfp281 | metacluster_1/pattern_5 | 0.00923896
Klf5 | metacluster_1/pattern_5 | 0.00824124
Klf15 | metacluster_1/pattern_5 | 0.0112375
Zfx | metacluster_1/pattern_5 | 0.015205000000000002
Zfx | metacluster_1/pattern_7 | 0.030105700000000003
Maz | metacluster_1/pattern_5 | 0.018083000000000002
Egr2 | metacluster_1/pattern_5 | 0.0217053
Rest | metacluster_1/pattern_6 | 1.11399e-11
Nrf1 | metacluster_1/pattern_7 | 1.34852e-06
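Several TFs above match more than one pattern, so it can help to pick the strongest match per TF. The following is a minimal sketch, assuming that a smaller Significance value indicates a stronger match (the report does not say whether the value is a p-value, q-value, or E-value). The helper `best_match_per_tf` is hypothetical and not part of `matlas`; the rows are a small subset copied from the table.

```python
# A few (TF, pattern, significance) rows copied from the output above.
matches = [
    ("Ctcf", "metacluster_1/pattern_0", 2.1718700000000002e-14),
    ("Ctcf", "metacluster_1/pattern_1", 1.58529e-05),
    ("Ctcf", "metacluster_1/pattern_5", 0.0407911),
    ("Irf4", "metacluster_1/pattern_2", 0.0551518),
    ("Irf4", "metacluster_1/pattern_3", 0.0244081),
    ("Sp2", "metacluster_1/pattern_5", 6.17989e-10),
    ("Sp2", "metacluster_1/pattern_7", 0.00143435),
]


def best_match_per_tf(rows):
    """Keep, for each TF, the pattern with the smallest significance value."""
    best = {}
    for tf, pattern, significance in rows:
        if tf not in best or significance < best[tf][1]:
            best[tf] = (pattern, significance)
    return best


best = best_match_per_tf(matches)
for tf, (pattern, significance) in best.items():
    print(f"{tf}: {pattern} ({significance:g})")
```

Under this assumption Irf4 is assigned to pattern_3 (the ETS-like pattern) rather than pattern_2, which shows why a per-TF best match should be read alongside the full list rather than in place of it.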
In [6]:
display_paiwise_pattern_comparison(data_name, modisco_root, homer_root)
Number of CISBP TFs obtained by TF-MoDISco and Homer
Shared TFs between TF-MoDISco and Homer: #48
TF Name | MoDISco pattern (significance) | Homer motif (significance)
Ehf | metacluster_1/pattern_3 (0.0244081) | motif3.motif (0.013677200000000002)
Fli1 | metacluster_1/pattern_3 (0.00212721) | motif3.motif (0.000429456)
Nrf1 | metacluster_1/pattern_7 (1.34852e-06) | motif6.motif (6.72522e-08)
Etv4 | metacluster_1/pattern_3 (0.00232394) | motif3.motif (0.00890324)
Ets1 | metacluster_1/pattern_3 (0.00168087) | motif3.motif (0.00047255199999999994)
Gabpa | metacluster_1/pattern_3 (0.00338603) | motif3.motif (0.00047255199999999994)
Klf6 | metacluster_1/pattern_5 (2.61041e-05) | motif2.motif (0.00331672); motif22.motif (0.030216800000000002)
Elf3 | metacluster_1/pattern_3 (0.04877430000000001) | motif3.motif (0.0415656)
Thap11 | metacluster_1/pattern_4 (2.95517e-18) | motif5.motif (1.7407499999999998e-06)
Spic | metacluster_1/pattern_3 (0.010715299999999999) | motif3.motif (0.0453582)
Spdef | metacluster_1/pattern_3 (0.0411795) | motif3.motif (0.0576961)
Sp2 | metacluster_1/pattern_5 (6.17989e-10); metacluster_1/pattern_7 (0.00143435) | motif2.motif (0.00112055); motif22.motif (0.04851830000000001)
Elk3 | metacluster_1/pattern_3 (0.00303999) | motif3.motif (0.000429456)
Ctcfl | metacluster_1/pattern_0 (4.80623e-07); metacluster_1/pattern_1 (0.00145013); metacluster_1/pattern_5 (0.00253012) | motif1.motif (0.000465115)
Elk1 | metacluster_1/pattern_3 (0.0020074999999999997) | motif3.motif (0.000429456)
Etv6 | metacluster_1/pattern_3 (0.00315535) | motif3.motif (0.024675299999999997)
Ctcf | metacluster_1/pattern_0 (2.1718700000000002e-14); metacluster_1/pattern_1 (1.58529e-05); metacluster_1/pattern_5 (0.0407911) | motif1.motif (1.12519e-05)
Etv2 | metacluster_1/pattern_3 (0.00033729699999999997) | motif3.motif (0.00392214)
Elk4 | metacluster_1/pattern_3 (0.00232394) | motif3.motif (0.000429456)
Zfp143 | metacluster_1/pattern_4 (9.620549999999999e-20) | motif5.motif (3.97668e-06)
Klf12 | metacluster_1/pattern_5 (0.00161844) | motif2.motif (0.041445300000000004)
Sp4 | metacluster_1/pattern_5 (0.00252998) | motif2.motif (0.026854000000000003); motif20.motif (0.05320419999999999); motif22.motif (0.04851830000000001)
Elf2 | metacluster_1/pattern_3 (0.016562999999999998) | motif3.motif (0.041908999999999995)
Fev | metacluster_1/pattern_3 (0.0120766) | motif3.motif (0.000539822)
E2f4 | metacluster_1/pattern_5 (0.000130148) | motif2.motif (0.00112055)
Sp1 | metacluster_1/pattern_5 (0.000676622); metacluster_1/pattern_7 (0.00375012) | motif2.motif (0.00856542); motif20.motif (0.033962); motif22.motif (0.0392446)
Etv3 | metacluster_1/pattern_3 (0.00232394) | motif3.motif (0.000539822)
Klf7 | metacluster_1/pattern_5 (0.000805065) | motif2.motif (0.00112055)
Egr1 | metacluster_1/pattern_5 (0.0346306) | motif2.motif (0.026854000000000003); motif22.motif (0.04851830000000001)
Klf4 | metacluster_1/pattern_5 (0.00161844) | motif2.motif (0.00112055); motif22.motif (0.04851830000000001)
Spi1 | metacluster_1/pattern_3 (0.04609) | motif3.motif (0.0576961)
Elf5 | metacluster_1/pattern_3 (0.0411795) | motif3.motif (0.053413800000000004)
Elf4 | metacluster_1/pattern_3 (0.00635401) | motif3.motif (0.00311248)
Erg | metacluster_1/pattern_3 (0.00168087) | motif3.motif (0.000429456)
Tbx2 | metacluster_1/pattern_4 (7.455800000000001e-14) | motif5.motif (6.94325e-05)
Klf8 | metacluster_1/pattern_5 (0.00161844) | motif2.motif (0.00125266)
Irf2 | metacluster_1/pattern_2 (0.00491771) | motif14.motif (0.0014107999999999998)
Elf1 | metacluster_1/pattern_3 (0.00120169) | motif3.motif (0.00262744)
XP_911724.4 | metacluster_1/pattern_3 (0.00239281) | motif3.motif (0.000539822)
Sp3 | metacluster_1/pattern_5 (5.746650000000001e-07); metacluster_1/pattern_7 (0.0120878) | motif2.motif (0.00112055); motif22.motif (0.04851830000000001)
Irf7 | metacluster_1/pattern_2 (0.000142976) | motif14.motif (0.0235904)
Spib | metacluster_1/pattern_2 (0.000406615); metacluster_1/pattern_3 (0.00288229) | motif3.motif (0.0576961)
Etv5 | metacluster_1/pattern_3 (0.003164) | motif3.motif (3.06468e-05)
Klf1 | metacluster_1/pattern_5 (0.0309822) | motif2.motif (0.0239428); motif22.motif (0.0392446)
Klf3 | metacluster_1/pattern_5 (2.19335e-06) | motif2.motif (0.00112055); motif22.motif (0.0375219)
Zfx | metacluster_1/pattern_5 (0.015205000000000002); metacluster_1/pattern_7 (0.030105700000000003) | motif2.motif (0.0239486)
Klf5 | metacluster_1/pattern_5 (0.00824124) | motif2.motif (0.0416445)
Etv1 | metacluster_1/pattern_3 (0.00239281) | motif3.motif (0.000447431)
Unique TF-MoDISco TFs: #18
TF Name | MoDISco pattern (significance) | Homer
Maz | metacluster_1/pattern_5 (0.018083000000000002) | Absent
Stat2 | metacluster_1/pattern_2 (0.000305533) | Absent
Stat3 | metacluster_1/pattern_4 (0.0588467) | Absent
Prdm1 | metacluster_1/pattern_2 (0.000205209); metacluster_1/pattern_3 (0.0220757) | Absent
Irf9 | metacluster_1/pattern_2 (0.00222114) | Absent
Irf3 | metacluster_1/pattern_2 (0.000460745); metacluster_1/pattern_3 (0.0421401) | Absent
Egr2 | metacluster_1/pattern_5 (0.0217053) | Absent
Irf1 | metacluster_1/pattern_2 (2.48585e-13) | Absent
Irf4 | metacluster_1/pattern_2 (0.0551518); metacluster_1/pattern_3 (0.0244081) | Absent
Irf8 | metacluster_1/pattern_2 (9.92307e-06); metacluster_1/pattern_3 (0.023633900000000003) | Absent
Zfp281 | metacluster_1/pattern_5 (0.00923896) | Absent
Batf3 | metacluster_1/pattern_2 (0.0138506) | Absent
Wt1 | metacluster_1/pattern_5 (0.000374853) | Absent
Sp5 | metacluster_1/pattern_5 (0.00485331) | Absent
Zbtb17 | metacluster_1/pattern_5 (0.000926653) | Absent
Rest | metacluster_1/pattern_6 (1.11399e-11) | Absent
Stat1 | metacluster_1/pattern_2 (1.07376e-05) | Absent
Klf15 | metacluster_1/pattern_5 (0.0112375) | Absent
Unique Homer TFs: #34
TF Name | MoDISco | Homer motif (significance)
Mlx | Absent | motif19.motif (0.0241323)
Rela | Absent | motif21.motif (0.010885700000000002)
Crem | Absent | motif19.motif (0.0511725)
Clock | Absent | motif19.motif (0.048395999999999995)
Nfyb | Absent | motif7.motif (2.0141999999999999e-07)
Mitf | Absent | motif19.motif (0.00993053)
E2f1 | Absent | motif22.motif (0.04851830000000001)
Arnt | Absent | motif19.motif (0.0054804)
Arntl | Absent | motif19.motif (0.035073400000000005)
Pbx3 | Absent | motif7.motif (0.000625468)
Creb1 | Absent | motif19.motif (0.0398702)
Creb3l2 | Absent | motif19.motif (0.0500815)
Bhlhe40 | Absent | motif19.motif (0.010287200000000002)
Usf1 | Absent | motif19.motif (0.046626)
Srebf2 | Absent | motif19.motif (0.00796408)
Zbtb33 | Absent | motif13.motif (0.00126246)
Taf1 | Absent | motif16.motif (0.0168883)
Foxi1 | Absent | motif7.motif (0.00150142)
Bhlhe41 | Absent | motif19.motif (0.022416400000000003)
Npas2 | Absent | motif19.motif (0.0535021)
Nfkb1 | Absent | motif21.motif (0.00650009)
Srebf1 | Absent | motif19.motif (0.0291)
Tfe3 | Absent | motif19.motif (0.00796408)
Rel | Absent | motif21.motif (0.0267916)
Relb | Absent | motif21.motif (0.0267916)
Yy1 | Absent | motif16.motif (3.77637e-05)
Nfyc | Absent | motif7.motif (1.19538e-05)
Rxrg | Absent | motif19.motif (0.0586366)
Usf2 | Absent | motif19.motif (0.0500815)
Tfeb | Absent | motif19.motif (0.00289784)
Nfkb2 | Absent | motif21.motif (0.00525492)
Tfec | Absent | motif19.motif (0.00882185)
Nfya | Absent | motif7.motif (8.678989999999999e-06)
Zfp42 | Absent | motif16.motif (0.00146893)
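The three groups reported above (shared, unique to TF-MoDISco, unique to Homer) amount to a set comparison of the TF names recovered by each method. The sketch below illustrates that comparison on a small subset of names taken from the output, then checks the reported counts for consistency. It is an illustration only and makes no claim about how `display_paiwise_pattern_comparison` is implemented; all variable names are hypothetical.

```python
# Small illustrative subsets of TF names taken from the output above.
modisco_tfs = {"Ctcf", "Nrf1", "Irf1", "Rest"}
homer_tfs = {"Ctcf", "Nrf1", "Yy1", "Nfya"}

shared = modisco_tfs & homer_tfs        # found by both methods
modisco_only = modisco_tfs - homer_tfs  # found only by TF-MoDISco
homer_only = homer_tfs - modisco_tfs    # found only by Homer

print("shared:", sorted(shared))
print("TF-MoDISco only:", sorted(modisco_only))
print("Homer only:", sorted(homer_only))

# Counts as reported in the section headings above.
n_shared, n_modisco_only, n_homer_only = 48, 18, 34

n_modisco_total = n_shared + n_modisco_only  # matches "Motifs by TF-MoDISco: #66"
n_homer_total = n_shared + n_homer_only
jaccard = n_shared / (n_shared + n_modisco_only + n_homer_only)

print(n_modisco_total, n_homer_total, round(jaccard, 2))
```

The shared and TF-MoDISco-only counts sum to the 66 TFs listed in the motif table, so the three groups are consistent with a clean partition of the union of both TF sets.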