The following figures and statistics compare the called hits with the seqlets used by TF-MoDISco to construct each motif.
This figure shows the number of hits called vs. the number of TF-MoDISco seqlets identified for each motif. The dashed line is the identity line. When comparing a shared set of regions, hit counts should generally exceed the corresponding seqlet counts, since TF-MoDISco stringently filters seqlets and usually uses a smaller input window.
For each motif, this table examines the consistency between hits and TF-MoDISco seqlets.
The following statistics report the number of hits, seqlets, and their relationships:
Note that the seqlet counts here may be lower than those shown in the tfmodisco-lite report, which double-counts seqlets in overlapping regions. The counts shown here are de-duplicated, while the counts in the tfmodisco-lite report are not.
Note that palindromic motifs may have lower recall due to disagreements on orientation.
If seqlet recall is near zero for all motifs, the -W/--modisco-region-width argument is likely incorrect.
This value is required to infer genomic coordinates of seqlets from the tfmodisco-lite output H5.
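As an illustration of how the count columns in the table relate to one another, the sketch below recomputes the derived statistics for one motif. The relationships used (recall as overlaps divided by seqlets, missed seqlets as seqlets minus overlaps, additional restricted hits as restricted hits minus overlaps) are inferred from the reported values rather than taken from the tool's source, and the names `MotifCounts` and `summarize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MotifCounts:
    """Per-motif counts, named after the table columns (hypothetical container)."""
    restricted_hits: int  # "Restricted Hits" column
    seqlets: int          # "Seqlets" column (de-duplicated)
    overlaps: int         # "Hit/Seqlet Overlaps" column


def summarize(counts: MotifCounts) -> dict:
    """Derive recall and the two difference columns from the raw counts.

    Assumption: these formulas are inferred from the table, not from the
    report generator itself.
    """
    recall = counts.overlaps / counts.seqlets if counts.seqlets else float("nan")
    return {
        "seqlet_recall": round(recall, 3),
        "missed_seqlets": counts.seqlets - counts.overlaps,
        "additional_restricted_hits": counts.restricted_hits - counts.overlaps,
    }


# Counts reported for pos_patterns.pattern_0.
stats = summarize(MotifCounts(restricted_hits=32009, seqlets=18384, overlaps=11))
print(stats)
# {'seqlet_recall': 0.001, 'missed_seqlets': 18373, 'additional_restricted_hits': 31998}
```

Under these assumptions, a recall of 0.001 simply reflects that 11 of 18384 seqlets coincide with a hit.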
Motif CWMs (contribution weight matrices) are average contribution scores over a set of regions. The CWMs plotted here are:
The plots span the full untrimmed motif, with the trimmed motif shaded.
The hit-seqlet similarity is the cosine similarity between the additional-restricted-hits CWM and the seqlet CWM. This statistic measures the similarity between hits that were missed by TF-MoDISco and the seqlets used to construct the motif.
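A minimal sketch of the cosine similarity between two CWMs, assuming both are given as equally shaped nested lists (positions by bases) and are compared after flattening. The function name `cwm_cosine_similarity` and the toy matrices are hypothetical, and any alignment or trimming applied by the report itself is not modeled here.

```python
import math


def cwm_cosine_similarity(cwm_a, cwm_b):
    """Cosine similarity between two equally shaped CWMs (nested lists)."""
    a = [value for row in cwm_a for value in row]
    b = [value for row in cwm_b for value in row]
    if len(a) != len(b):
        raise ValueError("CWMs must have the same shape")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for an all-zero matrix
    return dot / (norm_a * norm_b)


# Toy two-position CWMs over A, C, G, T (made-up values for illustration).
seqlet_cwm = [[0.0, 0.1, 0.9, 0.0], [0.8, 0.0, 0.0, 0.1]]
hit_cwm = [[0.0, 0.2, 0.7, 0.0], [0.6, 0.0, 0.1, 0.1]]
similarity = cwm_cosine_similarity(hit_cwm, seqlet_cwm)
print(f"hit-seqlet CWM similarity: {similarity:.3f}")
```

A value near 1 indicates that the two CWMs have nearly the same shape regardless of their overall magnitude, since cosine similarity is invariant to scaling.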
| Motif Name | Seqlet Recall | Hit-Seqlet CWM Similarity | Hits | Restricted Hits | Seqlets | Hit/Seqlet Overlaps | Missed Seqlets | Additional Restricted Hits | Hit CWM (FC) | Hit CWM (RC) | TF-MoDISco CWM (FC) | TF-MoDISco CWM (RC) | Missed-Seqlet-Only CWM | Additional-Restricted-Hit CWM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pos_patterns.pattern_0 | 0.001 | 1.000 | 38329 | 32009 | 18384 | 11 | 18373 | 31998 | | | | | | |
| pos_patterns.pattern_1 | 0.003 | 0.992 | 450484 | 302605 | 11725 | 35 | 11690 | 302570 | | | | | | |
| pos_patterns.pattern_2 | 0.001 | 0.998 | 164078 | 115823 | 9893 | 13 | 9880 | 115810 | | | | | | |
| pos_patterns.pattern_3 | 0.001 | 0.998 | 104270 | 64751 | 7933 | 6 | 7927 | 64745 | | | | | | |
| pos_patterns.pattern_4 | 0.001 | 0.961 | 252218 | 153012 | 6362 | 7 | 6355 | 153005 | | | | | | |
| pos_patterns.pattern_5 | 0.001 | 0.994 | 152185 | 103188 | 6204 | 5 | 6199 | 103183 | | | | | | |
| pos_patterns.pattern_6 | 0.000 | 0.995 | 47585 | 30132 | 3467 | 0 | 3467 | 30132 | | | | | | |
| pos_patterns.pattern_7 | 0.001 | 0.998 | 22849 | 16888 | 3320 | 3 | 3317 | 16885 | | | | | | |
| pos_patterns.pattern_8 | 0.000 | 0.963 | 46555 | 33463 | 2225 | 1 | 2224 | 33462 | | | | | | |
| pos_patterns.pattern_9 | 0.001 | 0.992 | 17666 | 12572 | 1498 | 1 | 1497 | 12571 | | | | | | |
| pos_patterns.pattern_10 | 0.000 | 0.939 | 137905 | 86128 | 1465 | 0 | 1465 | 86128 | | | | | | |
| pos_patterns.pattern_11 | 0.000 | 0.999 | 16079 | 11019 | 1327 | 0 | 1327 | 11019 | | | | | | |
| pos_patterns.pattern_12 | 0.000 | 1.000 | 2083 | 1654 | 1087 | 0 | 1087 | 1654 | | | | | | |
| pos_patterns.pattern_13 | 0.000 | 0.998 | 6104 | 3659 | 903 | 0 | 903 | 3659 | | | | | | |
| pos_patterns.pattern_14 | 0.000 | 0.981 | 99036 | 65837 | 866 | 0 | 866 | 65837 | | | | | | |
| pos_patterns.pattern_15 | 0.000 | 0.985 | 57470 | 36443 | 703 | 0 | 703 | 36443 | | | | | | |
| pos_patterns.pattern_16 | 0.000 | 0.993 | 9404 | 6223 | 665 | 0 | 665 | 6223 | | | | | | |
| pos_patterns.pattern_17 | 0.000 | 0.996 | 6380 | 4429 | 621 | 0 | 621 | 4429 | | | | | | |
| pos_patterns.pattern_18 | 0.000 | 0.998 | 9974 | 7104 | 581 | 0 | 581 | 7104 | | | | | | |
| pos_patterns.pattern_19 | 0.000 | 0.987 | 7096 | 4619 | 534 | 0 | 534 | 4619 | | | | | | |
| pos_patterns.pattern_20 | 0.000 | 0.896 | 65435 | 44988 | 325 | 0 | 325 | 44988 | | | | | | |
| pos_patterns.pattern_21 | 0.000 | 0.966 | 4447 | 2745 | 262 | 0 | 262 | 2745 | | | | | | |
| pos_patterns.pattern_22 | 0.000 | 0.989 | 6739 | 4402 | 212 | 0 | 212 | 4402 | | | | | | |
| pos_patterns.pattern_23 | 0.000 | 0.993 | 3428 | 2300 | 188 | 0 | 188 | 2300 | | | | | | |
| pos_patterns.pattern_24 | 0.000 | 0.977 | 7713 | 5828 | 148 | 0 | 148 | 5828 | | | | | | |
| pos_patterns.pattern_25 | 0.000 | 0.962 | 1264 | 913 | 121 | 0 | 121 | 913 | | | | | | |
| pos_patterns.pattern_26 | 0.000 | 0.997 | 676 | 610 | 100 | 0 | 100 | 610 | | | | | | |
| pos_patterns.pattern_27 | 0.000 | 0.973 | 3150 | 2194 | 74 | 0 | 74 | 2194 | | | | | | |
| pos_patterns.pattern_28 | 0.000 | 0.960 | 3269 | 2399 | 62 | 0 | 62 | 2399 | | | | | | |
| pos_patterns.pattern_29 | 0.000 | 0.900 | 1581 | 958 | 21 | 0 | 21 | 958 | | | | | | |
| neg_patterns.pattern_0 | 0.000 | 0.974 | 31438 | 15129 | 38 | 0 | 38 | 15129 | | | | | | |
| neg_patterns.pattern_1 | 0.000 | 0.901 | 2385 | 903 | 29 | 0 | 29 | 903 | | | | | | |
| neg_patterns.pattern_2 | 0.000 | 0.952 | 1220569 | 695892 | 28 | 0 | 28 | 695892 | | | | | | |
This heatmap shows the prevalence of motifs whose (untrimmed) hits overlap with TF-MoDISco seqlets of other motifs. The vertical axis shows the motif of the seqlet, while the horizontal axis shows the motif of the hit. The color intensity here represents an estimator of the expected number of bases of hit overlap per base of seqlet.
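One plausible way to estimate "bases of hit overlap per base of seqlet" for a single (seqlet motif, hit motif) pair is to sum the overlapping bases over all seqlet/hit pairs and divide by the total seqlet length. The sketch below does this under stated assumptions: intervals are half-open `(chrom, start, end)` tuples, the function name `overlap_per_seqlet_base` is hypothetical, and the report's actual estimator may differ.

```python
def overlap_per_seqlet_base(seqlets, hits):
    """Overlapping hit bases per seqlet base for one motif pair.

    Both arguments are lists of half-open (chrom, start, end) intervals.
    Assumption: bases covered by several hits are counted once per hit.
    """
    total_seqlet_bases = sum(end - start for _, start, end in seqlets)
    if total_seqlet_bases == 0:
        return 0.0
    overlap_bases = 0
    for s_chrom, s_start, s_end in seqlets:
        for h_chrom, h_start, h_end in hits:
            if s_chrom == h_chrom:
                # Length of the intersection, or zero if the intervals are disjoint.
                overlap_bases += max(0, min(s_end, h_end) - max(s_start, h_start))
    return overlap_bases / total_seqlet_bases


# Toy intervals: seqlets of one motif and hits of another (made-up coordinates).
seqlets = [("chr1", 100, 120), ("chr1", 300, 320)]
hits = [("chr1", 110, 125), ("chr2", 100, 120)]
print(overlap_per_seqlet_base(seqlets, hits))  # 10 overlapping bases / 40 seqlet bases = 0.25
```

The quadratic loop keeps the sketch short; for genome-scale inputs, a sorted sweep or an interval tree would be the more practical choice.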
The following figures visualize the distribution of hit statistics across motifs and regions.
This plot shows the distribution of hit counts per region for any motif. The number of regions with no hits should be near zero.
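The distribution described above can be tallied as in the following sketch, which assumes each hit is labeled with the integer index of its region and that regions are numbered from 0 to `num_regions - 1`. The function name and inputs are hypothetical.

```python
from collections import Counter


def hits_per_region(hit_region_ids, num_regions):
    """Return per-region hit counts and a histogram of those counts.

    Regions with no hits are included explicitly with a count of zero.
    """
    per_region = Counter(hit_region_ids)
    counts = [per_region.get(region, 0) for region in range(num_regions)]
    histogram = Counter(counts)  # maps a hit count to the number of regions with that count
    return counts, histogram


# Toy input: six hits spread over five regions (made-up data).
counts, histogram = hits_per_region([0, 0, 1, 3, 3, 3], num_regions=5)
zero_fraction = histogram[0] / len(counts)
print(counts)         # [2, 1, 0, 3, 0]
print(zero_fraction)  # 0.4
```

A large zero-hit fraction in real data would be the signal to investigate, per the expectation stated above.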
These plots show the distribution of hit statistics for each motif, specifically:
[Table of per-motif distribution plots, one row per motif (pos_patterns.pattern_0 through pos_patterns.pattern_29 and neg_patterns.pattern_0 through neg_patterns.pattern_2); columns: Motif Name, Hits Per Region, Hit Coefficient, Hit Similarity, Hit Importance.]
This heatmap shows the co-occurrence of motifs across regions. The color intensity here represents the cosine similarity between the motifs' occurrences across regions, where occurrence is defined as the presence of a hit for a motif in a region.
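For binary occurrence vectors, the cosine similarity reduces to the number of regions shared by two motifs divided by the geometric mean of their region counts. The sketch below uses that identity, assuming each motif is represented by the set of region indices in which it has at least one hit. The function names and toy data are hypothetical.

```python
import math


def cooccurrence_similarity(regions_a, regions_b):
    """Cosine similarity between two binary occurrence vectors, given as sets of region ids."""
    if not regions_a or not regions_b:
        return 0.0  # convention chosen here for a motif with no hits
    shared = len(regions_a & regions_b)
    return shared / math.sqrt(len(regions_a) * len(regions_b))


def cooccurrence_matrix(occurrence):
    """Pairwise similarities for a dict mapping motif name to its set of region ids."""
    motifs = sorted(occurrence)
    return {
        a: {b: cooccurrence_similarity(occurrence[a], occurrence[b]) for b in motifs}
        for a in motifs
    }


# Toy occurrence sets (made-up region ids).
occurrence = {"motif_a": {0, 1, 2, 3}, "motif_b": {2, 3}, "motif_c": {7}}
matrix = cooccurrence_matrix(occurrence)
print(round(matrix["motif_a"]["motif_b"], 3))  # 2 / sqrt(4 * 2) = 0.707
```

Because occurrence is binary, a motif with many hits in one region contributes no more than a motif with a single hit there.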