# WARNING: this file is not sorted!
# db id alt consensus E-value adj_p-value log_adj_p-value bin_location bin_width total_width sites_in_bin total_sites p_success p-value mult_tests
1 TGTGTGTGTGYGTGTGTGTGTGTGTGTGTGT MEME-1 TGTGTGTGTGYGTGTGTGTGTGTGTGTGTGT 2.0e-017 2.6e-020 -45.08 0.0 96 470 70 110 0.20426 1.1e-022 234
1 CRCGCRCRCGCGCRCR MEME-2 CRCGCRCRCGCGCRCR 2.5e-033 3.3e-036 -81.70 0.0 121 485 240 446 0.24948 1.4e-038 242
1 VGSRRRSGMGCGCGGGVGCGVG MEME-4 VGSRRRSGMGCGCGGGVGCGVG 2.0e-003 2.7e-006 -12.82 0.0 111 479 166 479 0.23173 1.1e-008 239
1 TGYRTSTGTGTG MEME-9 TGYRTSTGTGTG 9.5e-009 1.3e-011 -25.09 0.0 83 489 79 201 0.16973 5.2e-014 244
2 GCGCGYBC DREME-1 GCGCGYBC 1.8e-030 2.5e-033 -75.08 0.0 195 493 327 483 0.39554 1.0e-035 246
3 M0189_1.02 (ID2)_(Mus_musculus)_(DBD_0.98) RCACGTGR 5.2e-001 7.0e-004 -7.26 0.0 287 493 343 503 0.58215 2.9e-006 246
3 M0196_1.02 (NPAS2)_(Mus_musculus)_(DBD_1.00) NSCACGTGTN 1.6e-001 2.2e-004 -8.44 0.0 163 491 203 460 0.33198 8.8e-007 245
3 M0198_1.02 (SOHLH2)_(Mus_musculus)_(DBD_0.84) NGYVCGTGCN 7.6e-004 1.0e-006 -13.80 0.0 135 491 226 586 0.27495 4.1e-009 245
3 M0211_1.02 (MLXIP)_(Mus_musculus)_(DBD_0.82) BCACGTGK 1.1e-002 1.5e-005 -11.12 0.0 53 493 92 483 0.10751 6.0e-008 246
3 M0212_1.02 (TCFL5)_(Mus_musculus)_(DBD_1.00) NBCDCGHGVN 1.7e-006 2.3e-009 -19.90 0.0 355 491 475 563 0.72301 9.3e-012 245
3 M0413_1.02 (ZBTB1)_(Mus_musculus)_(DBD_0.99) NDTGCGKGDN 9.2e0000 1.2e-002 -4.40 0.0 129 491 198 588 0.26273 5.0e-005 245
3 M0432_1.02 (ZFP161)_(Mus_musculus)_(DBD_1.00) NNCGYGCHH 1.8e-017 2.4e-020 -45.17 0.0 132 492 261 566 0.26829 9.8e-023 245
3 M2305_1.02 NRF1 YGCGCABGCGC 1.4e-013 1.9e-016 -36.18 0.0 114 490 229 574 0.23265 7.9e-019 244
3 M4532_1.02 MYC CCACGTGSYB 1.5e-001 2.1e-004 -8.48 0.0 289 491 378 549 0.58859 8.4e-007 245
3 M4536_1.02 E2F1 VRRVRGVGCGCGCRS 9.4e-014 1.3e-016 -36.61 0.0 182 486 332 599 0.37449 5.2e-019 242
3 M5493_1.02 GMEB2 KTRCGTAA 2.4e-001 3.2e-004 -8.05 0.0 113 493 169 528 0.22921 1.3e-006 246
3 M5509_1.02 HEY1 GGCACGTGBC 3.1e-006 4.2e-009 -19.30 0.0 71 491 128 491 0.14460 1.7e-011 245
3 M5883_1.02 TBX20 TCACACSTTCACACCT 1.2e0000 1.6e-003 -6.42 0.0 101 485 90 279 0.20825 6.7e-006 242
3 M5981_1.02 ZSCAN4 TTTTCAGKGTGTGCA 5.6e-001 7.5e-004 -7.20 0.0 34 486 42 278 0.06996 3.1e-006 242
3 M6139_1.02 AHR KCACGCRAH 1.1e-011 1.5e-014 -31.85 0.0 78 492 173 583 0.15854 6.0e-017 245
3 M6151_1.02 ARNT BYRCGTGC 2.5e-012 3.4e-015 -33.31 0.0 135 493 251 571 0.27383 1.4e-017 246
3 M6192_1.02 E2F3 SSCGCSAAAC 1.4e0000 1.9e-003 -6.25 0.0 163 491 241 575 0.33198 7.9e-006 245
3 M6200_1.02 EGR3 WGAGTGGGYGT 3.7e-001 4.9e-004 -7.62 0.0 70 490 120 556 0.14286 2.0e-006 244
3 M6210_1.02 ENO1 YDSMCACRTGSYS 7.4e0000 9.9e-003 -4.62 0.0 52 488 92 566 0.10656 4.1e-005 243
3 M6212_1.02 EPAS1 CMCACGYAYDCAC 1.9e-009 2.5e-012 -26.72 0.0 90 488 176 549 0.18443 1.0e-014 243
3 M6273_1.02 HEY2 GBBGGCWCGTGGCHTBV 4.6e-002 6.2e-005 -9.69 0.0 172 484 244 526 0.35537 2.6e-007 241
3 M6275_1.02 HIF1A SBSTACGTGCSB 1.1e-013 1.4e-016 -36.48 0.0 81 489 179 565 0.16564 5.9e-019 244
3 M6352_1.02 MYCN CCACGTGS 6.4e-003 8.6e-006 -11.66 0.0 63 493 116 551 0.12779 3.5e-008 246
3 M6535_1.02 WT1 GMGGGGGCGKGGG 2.1e0000 2.7e-003 -5.90 0.0 140 488 220 598 0.28689 1.1e-005 243
##
# Detailed descriptions of columns in this file:
#
# db: The name of the database (file name) that contains the motif.
# id: A name for the motif that is unique in the motif database file.
# alt: An alternate name of the motif that may be provided
#   in the motif database file.
# consensus: A consensus sequence computed from the motif.
# E-value: The expected number of motifs that would have at least one
#   region as enriched for best matches to the motif as the reported region.
#   The E-value is the p-value multiplied by the number of motifs in the
#   input database(s).
# adj_p-value: The probability that any tested region would be as enriched for
#   best matches to this motif as the reported region is.
#   By default the p-value is calculated by using the one-tailed binomial
#   test on the number of sequences with a match to the motif
#   that have their best match in the reported region, corrected for
#   the number of regions and score thresholds tested.
#   The test assumes that the probability that the best match in a sequence
#   falls in the region is the region width divided by the
#   number of places a motif
#   can align in the sequence (sequence length minus motif width plus 1).
#   When CentriMo is run in discriminative mode with a negative
#   set of sequences, the p-value of a region is calculated
#   using the Fisher exact test on the
#   enrichment of best matches in the positive sequences relative
#   to the negative sequences, corrected
#   for the number of regions and score thresholds tested.
#   The test assumes that the probability that the best match (if any)
#   falls into a given region
#   is the same for all positive and negative sequences.
# log_adj_p-value: Log of the adjusted p-value.
# bin_location: Location of the center of the most enriched region.
# bin_width: The width (in sequence positions) of the most enriched region.
#   A best match to the motif is counted as being in the region if the
#   center of the motif falls in the region.
# total_width: The maximum window size that can be reached for this motif:
#   rounded(sequence length - motif length + 1)/2
# sites_in_bin: The number of (positive) sequences whose best match to the motif
#   falls in the reported region.
#   Note: This number may be less than the number of
#   (positive) sequences that have a best match in the region.
#   The reason for this is that a sequence may have many matches that score
#   equally best.
#   If n matches have the best score in a sequence, 1/n is added to the
#   appropriate bin for each match.
# total_sites: The number of sequences containing a match to the motif
#   above the score threshold.
# p_success: The probability of falling in the enriched window:
#   bin width / total width
# p-value: The uncorrected p-value, before it is adjusted for the
#   number of multiple tests to give the adjusted p-value.
# mult_tests: The number of multiple tests (n) done for this motif.
#   It was used to correct the original p-value of a region for
#   multiple tests using the formula:
#     p' = 1 - (1-p)^n where p is the uncorrected p-value.
#   The number of multiple tests is the number of regions
#   considered times the number of score thresholds considered.
#   It depends on the motif length, sequence length, and the type of
#   optimizations being done (central enrichment, local enrichment,
#   score optimization).
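#
# The column descriptions above can be sketched in Python as a way to
# sanity-check a row. This is a minimal illustration, not CentriMo's own
# code: the function names are hypothetical, the binomial test is the
# default (non-discriminative) case only, sites_in_bin is treated as a
# whole number, the log is assumed to be the natural log, and the E-value
# is assumed to use the adjusted p-value. The number of motifs is not
# recorded in this file, so it is left as a parameter.

```python
import math


def p_success(bin_width: int, total_width: int) -> float:
    """Probability that a best match falls in the window: bin width / total width."""
    return bin_width / total_width


def binomial_tail(k: int, n: int, p: float) -> float:
    """One-tailed binomial p-value P(X >= k) for X ~ Binomial(n, p), 0 < p < 1.

    Terms are evaluated in log space so that very small tail
    probabilities do not lose precision through intermediate underflow.
    """
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * log_p + (n - i) * log_q)
        total += math.exp(log_term)
    return total


def adjust_p(p: float, n_tests: int) -> float:
    """p' = 1 - (1 - p)^n, computed stably for tiny p.

    A direct 1 - (1 - p) ** n returns 0.0 once p drops below
    floating-point resolution, so log1p/expm1 are used instead.
    """
    return -math.expm1(n_tests * math.log1p(-p))


def e_value(adj_p: float, n_motifs: int) -> float:
    """E-value as p-value times number of motifs (assumption: adjusted p-value)."""
    return adj_p * n_motifs


# Values copied from the MEME-1 row of the table above.
row = {"bin_width": 96, "total_width": 470,
       "sites_in_bin": 70, "total_sites": 110, "mult_tests": 234}

ps = p_success(row["bin_width"], row["total_width"])
raw_p = binomial_tail(row["sites_in_bin"], row["total_sites"], ps)
adj_p = adjust_p(raw_p, row["mult_tests"])
log_adj_p = math.log(adj_p)  # assumption: natural log

print(f"p_success={ps:.5f} p-value={raw_p:.1e} "
      f"adj_p-value={adj_p:.1e} log_adj_p-value={log_adj_p:.2f}")
```

# The printed values should land close to the p_success, p-value,
# adj_p-value and log_adj_p-value columns of that row; small differences
# are expected because the file stores rounded figures.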