# WARNING: this file is not sorted!
# db id alt consensus E-value adj_p-value log_adj_p-value bin_location bin_width total_width sites_in_bin total_sites p_success p-value mult_tests
1 SVBCWCCCCDGGGDSY MEME-1 SVBCWCCCCDGGGDSY 2.7e-086 3.6e-089 -203.66 0.0 79 485 312 593 0.16289 1.5e-091 242
1 CCTSCTG MEME-3 CCTSCTG 8.8e-009 1.2e-011 -25.16 0.0 96 494 111 289 0.19433 4.8e-014 246
1 TYTKACASDWDYTWWWNTYTTYTCTGBRMTYKAATCSYMMHKDWATKSVT MEME-7 TYTKACASDWDYTWWWNTYTTYTCTGBRMTYKAATCSYMMHKDWATKSVT 6.6e0000 8.9e-003 -4.73 0.0 49 451 7 11 0.10865 4.0e-005 225
1 ACRCMKACTGGHWGKGACTKGWNWTWDATATTTAMCMABKTD MEME-10 ACRCMKACTGGHWGKGACTKGWNWTWDATATTTAMCMABKTD 1.9e0000 2.5e-003 -6.00 0.0 33 459 5 6 0.07190 1.1e-005 229
2 CCCHRGGG DREME-1 CCCHRGGG 3.5e-044 4.6e-047 -106.69 0.0 77 493 173 350 0.15619 1.9e-049 246
2 CAGSAGGY DREME-2 CAGSAGGY 5.8e-004 7.7e-007 -14.07 0.0 91 493 63 166 0.18458 3.1e-009 246
2 GCRCCC DREME-3 GCRCCC 3.1e-015 4.2e-018 -40.02 0.0 151 495 243 479 0.30505 1.7e-020 247
2 CHGGGA DREME-4 CHGGGA 3.8e-003 5.0e-006 -12.20 0.0 129 495 202 548 0.26061 2.0e-008 247
3 M0085_1.02 (TFAP2E)_(Mus_musculus)_(DBD_0.99) THGCCYSVGG 1.0e-004 1.4e-007 -15.79 0.0 171 491 280 595 0.34827 5.7e-010 245
3 M0422_1.02 (ZIC5)_(Mus_musculus)_(DBD_0.99) SYRGGGGGTM 9.1e-011 1.2e-013 -29.74 0.0 103 491 212 600 0.20978 5.0e-016 245
3 M0432_1.02 (ZFP161)_(Mus_musculus)_(DBD_1.00) NNCGYGCHH 3.1e0000 4.2e-003 -5.47 0.0 332 492 298 386 0.67480 1.7e-005 245
3 M1838_1.02 TFAP2A NHBDGCCYSAGGGCA 8.3e-003 1.1e-005 -11.41 0.0 232 486 340 577 0.47737 4.6e-008 242
3 M1968_1.02 EBF1 TCCCWGGGGRV 1.7e-033 2.3e-036 -82.06 0.0 92 490 248 591 0.18776 9.4e-039 244
3 M2264_1.02 (ATOH1)_(Mus_musculus)_(DBD_1.00) RCCAKCTG 5.4e0000 7.2e-003 -4.94 0.0 145 493 212 567 0.29412 2.9e-005 246
3 M4427_1.02 CTCF NYGGCCASCAGRKGGCRSYVB 1.7e-004 2.3e-007 -15.31 0.0 212 480 299 521 0.44167 9.4e-010 239
3 M4525_1.02 TFAP2C NGCCYSAGGSCANDB 1.8e-005 2.3e-008 -17.57 0.0 180 486 292 582 0.37037 9.7e-011 242
3 M4536_1.02 E2F1 VRRVRGVGCGCGCRS 2.3e-002 3.1e-005 -10.39 0.0 218 486 304 543 0.44856 1.3e-007 242
3 M4612_1.02 CTCFL CCRSCAGGGGGCGCY 1.6e-009 2.2e-012 -26.85 0.0 186 486 296 541 0.38272 9.0e-015 242
3 M5491_1.02 GLIS2 CDYYGCGGGGGGTC 8.7e-007 1.2e-009 -20.57 0.0 83 487 141 470 0.17043 4.8e-012 243
3 M5965_1.02 ZIC4 DCDCMGCGGGGGGYC 3.8e-035 5.1e-038 -85.87 0.0 124 486 295 572 0.25514 2.1e-040 242
3 M6144_1.02 TFAP2B BCCCBCRGGC 3.2e-001 4.3e-004 -7.74 0.0 73 491 132 599 0.14868 1.8e-006 245
3 M6146_1.02 TFAP2D ACGSGCCBCRGGCS 4.5e-002 6.0e-005 -9.72 0.0 125 487 180 501 0.25667 2.5e-007 243
3 M6266_1.02 GLI3 BTGGGTGGTCB 1.5e0000 2.0e-003 -6.23 0.0 96 490 147 537 0.19592 8.1e-006 244
3 M6267_1.02 GLIS3 GYGGGGGGTM 1.7e0000 2.3e-003 -6.09 0.0 129 491 201 586 0.26273 9.3e-006 245
3 M6274_1.02 HIC1 GGGKTGCCC 1.2e0000 1.6e-003 -6.44 0.0 22 492 50 570 0.04472 6.5e-006 245
3 M6322_1.02 KLF1 CAGGGTGKGGC 1.1e-001 1.5e-004 -8.83 0.0 98 490 168 589 0.20000 6.0e-007 244
3 M6326_1.02 KLF8 CMGGGKGTG 9.5e-007 1.3e-009 -20.48 0.0 98 492 184 575 0.19919 5.2e-012 245
3 M6339_1.02 MECP2 YYCCGGS 3.5e-003 4.7e-006 -12.26 0.0 134 494 188 487 0.27126 1.9e-008 246
3 M6420_1.02 PLAG1 GGRGGSMHNDVKAGGGG 8.8e0000 1.2e-002 -4.44 0.0 170 484 254 590 0.35124 4.9e-005 241
3 M6422_1.02 PLAGL1 CRGGGGGCCC 1.1e-006 1.4e-009 -20.37 0.0 211 491 332 581 0.42974 5.8e-012 245
3 M6527_1.02 TWIST1 MCCCAGGTGK 1.7e-002 2.2e-005 -10.71 0.0 95 491 147 507 0.19348 9.1e-008 245
3 M6548_1.02 ZIC1 KGGGWGGTS 5.5e-004 7.3e-007 -14.13 0.0 126 492 217 596 0.25610 3.0e-009 245
3 M6549_1.02 ZIC2 KGGGTGGTC 2.5e0000 3.3e-003 -5.72 0.0 120 492 192 598 0.24390 1.3e-005 245
3 M6550_1.02 ZIC3 BGGGTGGYC 6.0e-005 8.1e-008 -16.33 0.0 122 492 216 597 0.24797 3.3e-010 245
3 M6558_1.02 ZNF423 GCACCCTWGGGTGYC 1.0e-020 1.4e-023 -52.63 0.0 100 486 96 166 0.20576 5.7e-026 242
##
# Detailed descriptions of columns in this file:
#
# db: The name of the database (file name) that contains the motif.
# id: A name for the motif that is unique in the motif database file.
# alt: An alternate name of the motif that may be provided
#   in the motif database file.
# consensus: A consensus sequence computed from the motif.
# E-value: The expected number of motifs that would have at least one
#   region as enriched for best matches to the motif as the reported region.
#   The E-value is the adjusted p-value multiplied by the number of motifs
#   in the input database(s).
# adj_p-value: The probability that any tested region would be as enriched for
#   best matches to this motif as the reported region is.
#   By default the p-value is calculated by using the one-tailed binomial
#   test on the number of sequences with a match to the motif
#   that have their best match in the reported region, corrected for
#   the number of regions and score thresholds tested.
#   The test assumes that the probability that the best match in a sequence
#   falls in the region is the region width divided by the
#   number of places a motif
#   can align in the sequence (sequence length minus motif width plus 1).
#   When CentriMo is run in discriminative mode with a negative
#   set of sequences, the p-value of a region is calculated
#   using the Fisher exact test on the
#   enrichment of best matches in the positive sequences relative
#   to the negative sequences, corrected
#   for the number of regions and score thresholds tested.
#   The test assumes that the probability that the best match (if any)
#   falls into a given region
#   is the same for all positive and negative sequences.
# log_adj_p-value: Natural log of the adjusted p-value.
# bin_location: Location of the center of the most enriched region.
# bin_width: The width (in sequence positions) of the most enriched region.
#   A best match to the motif is counted as being in the region if the
#   center of the motif falls in the region.
# total_width: The maximum window size that can be reached for this motif:
#   rounded(sequence length - motif length + 1)/2
# sites_in_bin: The number of (positive) sequences whose best match to the motif
#   falls in the reported region.
#   Note: This number may be less than the number of
#   (positive) sequences that have a best match in the region.
#   The reason for this is that a sequence may have many matches that
#   tie for the best score.
#   If n matches have the best score in a sequence, 1/n is added to the
#   appropriate bin for each match.
# total_sites: The number of sequences containing a match to the motif
#   above the score threshold.
# p_success: The probability of falling in the enriched window:
#   bin width / total width
# p-value: The uncorrected p-value, before it is adjusted for the
#   number of multiple tests to give the adjusted p-value.
# mult_tests: The number of multiple tests (n) done for this motif.
#   It was used to correct the original p-value of a region for
#   multiple tests using the formula:
#     p' = 1 - (1-p)^n where p is the uncorrected p-value.
#   The number of multiple tests is the number of regions
#   considered times the number of score thresholds considered.
#   It depends on the motif length, sequence length, and the type of
#   optimizations being done (central enrichment, local enrichment,
#   score optimization).
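#
# The column definitions above can be checked against a single row. The
# sketch below is not part of CentriMo; it recomputes p_success, the
# one-tailed binomial p-value, and the adjusted p-value for the MEME-1 row
# (SVBCWCCCCDGGGDSY) using only the Python standard library. It assumes the
# default (non-discriminative) binomial test and treats sites_in_bin as an
# integer count; the helper name log_binomial_upper_tail is hypothetical.

```python
import math

# Values copied from the MEME-1 row (SVBCWCCCCDGGGDSY) of this file.
bin_width, total_width = 79, 485
sites_in_bin, total_sites = 312, 593
mult_tests = 242


def log_binomial_upper_tail(k, n, p):
    """Return ln P(X >= k) for X ~ Binomial(n, p).

    Terms are summed in log space (log-sum-exp) so that very small
    probabilities do not underflow to zero.
    """
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * math.log(p) + (n - i) * math.log1p(-p)
        for i in range(k, n + 1)
    ]
    top = max(log_terms)
    return top + math.log(sum(math.exp(t - top) for t in log_terms))


# p_success: bin width / total width
p_success = bin_width / total_width

# p-value: one-tailed binomial test on the best-match counts
p_value = math.exp(log_binomial_upper_tail(sites_in_bin, total_sites, p_success))

# adj_p-value: p' = 1 - (1 - p)^n, evaluated with log1p/expm1 because
# (1 - p) rounds to exactly 1.0 in floating point when p is this small.
adj_p_value = -math.expm1(mult_tests * math.log1p(-p_value))

# log_adj_p-value: natural log of the adjusted p-value
log_adj_p_value = math.log(adj_p_value)

print(f"p_success       = {p_success:.5f}")
print(f"p-value         = {p_value:.1e}")
print(f"adj_p-value     = {adj_p_value:.1e}")
print(f"log_adj_p-value = {log_adj_p_value:.2f}")
```

# Under these assumptions the printed values should land close to the
# p_success, p-value, adj_p-value and log_adj_p-value columns of that row.
# Small differences are possible because sites_in_bin can be fractional
# when several matches tie for the best score.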