********************************************************************************
MAST - Motif Alignment and Search Tool
********************************************************************************
  MAST version 5.0.0 (Release date: Wed Oct 11 17:39:42 2017 -0700)

  For further information on how to interpret these results or to get
  a copy of the MAST software please access http://meme-suite.org .
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
  If you use this program in your research, please cite:

  Timothy L. Bailey and Michael Gribskov,
  "Combining evidence using p-values: application to sequence homology
  searches", Bioinformatics, 14(1):48-54, 1998.
********************************************************************************


********************************************************************************
DATABASE AND MOTIFS
********************************************************************************
  DATABASE crp0.s (nucleotide)
  Last updated on Tue Feb 2 11:33:11 2016
  Database contains 18 sequences, 1890 residues

  Scores for positive and reverse complement strands are combined.

  MOTIFS meme.crp0.cd.zoops.txt (nucleotide)
  MOTIF ID            ALT ID WIDTH BEST POSSIBLE MATCH
  ----- ------------- ------ ----- -------------------
      1 RWAWRYKGWGKGR MEME-1    13 GAAAGCGGAGGGG
      2 ATTCCTDA      MEME-2     8 ATTCCTAA

  PAIRWISE MOTIF CORRELATIONS:
  MOTIF     1
  ----- -----
      2  0.26
  No overly similar pairs (correlation > 0.60) found.
  Random model letter frequencies (from non-redundant database):
  A 0.274 C 0.225 G 0.225 T 0.274
********************************************************************************


********************************************************************************
SECTION I: HIGH-SCORING SEQUENCES
********************************************************************************
  - Each of the following 12 sequences has E-value less than 10.
  - The E-value of a sequence is the expected number of sequences
    in a random database of the same size that would match the motifs as
    well as the sequence does and is equal to the combined p-value of the
    sequence times the number of sequences in the database.
  - The combined p-value of a sequence measures the strength of the
    match of the sequence to all the motifs and is calculated by
      o finding the score of the single best match of each motif
        to the sequence (best matches may overlap),
      o calculating the sequence p-value of each score,
      o forming the product of the p-values,
      o taking the p-value of the product.
  - The sequence p-value of a score is defined as the
    probability of a random sequence of the same length containing
    some match with as good or better a score.
  - The score for the match of a position in a sequence to a motif
    is computed by summing the appropriate entry from each
    column of the position-dependent scoring matrix that represents
    the motif.
  - Sequences shorter than one or more of the motifs are skipped.
  - The table is sorted by increasing E-value.
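
  The two calculations described above can be sketched in Python. This is
  an illustration, not MAST's own code. product_pvalue uses the standard
  closed form for the distribution of a product of n independent uniform
  p-values; that MAST evaluates it in exactly this way is an assumption.
  The function names are hypothetical. e_value applies the definition
  given above (combined p-value times number of sequences), using the
  18 sequences reported for crp0.s.

```python
import math


def product_pvalue(pvalues):
    """P-value of the product of n independent uniform(0,1) p-values.

    Uses P(product <= x) = x * sum_{i=0}^{n-1} (-ln x)^i / i!
    Assumption: the per-motif sequence p-values are independent.
    """
    n = len(pvalues)
    prod = math.prod(pvalues)
    if prod <= 0.0:
        return 0.0
    neg_log = -math.log(prod)
    term = 1.0   # (-ln x)^i / i! for i = 0
    total = 1.0
    for i in range(1, n):
        term *= neg_log / i
        total += term
    return min(1.0, prod * total)


def e_value(combined_p, n_sequences):
    """E-value as defined in Section I: combined p-value times database size."""
    return combined_p * n_sequences


# Values for the sequence 'lac' taken from this report.
print(round(e_value(2.33e-03, 18), 3))  # 0.042
```

  With a single motif the product p-value reduces to the sequence p-value
  itself; with more motifs the correction term grows, which is why a small
  raw product does not translate directly into an equally small combined
  p-value.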
********************************************************************************

SEQUENCE NAME   DESCRIPTION   E-VALUE   LENGTH
-------------   -----------   --------  ------
lac             9 80             0.042     105
pbr322          53               0.052     105
ilv             39               0.075     105
malk            29 61             0.23     105
bglr1           76                0.59     105
uxu1            17                 1.2     105
ompa            48                 1.7     105
deop2           7 60               2.2     105
male            14                 2.6     105
ara             17 55                5     105
malt            41                 7.6     105
cya             50                 9.1     105

********************************************************************************


********************************************************************************
SECTION II: MOTIF DIAGRAMS
********************************************************************************
  - The ordering and spacing of all non-overlapping motif occurrences
    are shown for each high-scoring sequence listed in Section I.
  - A motif occurrence is defined as a position in the sequence whose
    match to the motif has POSITION p-value less than 0.0001.
  - The POSITION p-value of a match is the probability of
    a single random subsequence of the length of the motif
    scoring at least as well as the observed match.
  - For each sequence, all motif occurrences are shown unless there
    are overlaps.  In that case, a motif occurrence is shown only if its
    p-value is less than the product of the p-values of the other
    (lower-numbered) motif occurrences that it overlaps.
  - The table also shows the E-value of each sequence.
  - Spacers and motif occurrences are indicated by
      o -d-    `d' residues separate the end of the preceding motif
               occurrence and the start of the following motif occurrence
      o [sn]   occurrence of motif `n' with p-value less than 0.0001.
               A minus sign indicates that the occurrence is on the
               reverse complement strand.
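
  The diagram notation above can be decoded mechanically. The sketch
  below is a hypothetical helper, not part of MAST. It walks a diagram
  string such as 67-[+1]-25, adds up spacers and motif widths, and
  reports the 1-based start of each occurrence. The motif widths are the
  ones listed in the MOTIFS table of this report; the function and
  variable names are illustrative assumptions.

```python
import re

# Motif widths from the MOTIFS table of this report.
MOTIF_WIDTHS = {1: 13, 2: 8}

# Either a bracketed occurrence like [+1] / [-2], or a bare spacer length.
_TOKEN = re.compile(r"\[([+-]?)(\d+)\]|(\d+)")


def parse_diagram(diagram, widths):
    """Return (occurrences, total_length) for a MAST motif diagram.

    Each occurrence is (motif_number, strand_sign, start), start 1-based.
    """
    pos = 0
    occurrences = []
    for m in _TOKEN.finditer(diagram):
        if m.group(3) is not None:
            pos += int(m.group(3))          # spacer of d residues
        else:
            motif = int(m.group(2))
            occurrences.append((motif, m.group(1), pos + 1))
            pos += widths[motif]            # skip over the motif itself
    return occurrences, pos


print(parse_diagram("67-[+1]-25", MOTIF_WIDTHS))
print(parse_diagram("53-[-2]-44", MOTIF_WIDTHS))
```

  A useful sanity check is that the total length returned for every
  diagram equals the sequence length in the table, 105 for all
  sequences here.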
********************************************************************************

SEQUENCE NAME   E-VALUE   MOTIF DIAGRAM
-------------   --------  -------------
lac                0.042  67-[+1]-25
pbr322             0.052  47-[+1]-45
ilv                0.075  50-[-1]-42
malk                0.23  53-[-2]-44
bglr1               0.59  50-[+2]-47
uxu1                 1.2  105
ompa                 1.7  105
deop2                2.2  105
male                 2.6  105
ara                    5  105
malt                 7.6  105
cya                  9.1  105

********************************************************************************


********************************************************************************
SECTION III: ANNOTATED SEQUENCES
********************************************************************************
  - The positions and p-values of the non-overlapping motif occurrences
    are shown above the actual sequence for each of the high-scoring
    sequences from Section I.
  - A motif occurrence is defined as a position in the sequence whose
    match to the motif has POSITION p-value less than 0.0001 as
    defined in Section II.
  - For each sequence, the first line specifies the name of the sequence.
  - The second (and possibly more) lines give a description of the
    sequence.
  - Following the description line(s) is a line giving the length,
    combined p-value, and E-value of the sequence as defined in Section I.
  - The next line reproduces the motif diagram from Section II.
  - The entire sequence is printed on the following lines.
  - Motif occurrences are indicated directly above their positions in the
    sequence on lines showing
      o the motif number of the occurrence (a minus sign indicates that
        the occurrence is on the reverse complement strand),
      o the position p-value of the occurrence,
      o the best possible match to the motif (or its reverse complement), and
      o columns whose match to the motif has a positive score (indicated
        by a plus sign).
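
  For occurrences on the reverse complement strand, the annotation shows
  the reverse complement of the best possible match. A minimal sketch of
  that transformation for the plain ACGT alphabet follows; the function
  name is hypothetical, and ambiguity codes such as those in the motif
  IDs are deliberately not handled.

```python
# Translation table for the four unambiguous DNA letters.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Best possible matches from the MOTIFS table of this report.
print(reverse_complement("GAAAGCGGAGGGG"))  # CCCCTCCGCTTTC
print(reverse_complement("ATTCCTAA"))       # TTAGGAAT
```

  The two results are the strings shown above the [-1] occurrence in ilv
  and the [-2] occurrence in malk.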
********************************************************************************


lac
  9 80
  LENGTH = 105  COMBINED P-VALUE = 2.33e-03  E-VALUE = 0.042
  DIAGRAM: 67-[+1]-25

                                                                        [+1]
                                                                        4.5e-06
                                                                        GAAAGCGG
                                                                        ++++++++
1    AACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTG

     AGGGG
     +++++
76   TGTGGAATTGTGAGCGGATAACAATTTCAC


pbr322
  53
  LENGTH = 105  COMBINED P-VALUE = 2.91e-03  E-VALUE = 0.052
  DIAGRAM: 47-[+1]-45

                                                    [+1]
                                                    4.5e-06
                                                    GAAAGCGGAGGGG
                                                    +++++++++++++
1    CTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATGCGGTGTGAAATACCGCACAGATG


ilv
  39
  LENGTH = 105  COMBINED P-VALUE = 4.18e-03  E-VALUE = 0.075
  DIAGRAM: 50-[-1]-42

                                                       [-1]
                                                       4.5e-06
                                                       CCCCTCCGCTTTC
                                                       +++++++++++++
1    GCTCCGGCGGGGTTTTTTGTTATCTGCAATTCAGTACAAAACGTGATCAACCCCTCAATTTTCCCTTTGCTGAAA


malk
  29 61
  LENGTH = 105  COMBINED P-VALUE = 1.26e-02  E-VALUE = 0.23
  DIAGRAM: 53-[-2]-44

                                                          [-2]
                                                          4.3e-05
                                                          TTAGGAAT
                                                          ++++++++
1    GGAGGAGGCGGGAGGATGAGAACACGGCTTCTGTGAACTAAACCGAGGTCATGTAAGGAATTTCGTGATGTTGCT


bglr1
  76
  LENGTH = 105  COMBINED P-VALUE = 3.27e-02  E-VALUE = 0.59
  DIAGRAM: 50-[+2]-47

                                                       [+2]
                                                       4.3e-05
                                                       ATTCCTAA
                                                       ++++++++
1    ACAAATCCCAATAACTTAATTATTGGGATTTGTTATATATAACTTTATAAATTCCTAAAATTACACAAAGTTAAT


uxu1
  17
  LENGTH = 105  COMBINED P-VALUE = 6.55e-02  E-VALUE = 1.2
  DIAGRAM: 105


ompa
  48
  LENGTH = 105  COMBINED P-VALUE = 9.55e-02  E-VALUE = 1.7
  DIAGRAM: 105


deop2
  7 60
  LENGTH = 105  COMBINED P-VALUE = 1.21e-01  E-VALUE = 2.2
  DIAGRAM: 105


male
  14
  LENGTH = 105  COMBINED P-VALUE = 1.46e-01  E-VALUE = 2.6
  DIAGRAM: 105


ara
  17 55
  LENGTH = 105  COMBINED P-VALUE = 2.78e-01  E-VALUE = 5
  DIAGRAM: 105


malt
  41
  LENGTH = 105  COMBINED P-VALUE = 4.21e-01  E-VALUE = 7.6
  DIAGRAM: 105


cya
  50
  LENGTH = 105  COMBINED P-VALUE = 5.06e-01  E-VALUE = 9.1
  DIAGRAM: 105

********************************************************************************


CPU: Timothys-iMac.rd.unr.edu
Time 0.004 secs.

mast -oc results/mast7 -nostatus meme/meme.crp0.cd.zoops.txt common/crp0.s