********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 3.0 (Release date: 2003/01/01 22:28:02)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.sdsc.edu.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.sdsc.edu.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= combo_1485-001-001.implant.seq
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
1041553598-28-0          1.0000   1501  1041553598-28-1          1.0000   1501
1041553598-28-10         1.0000   1501  1041553598-28-11         1.0000   1501
1041553598-28-12         1.0000   1501  1041553598-28-13         1.0000   1501
1041553598-28-14         1.0000   1501  1041553598-28-15         1.0000   1501
1041553598-28-16         1.0000   1501  1041553598-28-17         1.0000   1501
1041553598-28-18         1.0000   1501  1041553598-28-19         1.0000   1501
1041553598-28-2          1.0000   1501  1041553598-28-3          1.0000   1501
1041553598-28-4          1.0000   1501  1041553598-28-5          1.0000   1501
1041553598-28-6          1.0000   1501  1041553598-28-7          1.0000   1501
1041553598-28-8          1.0000   1501  1041553598-28-9          1.0000   1501
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.

command: meme combo_1485-001-001.implant.seq -mod oops -nmotifs 1 -w 16

model:  mod=          oops    nmotifs=         1    evt=           inf
object function=  E-value of product of p-values
width:  minw=           16    maxw=           16    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=       20    maxsites=       20    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           30020    N=              20
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.250 C 0.252 G 0.245 T 0.253
Background letter frequencies (from dataset with add-one prior applied):
A 0.250 C 0.252 G 0.245 T 0.253
********************************************************************************


********************************************************************************
MOTIF  1	width =   16   sites =  20   llr = 243   E-value = 1.4e-010
********************************************************************************
--------------------------------------------------------------------------------
	Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  a:8:35937:4:::8:
pos.-specific     C  :3:8:1:1:86386:4
probability       G  :7:1:2:3:3::1::6
matrix            T  ::3:73143::7142:

         bits    2.0 *
                 1.8 *
                 1.6 *     *
                 1.4 *  *  *
Information      1.2 * **  *  *    *
content          1.0 ***** * ********
(17.6 bits)      0.8 ***** * ********
                 0.6 ***** * ********
                 0.4 ***** * ********
                 0.2 ****************
                 0.0 ----------------

Multilevel           AGACTAATACCTCCAG
consensus             CT AT ATGAC TTC
sequence                  G G

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
1041553598-28-7             235  5.17e-08 CTACAGTTAG ACACTGATACATCCAG GAATCGGCGG
1041553598-28-1            1218  9.14e-08 GATCGGACCC AGACATAGACATCCAG AATGAGTTGC
1041553598-28-8            1029  1.03e-07 GCGACATAGA AGACTAAATGCTCCAG GACGGCTATG
1041553598-28-14            133  1.64e-07 GATATGCAAT AGACTAATACCCGCAG GCGTTATAAG
1041553598-28-13           1431  2.58e-07 TGGCGAAGCG AGTCTAAGTCCTCCAC TGCCGACTGG
1041553598-28-5            1429  3.34e-07 CTACATATAT ACACTAAGACATCCTG GCAATGGCGT
1041553598-28-11            128  3.72e-07 GACGATCTGT ACTCTGATACCTCTAG AGCGCCGTCG
1041553598-28-3             440  4.25e-07 TTCAACCGTT AGACACAAACCTCCAC ACGTTGGGTA
1041553598-28-9              70  5.34e-07 CCTTATAGTG ACACTAAATGCTCCAC TCAGCTGGAG
1041553598-28-4            1313  6.06e-07 CCGTTTACTC ACACTAATAGCCCTAC CGTGAGCGCC
1041553598-28-10            324  6.64e-07 TTCGTCCTAC AGACAGAAACACCTAG TTCTGGGCAC
1041553598-28-17            519  1.01e-06 CCAGGTCCTT AGAGATAAACCTCTAG CAGTCCTTTA
1041553598-28-15            827  2.27e-06 GACGGTTGCA ACACTTATTGCCCTAC GTGAGCCACG
1041553598-28-16           1117  2.67e-06 ACGTTTAACA AGAGTTATTCATCCTG ATGCAGACCT
1041553598-28-2             628  2.86e-06 GATGACACAC AGTCAAACACCCCTAC TGCATGATCT
1041553598-28-0            1248  2.86e-06 GGGTAGACCG AGACTCATAGATCCTC TGCTAAGGTG
1041553598-28-19           1383  3.32e-06 CTCGTAGGAA AGAGAAATACCTTCAG TAAATCTAGA
1041553598-28-6             563  4.10e-06 CGAAATACTC AGACTATGACACCTAC AGGATAAGGT
1041553598-28-18            301  7.93e-06 TCCTTTCGGT ACTCTGACTCCTGCAG TGTCTATCCT
1041553598-28-12           1151  1.33e-05 TAGTGGATAG AGTCATAGTCATGTTG TGCTATCCAT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1041553598-28-7                   5.2e-08  234_[1]_1251
1041553598-28-1                   9.1e-08  1217_[1]_268
1041553598-28-8                     1e-07  1028_[1]_457
1041553598-28-14                  1.6e-07  132_[1]_1353
1041553598-28-13                  2.6e-07  1430_[1]_55
1041553598-28-5                   3.3e-07  1428_[1]_57
1041553598-28-11                  3.7e-07  127_[1]_1358
1041553598-28-3                   4.2e-07  439_[1]_1046
1041553598-28-9                   5.3e-07  69_[1]_1416
1041553598-28-4                   6.1e-07  1312_[1]_173
1041553598-28-10                  6.6e-07  323_[1]_1162
1041553598-28-17                    1e-06  518_[1]_967
1041553598-28-15                  2.3e-06  826_[1]_659
1041553598-28-16                  2.7e-06  1116_[1]_369
1041553598-28-2                   2.9e-06  627_[1]_858
1041553598-28-0                   2.9e-06  1247_[1]_238
1041553598-28-19                  3.3e-06  1382_[1]_103
1041553598-28-6                   4.1e-06  562_[1]_923
1041553598-28-18                  7.9e-06  300_[1]_1185
1041553598-28-12                  1.3e-05  1150_[1]_335
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 in FASTA format
--------------------------------------------------------------------------------
>1041553598-28-7 pos 235
ACACTGATACATCCAG
>1041553598-28-1 pos 1218
AGACATAGACATCCAG
>1041553598-28-8 pos 1029
AGACTAAATGCTCCAG
>1041553598-28-14 pos 133
AGACTAATACCCGCAG
>1041553598-28-13 pos 1431
AGTCTAAGTCCTCCAC
>1041553598-28-5 pos 1429
ACACTAAGACATCCTG
>1041553598-28-11 pos 128
ACTCTGATACCTCTAG
>1041553598-28-3 pos 440
AGACACAAACCTCCAC
>1041553598-28-9 pos 70
ACACTAAATGCTCCAC
>1041553598-28-4 pos 1313
ACACTAATAGCCCTAC
>1041553598-28-10 pos 324
AGACAGAAACACCTAG
>1041553598-28-17 pos 519
AGAGATAAACCTCTAG
>1041553598-28-15 pos 827
ACACTTATTGCCCTAC
>1041553598-28-16 pos 1117
AGAGTTATTCATCCTG
>1041553598-28-2 pos 628
AGTCAAACACCCCTAC
>1041553598-28-0 pos 1248
AGACTCATAGATCCTC
>1041553598-28-19 pos 1383
AGAGAAATACCTTCAG
>1041553598-28-6 pos 563
AGACTATGACACCTAC
>1041553598-28-18 pos 301
ACTCTGACTCCTGCAG
>1041553598-28-12 pos 1151
AGTCATAGTCATGTTG
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 29720 bayes= 10.5362 E= 1.4e-010
   200  -1097  -1097  -1097
 -1097     47    141  -1097
   158  -1097  -1097     -2
 -1097    175    -71  -1097
    49  -1097  -1097    136
    85   -133    -29     -2
   193  -1097  -1097   -233
     0   -133      3     66
   138  -1097  -1097     47
 -1097    157      3  -1097
    68    125  -1097  -1097
 -1097     25  -1097    147
 -1097    166    -71   -233
 -1097    125  -1097     66
   168  -1097  -1097    -34
 -1097     66    129  -1097
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
	Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 n= 29720 E= 1.4e-010
 0.999625  0.000126  0.000122  0.000126
 0.000125  0.349951  0.649798  0.000126
 0.749750  0.000126  0.000122  0.250001
 0.000125  0.849701  0.150047  0.000126
 0.349950  0.000126  0.000122  0.649801
 0.449900  0.100076  0.200022  0.250001
 0.949650  0.000126  0.000122  0.050101
 0.250000  0.100076  0.249998  0.399926
 0.649800  0.000126  0.000122  0.349951
 0.000125  0.749751  0.249998  0.000126
 0.399925  0.599826  0.000122  0.000126
 0.000125  0.299976  0.000122  0.699776
 0.000125  0.799726  0.150047  0.050101
 0.000125  0.599826  0.000122  0.399926
 0.799725  0.000126  0.000122  0.200026
 0.000125  0.399926  0.599823  0.000126
--------------------------------------------------------------------------------

Time 29.89 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
	Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
1041553598-28-0                  4.24e-03  670_[1(8.61e-05)]_561_[1(2.86e-06)]_238
1041553598-28-1                  1.36e-04  1217_[1(9.14e-08)]_268
1041553598-28-10                 9.87e-04  323_[1(6.64e-07)]_1162
1041553598-28-11                 5.53e-04  127_[1(3.72e-07)]_1358
1041553598-28-12                 1.96e-02  1150_[1(1.33e-05)]_335
1041553598-28-13                 3.84e-04  1430_[1(2.58e-07)]_55
1041553598-28-14                 2.44e-04  132_[1(1.64e-07)]_1353
1041553598-28-15                 3.37e-03  826_[1(2.27e-06)]_659
1041553598-28-16                 3.96e-03  1116_[1(2.67e-06)]_369
1041553598-28-17                 1.51e-03  518_[1(1.01e-06)]_967
1041553598-28-18                 1.17e-02  300_[1(7.93e-06)]_1185
1041553598-28-19                 4.92e-03  996_[1(9.71e-05)]_370_[1(3.32e-06)]_103
1041553598-28-2                  4.24e-03  627_[1(2.86e-06)]_858
1041553598-28-3                  6.31e-04  439_[1(4.25e-07)]_1007_[1(7.67e-05)]_23
1041553598-28-4                  9.00e-04  1312_[1(6.06e-07)]_173
1041553598-28-5                  4.96e-04  1428_[1(3.34e-07)]_57
1041553598-28-6                  6.07e-03  562_[1(4.10e-06)]_923
1041553598-28-7                  7.69e-05  234_[1(5.17e-08)]_1251
1041553598-28-8                  1.53e-04  931_[1(5.78e-05)]_81_[1(1.03e-07)]_457
1041553598-28-9                  7.93e-04  69_[1(5.34e-07)]_281_[1(6.86e-05)]_1119
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 1 reached.
********************************************************************************

CPU: blarg

********************************************************************************