GIMSAN (GIbbsMarkov with Significance ANalysis) is a web-server tool for de novo motif discovery.

It is also available as a stand-alone application for Unix and PBS (Portable Batch System) clusters:

Download GIMSAN

(2010/05/02) Column pair dependency added to Unix version

Frequently Asked Questions (FAQ)

Please send any additional questions, comments, or suggestions to ppn3 "at"

GIMSAN input options
Options | Cmdline | Description
input FASTA | --f | A set of sequences in FASTA format for de novo motif discovery.
background model | --bg | File (FASTA format) for background model estimation, for example a set of S. cerevisiae intergenic sequences. This data is used to generate null sets of sequences that preserve the dimensions and local GC-content of the input set, and to estimate the background model for the de novo motif-finding task. Note: It is recommended that the user either "upload your own genomic file" or use "one of our standard genomic files".
motif widths | --w | The user can specify a range of motif widths (e.g. {8,14,20,30}). Once the GIMSAN job has completed, the user can select the optimal width by choosing the motif with the lowest p-value (i.e. highest significance).
size of nullset | --nullset | Size of the randomly drawn null set used to estimate the motif-finder's null distribution based on a 3-Gamma approximation. A larger null set gives a more accurate p-value at the expense of a longer runtime.
number of processors | | The number of processors to allocate on the computer cluster. All requested processors are allocated before the job starts executing, so it is recommended that this parameter be set to less than 10.
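
The width-selection rule described under the motif widths option (choose the motif with the lowest p-value) could be sketched as follows. This is only a minimal illustration in Python, assuming the per-width p-values have already been read from the GIMSAN output by some other means. The function name `select_optimal_width` and the p-values in the example dictionary are hypothetical and are not produced by GIMSAN.

```python
def select_optimal_width(pvalues_by_width):
    """Return the (width, p-value) pair with the lowest p-value.

    pvalues_by_width maps each motif width to the p-value reported
    for the motif found at that width (hypothetical input format).
    """
    if not pvalues_by_width:
        raise ValueError("no widths to choose from")
    # Lowest p-value means highest significance; ties go to the first width seen.
    return min(pvalues_by_width.items(), key=lambda item: item[1])


# Hypothetical p-values for the example widths {8,14,20,30}; not real results.
example_pvalues = {8: 0.21, 14: 0.003, 20: 0.047, 30: 0.35}
best_width, best_p = select_optimal_width(example_pvalues)
print(best_width, best_p)
```

With these made-up values the sketch picks width 14, since it has the smallest p-value of the four.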

GIMSAN sample output on FHL1 motif

Commercial use of GIMSAN without written permission from the authors is prohibited.

If you use this program in your research, please cite:

The motif significance analysis approach is described in detail in:

If you use the sequence logos from GIMSAN in your research, please cite WebLogo.

We thank Robert Bukowski for deploying GIMSAN as a web application. GIMSAN is based upon work supported by the National Science Foundation under Grant No. 0644136.

Please send any questions, comments, or suggestions to ppn3 "at"