Guided Tour (Part 1)

Entering the analyzed database

As mentioned in the introduction, the analysis was carried out at varying thresholds, or levels of confidence, ranging from 1e-100 (very high confidence) to 1e-0 = 1 (similarities that are almost pure chance).
At each level, the universe of all proteins splits into clusters, which get larger and coarser as the confidence level decreases. Each cluster is given a number. Clusters are ordered by decreasing size.
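
To illustrate how clusters grow as the threshold is relaxed, here is a minimal sketch. It groups proteins by plain connected components over pairwise similarities, which is a stand-in chosen for brevity and not the ProtoMap clustering procedure. The function name, the input format, and numbering the clusters from 1 are all assumptions.

```python
def clusters_at_level(proteins, similarities, threshold):
    """Group proteins linked by any similarity with e-value <= threshold.

    Plain connected components (union-find), used only as an illustration;
    this is NOT the ProtoMap clustering procedure.
    """
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the root, shortening the path on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b, evalue in similarities:
        if evalue <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for p in proteins:
        groups.setdefault(find(p), []).append(p)

    # Assumption: clusters are numbered from 1 in decreasing order of size.
    ordered = sorted(groups.values(), key=len, reverse=True)
    return {number: sorted(members)
            for number, members in enumerate(ordered, start=1)}


# Toy input: protein labels and e-values are invented for illustration.
toy_proteins = ["A", "B", "C", "D"]
toy_similarities = [("A", "B", 1e-120), ("B", "C", 1e-30), ("C", "D", 1e-2)]

for level in (1e-100, 1e-10, 1e-0):
    print(level, clusters_at_level(toy_proteins, toy_similarities, level))
```

With this toy input the strictest level keeps only A and B together, and the loosest level merges all four proteins into one cluster.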


Selecting the entrance level

The default entrance level is 1e-0, but many interesting relations are revealed at higher confidence levels as well. In any case, your choice of entrance level does not limit your search as your analysis advances: you can move from one level to another, track the fusion of sub-families into larger families, and identify links between protein families.


Search

Search is ALWAYS performed at a specific level (see above).

Search options:
  1. by size (all clusters of size n where: n1 <= n <= n2)
  2. by keyword (e.g. toxin, atpase, coiled coil, duplication ...)
  3. by SWISSPROT accession number (or ID)
  4. by protein name (e.g. synaptotagmin, phosphofructokinase, brevinin, dynamin ...)
  5. by cluster number

By size

You can select all clusters whose size lies between n1 and n2 (by default, all clusters with more than 100 members).
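
The size filter can be sketched as follows. The function name and the dictionary mapping cluster numbers to sizes are hypothetical; the defaults mimic the stated default of "more than 100 members", with no upper bound.

```python
def select_by_size(cluster_sizes, n1=101, n2=None):
    """Return the cluster numbers whose size n satisfies n1 <= n <= n2.

    n2=None means no upper bound, so the defaults select every cluster
    with more than 100 members.
    """
    return [number for number, size in cluster_sizes.items()
            if size >= n1 and (n2 is None or size <= n2)]


# Toy sizes, invented for illustration (cluster number -> member count).
toy_sizes = {1: 250, 2: 101, 3: 100, 4: 12}

print(select_by_size(toy_sizes))           # default: more than 100 members
print(select_by_size(toy_sizes, 10, 100))  # explicit range 10 <= n <= 100
```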
By keyword

A search by keyword(s) yields all clusters matching the keyword(s).

Examples:
  • toxin
  • MHC III
  • atp-binding AND transmembrane
  • g-protein coupled receptor OR photoreceptor
  • (dna-binding OR rna-binding) AND 3d-structure

You can use the logical connectives AND, OR, NOT, and XOR to create a complex query, and parentheses to define precedence.
The search is not case sensitive (the logical connectives need not be in upper case either).

Globbing patterns (with the '*' and '?' characters) are accepted, for example:
  • struc*
  • ?TP-binding
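
A keyword query of this kind can be evaluated with a small recursive-descent parser. The sketch below is only an illustration of the query syntax, not the site's implementation. It assumes that the connectives bind in the order NOT, AND, XOR, OR (tightest first), that consecutive words form a single keyword phrase, and that a pattern must match a whole keyword; none of these details are stated above.

```python
import re
from fnmatch import fnmatchcase

CONNECTIVES = {"and", "or", "not", "xor"}


def tokenize(query):
    """Split a query into parentheses, connectives, and keyword phrases."""
    tokens, phrase = [], []
    for word in re.findall(r"[()]|[^\s()]+", query.lower()):
        if word in CONNECTIVES or word in ("(", ")"):
            if phrase:  # close the keyword phrase collected so far
                tokens.append(("kw", " ".join(phrase)))
                phrase = []
            tokens.append(("op", word))
        else:
            phrase.append(word)
    if phrase:
        tokens.append(("kw", " ".join(phrase)))
    return tokens


def matches(query, cluster_keywords):
    """Return True if the cluster's keywords satisfy the query."""
    tokens = tokenize(query)
    keywords = [k.lower() for k in cluster_keywords]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():
        value = parse_xor()
        while peek() == ("op", "or"):
            advance()
            rhs = parse_xor()  # always parse the right side first
            value = value or rhs
        return value

    def parse_xor():
        value = parse_and()
        while peek() == ("op", "xor"):
            advance()
            rhs = parse_and()
            value = value != rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == ("op", "and"):
            advance()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == ("op", "not"):
            advance()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        kind, text = peek()
        if (kind, text) == ("op", "("):
            advance()
            value = parse_or()
            if peek() != ("op", ")"):
                raise ValueError("missing closing parenthesis")
            advance()
            return value
        if kind == "kw":
            advance()
            # Glob match ('*' and '?') against each whole keyword.
            return any(fnmatchcase(k, text) for k in keywords)
        raise ValueError(f"unexpected token: {text!r}")

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return result


# Toy keyword list for one cluster, invented for illustration.
toy_keywords = ["ATP-binding", "Transmembrane", "Structural protein"]
print(matches("atp-binding AND transmembrane", toy_keywords))
print(matches("(dna-binding OR rna-binding) AND 3d-structure", toy_keywords))
```

Lowercasing both the query and the keywords before matching is what makes the search, including the connectives, case insensitive.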
Figure 1. The results of the search on keyword "globin"

The search outputs a table of all clusters matching your query. Each cluster is listed with its number (clusters are ordered by size), its size, and the keywords associated with it. Clicking on one of the keywords starts a new search with that word as the keyword.

By SWISSPROT accession number

You can type the accession number (or the ID) of a specific protein from the SWISSPROT database.

For example:
  • P04567
  • fib_rat

The search yields a card with all the information available about the protein and the specific cluster which contains the selected protein.
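
Since the same input field accepts either an accession number or an ID, a client-side check might distinguish the two as sketched below. The regular expressions are assumptions based on the examples above (an accession such as P04567, an ID such as fib_rat or BLA3_KLEPN) and are not taken from the site; accepting lower-case input is likewise inferred from the fib_rat example.

```python
import re

# Assumed shapes, inferred from the examples in the text:
#   accession: one of O/P/Q, a digit, three alphanumerics, a digit (P04567)
#   ID:        protein mnemonic, underscore, species code (BLA3_KLEPN)
ACCESSION = re.compile(r"[OPQ][0-9][A-Z0-9]{3}[0-9]")
ENTRY_ID = re.compile(r"[A-Z0-9]{1,5}_[A-Z0-9]{1,5}")


def classify_query(text):
    """Return ("accession" | "id" | "unknown", normalized upper-case text)."""
    value = text.strip().upper()
    if ACCESSION.fullmatch(value):
        return ("accession", value)
    if ENTRY_ID.fullmatch(value):
        return ("id", value)
    return ("unknown", value)


for query in ("P04567", "fib_rat", "BLA3_KLEPN"):
    print(query, classify_query(query))
```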

Figure 2. The results of the search for protein ID BLA3_KLEPN

The card offers the following information:
  • the SWISSPROT documentation.
  • all PROSITE patterns that appear in the sequence (clicking a pattern name yields detailed information about the corresponding family).
  • the cluster that contains the selected protein.
  • the keywords associated with this cluster.
  • the full list of neighbors, combined from all three sequence comparison methods (SW, BLAST, and FASTA) and corrected to a single reference line.

By protein name

You can type a protein name (e.g. synaptotagmin, brevinin, dynamin) or a word or substring that is part of a protein definition.

Figure 3. The results of the search on the string 'synap'

The search yields a table of the proteins in the SWISSPROT database that match this name, together with their corresponding clusters.

Globbing patterns (with the '*' and '?' characters) are accepted, e.g. synapto*
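
The name search can be sketched in the same spirit. The assumptions here are that a plain string is treated as a substring of the protein definition, that a globbing pattern is matched against each word of the definition, and that matching ignores case. The record layout and the toy entries, including their IDs and cluster numbers, are invented for illustration.

```python
from fnmatch import fnmatchcase


def name_matches(pattern, definition):
    """Substring match for plain text, per-word glob match for patterns."""
    pattern, definition = pattern.lower(), definition.lower()
    if "*" in pattern or "?" in pattern:
        return any(fnmatchcase(word, pattern) for word in definition.split())
    return pattern in definition


def search_by_name(pattern, proteins):
    """Return (protein ID, cluster number) for each matching protein."""
    return [(protein_id, cluster)
            for protein_id, definition, cluster in proteins
            if name_matches(pattern, definition)]


# Toy records: (ID, definition, cluster number); all values are made up.
toy_records = [
    ("TOY_A", "Synaptotagmin I", 12),
    ("TOY_B", "Synapsin II", 40),
    ("TOY_C", "Dynamin-1", 7),
]

print(search_by_name("synap", toy_records))
print(search_by_name("synapto*", toy_records))
```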

By cluster number

You can access a specific cluster directly by typing its number.

Copyright © 2000 Golan Yona and the ProtoMap authors,
email to: protomap


Last modified: Fri Apr 21 12:48:08 PDT 2000