Guided Tour (Part 2)

Browsing a selected cluster

With any of the access methods described above, clicking a cluster number gives you the following detailed information about the cluster:

  1. the list of members
  2. summary
  3. tree-like graphical presentation
  4. higher level constituents
  5. related clusters

List of members

In the list of members, proteins are ordered first by their order of transitivity (see below) and then by the number of new members they added to the cluster (the first and second numbers in the second column, respectively). Next to a short description of each protein, all PROSITE patterns that appear in the protein are listed (clicking a pattern name yields detailed information about the corresponding family). For detailed information on a specific protein, click its name.



The order of transitivity: The first protein is the seed, whose order of transitivity is 0. Its neighbors are of order 1. Additional proteins that are neighbors of order-1 proteins are of order 2, and so on.
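The definition above reads like a breadth-first search from the seed. The following is a minimal sketch under that assumption; the tour does not say how ProtoMap computes the value internally. The function name, the neighbor table, and the protein names P1 to P5 are hypothetical. The second sort key used in the list (number of new members added) is not modeled here.

```python
from collections import deque


def transitivity_orders(neighbors, seed):
    """Return {protein: order of transitivity}, computed breadth-first from the seed."""
    order = {seed: 0}  # the seed has order 0
    queue = deque([seed])
    while queue:
        protein = queue.popleft()
        for other in neighbors.get(protein, ()):
            if other not in order:  # first time this protein is reached
                order[other] = order[protein] + 1
                queue.append(other)
    return order


# Hypothetical cluster: P1 is the seed; all names are placeholders.
neighbors = {
    "P1": ["P2", "P3"],
    "P2": ["P1", "P4"],
    "P3": ["P1"],
    "P4": ["P2", "P5"],
    "P5": ["P4"],
}
orders = transitivity_orders(neighbors, "P1")
for protein in sorted(orders, key=lambda p: (orders[p], p)):
    print(protein, orders[protein])
```

In this toy cluster P2 and P3 are of order 1, P4 is of order 2, and P5 is of order 3.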

Summary

The summary includes the total number of members, the highest order of transitivity, and a summary of all PROSITE patterns that appear in the cluster.

Graphical representation

Figure 1. Graphical presentation of cluster

The graphical presentation places each protein at a leaf of the tree (in red). The root of the tree is the rightmost vertex (see Figure 1). The location of each internal node indicates the similarity among the proteins at the leaves of the corresponding subtree: the higher the similarity, the further to the left the node is placed.


View options:
  • You can zoom in on any part of the tree by clicking and dragging the mouse to mark the desired view. Repeated zooming is possible.
  • Hover over a vertex (without clicking) to get a summary line of all proteins descended from this vertex.
  • Click on a vertex for a detailed list of all proteins descended from the vertex (as in the figure to the right).
  • Click on an item in the list for SWISSPROT information about the selected protein.

Higher level constituents

This tool allows you to move between levels and trace the formation of clusters. Clusters that are isolated at a given level may unite when the threshold is lowered. The tool shows a graphical presentation of the clusters that were isolated at the higher confidence level and that merged to form the current cluster at the current, lower threshold, together with the connections between them (see Figure 2). In this way you can trace the formation of families out of sub-families.
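To illustrate how isolated clusters can unite when the threshold is lowered, here is a deliberately simplified sketch that treats clusters as plain connected components. This is not ProtoMap's actual algorithm, which also rejects some low-quality connections (see the section on related clusters). The function name, protein names, and scores are hypothetical. The sketch assumes that "lowering the threshold" means admitting larger e-value-like scores, from 1e-100 toward 1e-0.

```python
def clusters_at(threshold, proteins, scored_pairs):
    """Group proteins into connected components, using only pairs whose
    score (e-value-like, smaller = more confident) is <= threshold."""
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the representative, shortening the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for p, q, score in scored_pairs:
        if score <= threshold:
            parent[find(p)] = find(q)  # merge the two components

    groups = {}
    for p in proteins:
        groups.setdefault(find(p), set()).add(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical data: two tight pairs joined by one weaker connection.
proteins = ["P1", "P2", "P3", "P4"]
scored_pairs = [("P1", "P2", 1e-80), ("P3", "P4", 1e-60), ("P2", "P3", 1e-20)]

high = clusters_at(1e-50, proteins, scored_pairs)  # stricter level
low = clusters_at(1e-10, proteins, scored_pairs)   # more permissive level
print(high)
print(low)
```

At the stricter level the toy data forms two clusters; at the more permissive level the weaker connection between P2 and P3 unites them into one, which is the kind of merger the graph traces.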

Figure 2. Higher level constituents of cluster

Each circle stands for a cluster at the higher threshold level. The radius of each circle is proportional to the size of the cluster. The cluster's serial number and its size (in parentheses) appear next to the corresponding circle. The edges represent new connections between clusters that were formed when the threshold was lowered. Edge widths are proportional to the number of connections between the corresponding clusters. The numbers on either side of an edge connecting clusters A and B indicate how many different proteins in A and in B are connected. For example, the numbers on the edge that connects clusters 44 and 2399 in Figure 2 indicate that 2 proteins from cluster 44 are connected to 1 protein from cluster 2399.
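The edge labels described above can be derived from a list of protein-level connections. The sketch below is one possible way to compute them, not ProtoMap's code. The function name and the protein names A1, A2, A3, and B1 are hypothetical; only the cluster numbers 44 and 2399 and the counts 2 and 1 come from the Figure 2 example.

```python
from collections import defaultdict


def constituent_edges(cluster_of, connections):
    """Summarize protein-level connections as edges between clusters.

    Returns {(A, B): (n_connections, n_proteins_in_A, n_proteins_in_B)}
    with A < B.
    """
    pairs = defaultdict(set)
    for p, q in connections:
        a, b = cluster_of[p], cluster_of[q]
        if a == b:
            continue  # a connection inside one cluster is not an edge
        if a > b:  # store each edge under a canonical (smaller, larger) key
            a, b, p, q = b, a, q, p
        pairs[(a, b)].add((p, q))
    return {
        edge: (len(ps), len({p for p, _ in ps}), len({q for _, q in ps}))
        for edge, ps in pairs.items()
    }


# Hypothetical membership and connections mirroring the Figure 2 example.
cluster_of = {"A1": 44, "A2": 44, "A3": 44, "B1": 2399}
connections = [("A1", "B1"), ("B1", "A2")]
edges = constituent_edges(cluster_of, connections)
print(edges)
```

The first number of each tuple corresponds to what the edge width represents (number of connections); the other two are the numbers shown on either side of the edge.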



View options:
  • Click on a vertex for the list of members in the corresponding cluster.
  • Click on an edge to get the list of pairs connecting the corresponding vertices.
  • Press the 'Lower Level' button (at the bottom of the window) to move one level lower.

When you click on a vertex, a new window appears with detailed information about the corresponding cluster (see figure to the right). Besides the list of members and the summary, you can see the tree-like presentation of that cluster (disabled when the cluster has fewer than 3 members), as well as its own higher-level constituents.

This time, pressing higher-level constituents moves you up to the next building level, at which the cluster decomposes into smaller components. The button is disabled if the cluster does not change up to the level of 1e-100, or if you are already at the level of 1e-100. The new graph replaces the original graph.
You may continue in the same manner and move even higher, or you can check other components in the original graph or in the subsequent graphs. Use the buttons at the bottom of the window to move from one level to another (higher-level, lower-level) among graphs already browsed, or to go back to the original graph.

When clicking on an edge, a new window will appear, with the list of pairs connecting the corresponding vertices (clusters). The weight (confidence level) of each edge is given next to the corresponding pair (see figure to the right).

Select a pair and press 'Align' to get the pairwise alignment of the pair. The alignment will be presented schematically. Each protein is presented as a gray bar, and the shared/similar regions composing the alignment are presented as yellow segments within the gray bars. Likewise, all PROSITE patterns that appear in the sequences are presented schematically, as blue segments, to indicate their position (see figure to the left). Clicking on a blue segment in a protein will give a short description of the corresponding pattern/family.

This schematic representation visualizes the characteristics of the sequences and of the alignment. You can immediately see whether the proteins are composed of more than one domain, and whether the alignment spans the full length of both sequences or is limited to, for example, the active site region.
You can get the detailed alignment in text format by pressing 'Alignment as text'. Pressing the 'Info' button next to each bar opens a window with the SWISSPROT information about the corresponding protein.

Press the 'Lower Level' button (at the bottom of the window) to move one level lower. You will get a new graph presenting the connections created between the current cluster and other clusters when the threshold is lowered one step further. The representation is the same as before. If no other clusters are added, you will get only a single circle. If the level is 1e-0, this button is disabled. Again, you can use the buttons at the bottom of the window to move from one level to another (higher-level, lower-level) among graphs already browsed, or to go back to the original graph (original graph).

Related clusters

The clustering algorithm automatically rejects many possible connections among clusters. This happens whenever the quality associated with a connection falls below a certain threshold. Many of these rejected connections are nevertheless meaningful and reflect genuine though distant homologies. We refer to the rejected mergers as possibly related clusters.

In examining a given cluster, much insight can be gained by observing those clusters which are possibly related to it. Even though some of these connections are justifiably rejected, in particular at the lowest level of confidence we consider (1e-0), many others do reflect structural/functional similarities, despite a weak sequence similarity.

Pressing the link 'related clusters' lists the clusters that are possibly related to the cluster under study. In this list, clusters are ordered by quality of relatedness and by the number of connections (3rd and 4th columns, respectively). At this stage it is hard to give exact rules for evaluating these relations, and one's own judgment must be used. Such judgment can also take into account pairwise alignments of protein pairs, one from the cluster under study and one from a possibly related cluster. You can get the alignments by pressing the link 'List of Connections' and then pressing the 'SEE ALIGNMENTS' button at the bottom of the page.


Copyright © 2000 Golan Yona and the ProtoMap authors, email to: protomap

Last modified: Fri Apr 21 12:48:08 PDT 2000