Current Status


Computational Chemistry - Q2 1999 Status Report

July 1999

Computational chemistry has become one of the hottest areas of current chemical research, and now even has its own research journals in addition to the hundreds of articles that appear each month in the traditional chemistry journals. This explosion has come about because of the affordable availability of computing power that didn’t exist as little as ten years ago. The increase in computing power, along with advances in molecular visualization techniques, has made the older concepts of QSAR (Quantitative Structure-Activity Relationships) the basis for a number of successes in practical rational drug design. As astounding as these gains have been, there is an ever-expanding need for still more computing power.

The Chemistry Department at Cornell has long recognized the need for training Ph.D. students and Chemistry majors in these topics. Realistically, this kind of training can only be acquired by hands-on experience with the various techniques of computational chemistry. We have a dedicated graduate course, Chem 765, that provides just such experience. In previous years we used the Cornell University IBM SP2 supercomputer to carry out the numerically demanding ab initio calculations that are so central to this course. Unfortunately, this required the students to learn and use UNIX and vi, which for a significant fraction of the class did not work out well. It appears that many students, having grown up with graphical interfaces, just don’t relate to line-oriented operating systems and editors. The frustrations of using batch processing, with its frequent long delays, didn’t help.

This last semester we were able to do all of the course computations on our cluster of Intel Pentium workstations running under the graphical Windows NT operating system. The improvement in student response was fantastic: the course ratings went from a mediocre 3.0 (on a scale of 5) to an outstanding 4.5. The subject matter was essentially identical; the only change was the computer and its interface.

The obvious question is what kind of a hit we took on speed. As it turns out, the computational package we used, Gaussian, did not benefit from the parallelization of our supercomputer (although a version is sold that speeds up on other supercomputers), so the speed ratio came down to a single RS/6000 node against a single Intel. With the fastest Intel in our cluster, the speed ratio was recently measured to be slightly less than a factor of 2 on a job requiring 1 hour of IBM SP2 time. Better, the Intel does not have the time constraints imposed by our theory center, with the result that some of the students were able to run jobs that would have timed out on the supercomputer (one student ran a job that took 23 days).

The Intel machines in our cluster all contain two processors and, in principle, should be able to approximately double the speed of any single job. We have a site licence for the Gaussian program and for the multiprocessor Portland Group compiler that Gaussian recommends. However, we have been unable to produce an executable program because of some inconsistency between the library and the Gaussian program. The software engineers of both companies have been alerted to the problem and are working to resolve it.

We have much work left to do. We have spent most of our energy developing support packages that make the large, computationally intensive Gaussian package less frustrating to use. The issue here is that the computations can take a long time, and if a silly error is made in the input, it will take some time (typically hours) before it is discovered. To help here we have written a Windows program that allows the facile input of structural information and the immediate display of the corresponding structure. The generated molecule can be scaled and rotated in three dimensions, with the result that most of the common input errors become visually obvious (e.g., some torsional angle really should have been negative).
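To illustrate why a torsional angle with the wrong sign is easy to spot once the structure is drawn, here is a minimal sketch in Python of how one atom can be placed from a bond length, a bond angle, and a torsion. It is not the code of the Windows program described above; the function names `place_atom` and `measure_torsion` and the sample coordinates are hypothetical, chosen only for illustration.

```python
import math

def sub(u, v):
    return tuple(ui - vi for ui, vi in zip(u, v))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    n = math.sqrt(dot(u, u))
    return tuple(ui / n for ui in u)

def place_atom(a, b, c, bond, angle_deg, torsion_deg):
    """Place atom d so that |d-c| = bond, angle(b, c, d) = angle_deg,
    and torsion(a, b, c, d) = torsion_deg (hypothetical helper)."""
    theta = math.radians(angle_deg)
    phi = math.radians(torsion_deg)
    bc = unit(sub(c, b))                # direction of the b-c bond
    n = unit(cross(sub(b, a), bc))      # normal to the a-b-c plane
    m = cross(n, bc)                    # in-plane axis perpendicular to b-c
    x = -bond * math.cos(theta)
    y = bond * math.sin(theta) * math.cos(phi)
    z = bond * math.sin(theta) * math.sin(phi)
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))

def measure_torsion(a, b, c, d):
    """Signed torsion a-b-c-d in degrees, same sign convention as place_atom."""
    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = dot(b2, cross(n1, n2))
    x = math.sqrt(dot(b2, b2)) * dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Three reference atoms lying in the xy-plane (illustrative coordinates).
a, b, c = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
d_plus = place_atom(a, b, c, 1.0, 109.5, 60.0)
d_minus = place_atom(a, b, c, 1.0, 109.5, -60.0)
print(d_plus, d_minus)
```

With the three reference atoms in the xy-plane, the two results differ only in the sign of the z coordinate: flipping the sign of the torsion reflects the new atom through the plane of the first three. Both inputs look equally plausible as numbers, but the two drawn structures are visibly different.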

For the other end of the Gaussian computation we have developed a Windows program that accepts the binary output file and displays the geometrically optimized structure. When appropriate, the program can display each of the calculated vibrational modes with the atoms following their classical oscillations. The program also reads the data in the output file and massages it to yield corrected thermodynamic parameters (ΔH, ΔS, and ΔG) using appropriately scaled frequencies.
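As a rough illustration of what "using appropriately scaled frequencies" can involve, the following Python sketch applies the standard harmonic-oscillator expressions to a list of vibrational frequencies after multiplying each by a scale factor. This is an assumption about the general approach, not a description of the program above: the function name `vibrational_corrections`, the scale factor of 0.9, and the sample frequencies are all hypothetical, and only the vibrational contributions are shown.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e10     # speed of light, cm/s
K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro constant, 1/mol

def vibrational_corrections(freqs_cm, scale=1.0, temp=298.15):
    """Harmonic-oscillator vibrational terms from frequencies in cm^-1.

    Returns (zero-point energy in kJ/mol,
             thermal vibrational energy in kJ/mol,
             vibrational entropy in J/(mol K)).
    Frequencies must be real and positive.
    """
    r_gas = K_B * N_A
    zpe = e_thermal = s_vib = 0.0
    for nu in freqs_cm:
        eps = H * C * nu * scale          # energy of one quantum, J
        x = eps / (K_B * temp)            # reduced energy, dimensionless
        zpe += 0.5 * eps * N_A
        e_thermal += eps * N_A / math.expm1(x)
        s_vib += r_gas * (x / math.expm1(x) - math.log1p(-math.exp(-x)))
    return zpe / 1000.0, e_thermal / 1000.0, s_vib

# Illustrative frequencies (cm^-1) and a hypothetical scale factor.
freqs = [1650.0, 3700.0, 3800.0]
print(vibrational_corrections(freqs, scale=1.0))
print(vibrational_corrections(freqs, scale=0.9))
```

Scaling the frequencies down lowers the zero-point energy in direct proportion and raises the vibrational entropy, so the choice of scale factor feeds into all three of ΔH, ΔS, and ΔG.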

These two programs will be used for the first time this fall. Together they should relieve many hours of error-induced tedium.

Last modified on: 10/12/99