Computer Architecture and VLSI

Research in architecture and VLSI is part of the Computer Systems Laboratory. Computer Systems research at Cornell encompasses both experimental and theoretical work growing out of topics in computer architecture, parallel computer architecture, operating systems and compilers, computer protocols and networks, programming languages and environments, distributed systems, VLSI design, and system specification and verification.

Faculty members with primary interests in the architecture and VLSI area include:

Martin Burtscher

Professor Burtscher's research interests are in high-performance microprocessor architecture, instruction-level parallelism, and compiler optimizations. His work in the area of load-value prediction includes the development of novel confidence estimators to reduce the misprediction rate, a design approach that increases the number of correct predictions through enhanced hardware utilization, detailed analyses of hybrid predictors to discover well-complementing components, and several approaches to substantially decrease the size of hybrid value predictors without impacting their performance.

One result of this research is a very small yet high-performing load-value predictor that outperforms other predictors from the literature by fifteen to thirty percent over a wide range of sizes. Even in its smallest examined configuration, with about fifteen kilobytes of state, it surpasses the speedups delivered by other predictors four times its size.
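To make the role of a confidence estimator concrete, the following is a minimal sketch of a last-value load predictor guarded by saturating confidence counters. It is an illustration only, not Professor Burtscher's predictor design: the class name, table size, threshold, and indexing by program counter are all assumptions chosen for brevity.

```python
class LastValuePredictor:
    """Illustrative last-value load predictor (hypothetical, not the published design).

    Each table entry stores the last value loaded and a saturating counter.
    A prediction is only issued once the counter reaches the threshold.
    """

    def __init__(self, entries=1024, threshold=2, max_conf=3):
        self.entries = entries      # assumed table size
        self.threshold = threshold  # assumed confidence needed to predict
        self.max_conf = max_conf    # counter saturates at this value
        self.values = [0] * entries
        self.conf = [0] * entries

    def _index(self, pc):
        # Simple modulo indexing by the load instruction's address.
        return pc % self.entries

    def predict(self, pc):
        """Return the predicted value, or None when confidence is too low."""
        i = self._index(pc)
        if self.conf[i] >= self.threshold:
            return self.values[i]
        return None

    def update(self, pc, actual):
        """Train the entry with the value the load actually returned."""
        i = self._index(pc)
        if self.values[i] == actual:
            self.conf[i] = min(self.conf[i] + 1, self.max_conf)
        else:
            self.conf[i] = 0
            self.values[i] = actual


def run(trace, predictor):
    """Replay (pc, value) pairs and count correct, wrong, and withheld predictions."""
    correct = wrong = skipped = 0
    for pc, value in trace:
        guess = predictor.predict(pc)
        if guess is None:
            skipped += 1
        elif guess == value:
            correct += 1
        else:
            wrong += 1
        predictor.update(pc, value)
    return correct, wrong, skipped
```

Raising the threshold trades coverage for accuracy: fewer predictions are attempted, but a smaller fraction of them are wrong, which matters when each misprediction carries a recovery cost.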

Dr. Burtscher is also investigating how compile-time classification can be used to mitigate the performance impact of long-latency CPU events. By making important decisions at compile time in software rather than at runtime in hardware, it is also possible to reduce the power consumption of CPU components and to make them smaller and less complex without negatively affecting their performance.

Finally, Dr. Burtscher is conducting joint work with Professor Johannes Gehrke in the Computer Science department at Cornell on developing new prediction and confidence-estimation algorithms. In addition to speeding up processors, such algorithms can be used to efficiently compress the large trace files that typically result from the simulation of complex computer systems. A preliminary algorithm achieves a compression ratio that is 1.6 times better than that of UNIX compress or WinZip.
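The general idea of using a predictor for trace compression can be sketched as follows: whenever the predictor guesses the next trace value correctly, a short "hit" marker is stored in place of the value, and the decompressor recovers the value by running the same predictor. The sketch below assumes a trace of (pc, value) records and a simple per-pc last-value predictor. It is not the preliminary algorithm mentioned above, and the names `compress`, `decompress`, and `HIT` are hypothetical.

```python
HIT = object()  # marker standing in for "the predictor was right" (hypothetical encoding)


def compress(trace):
    """Replace each correctly predicted value with the HIT marker."""
    last = {}  # per-pc last value, acting as the predictor state
    encoded = []
    for pc, value in trace:
        if pc in last and last[pc] == value:
            encoded.append((pc, HIT))
        else:
            encoded.append((pc, value))
        last[pc] = value
    return encoded


def decompress(encoded):
    """Rebuild the trace by replaying the same predictor on the encoded stream."""
    last = {}
    trace = []
    for pc, token in encoded:
        value = last[pc] if token is HIT else token
        trace.append((pc, value))
        last[pc] = value
    return trace
```

In a real compressor the markers would be packed into a few bits and the residual stream passed to a general-purpose coder; the sketch only shows why a more accurate predictor leaves fewer literal values to store.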

Mark Heinrich

Professor Heinrich's research centers on flexible architectures for data-intensive computing, particularly exploring the ability to embed processing capability in the memory and I/O subsystems of traditional computer systems. This research has led to the development of three systems that take advantage of computational resources in places other than the CPU: active memory systems, active memory clusters, and active I/O.

Professor Heinrich's work in active memory systems has led to a novel two-level design. It combines an active memory controller, which leverages the cache coherence protocol to allow application address re-mapping that improves cache behavior, with active memory elements that can assist the active controller in performing data-intensive operations in the memory system itself. Initial performance results show uniprocessor speedups between 1.4 and 2.3 on a range of codes that perform matrix transposes, sparse matrix operations, or repeated linked-list traversals.

In collaborative work with Professor Evan Speight, Dr. Heinrich is currently developing a new memory controller architecture that, combined with emerging network technology from industry, merges the research ideas of hardware distributed shared memory, active memory systems, and clusters to provide inexpensive hardware shared memory systems built from industry-standard components. Preliminary results indicate performance comparable to that of the Origin 2000 machine available from SGI at a fraction of the cost.

In the area of I/O subsystems, Professor Heinrich is currently developing an active I/O architecture that adds programmability to the I/O system and the ability to perform computation and filtering there on behalf of the CPU.

Rajit Manohar

Professor Manohar's research is concerned with the design of efficient asynchronous (clockless) computation structures in VLSI, and the use of formal methods to guarantee the correctness of such structures.

In work on formal methods, Professor Manohar has developed new techniques to analyze the correctness of a class of program transformations commonly used in asynchronous VLSI synthesis. The goal is a top-down design methodology that yields a proof of correctness of the final circuit implementation without incurring the overhead of verification.

The amount of power required by a processor is quickly becoming a design constraint. Professor Manohar's group is working on a low-energy asynchronous processor architecture that uses a number of novel adaptive techniques to minimize power consumption. Recent work by the group has shown how to design asynchronous pipelines that are both throughput- and energy-optimal.

There is a remarkable similarity between the design of asynchronous VLSI systems and networks because they are both event-driven. Professor Manohar's group is working on modelling computer networks in silicon, aiming to develop a hardware simulation infrastructure that can simulate wireless networks many orders of magnitude faster than real-time.
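As a software illustration of what "event-driven" means here, the sketch below simulates a message flooding through a small network by processing timestamped events in order. It is a generic discrete-event loop and says nothing about how the group's hardware infrastructure is built; the function name `first_arrivals` and the link-table format are assumptions.

```python
import heapq


def first_arrivals(links, source):
    """Flood a message from `source` and return each node's first arrival time.

    `links` maps a node to a list of (neighbor, latency) pairs (assumed format).
    Nothing happens between events: the loop jumps from one timestamp to the next.
    """
    arrivals = {}
    queue = [(0.0, source)]  # pending events as (time, node)
    while queue:
        time, node = heapq.heappop(queue)  # earliest pending event
        if node in arrivals:
            continue  # the node already received the message earlier
        arrivals[node] = time
        for neighbor, latency in links.get(node, []):
            if neighbor not in arrivals:
                # Forwarding schedules a future delivery event at the neighbor.
                heapq.heappush(queue, (time + latency, neighbor))
    return arrivals
```

A software simulator like this handles one event at a time, whereas independent events in an asynchronous circuit can proceed concurrently, which is one reason a silicon model is attractive for simulation speed.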

Evan Speight

Professor Speight's work centers on the development of novel runtime systems for distributed computing. Specific current research topics include the development of the Bifrost system for ubiquitous access to personal data, improved runtime environments for message-passing parallel computing, research into the development of active memory clusters with Professor Mark Heinrich, and investigation into operating system support for fault-tolerant software distributed shared memory systems.

The Bifrost system currently being developed at Cornell seeks to provide ubiquitous and intelligent access to personal data. One research topic in the development of Bifrost focuses on the infrastructure necessary to allow seamless access to a user's personal data (files, schedule, etc.) from any point connected to the Internet, in contrast to the amalgam of protocols, services, and applications currently required. Additionally, Bifrost includes the concept of affinity, whereby data is associated with a user or a set of users, and the strength of the affinity between objects results in the automatic migration of data when a user changes geographic position. Finally, Bifrost provides access to data regardless of the application used to create it, enabling functionality such as reading Microsoft Word documents on mobile devices that may not have Word installed.

In the area of message-passing runtime systems, Professor Speight is currently focusing on improving the performance of the Message Passing Interface (MPI) runtime library, the most commonly used runtime system for message-passing parallel programming. The novel aspects of this system include a threading model of message-passing programming instead of the process-to-process model employed by most current implementations. By employing a thread-centric approach, the runtime system can migrate threads between nodes participating in the computation within a few hundred microseconds, resulting in improved load balance, a reduction in the number of large messages, and more efficient fault tolerance for the system as a whole.

Finally, Professor Speight is currently examining operating system support for software distributed shared memory (SDSM) systems. SDSM systems consist of a runtime library that provides the abstraction of shared memory on top of a distributed memory machine (such as a cluster of workstations), enabling application developers to use shared-memory programming. This research focuses on changes to the operating system to provide increased performance, reliability, and fault tolerance. To date, it has led to the development of a novel multiprogrammed runtime library in which threads from different applications run simultaneously within the same process, greatly reducing cluster SDSM management and the overhead required to provide fault tolerance for multiple applications at the same time.
