Information Capture and Access
The information capture and access research group works on ways
that computers can locate information in the ever-increasing volume of
online data, determine its structure, and extract the information for
human users. The group was founded by John Hopcroft and Jim Davis.
Current areas of research
- Extracting structured material from online documents when the
structure is not explicit in the document, for example extracting
information presented in tabular form into a relational database.
- Constructing summaries and overviews of collections.
- Constructing a nationwide library of computer science
technical reports. We have begun digitizing the Cornell Computer
Science technical report collection in order to make the work more
accessible on the Internet. The collection is available through a WWW server.
Besides being useful to the general CS research community, the
collection serves as test material for our research in information access.
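As a rough illustration of the first research area above (information presented in tabular form, loaded into a relational database), here is a minimal Python sketch. It is not the group's method. It assumes columns are separated by runs of two or more spaces, and the sample text, the report identifiers, and the names `parse_text_table` and `load_into_sqlite` are all hypothetical.

```python
import re
import sqlite3

# Hypothetical sample: a table that exists in a document only as aligned text.
# The report identifiers and values are made up for illustration.
SAMPLE = """\
Report  Year  Pages
TR-A    1993  24
TR-B    1994  17
"""


def parse_text_table(text):
    """Recover rows from aligned text, assuming 2+ spaces separate columns."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    rows = [re.split(r"\s{2,}", ln.strip()) for ln in lines]
    header, body = rows[0], rows[1:]
    # Keep only rows whose column count matches the header.
    return header, [r for r in body if len(r) == len(header)]


def load_into_sqlite(header, rows, table="extracted"):
    """Create an in-memory table with one TEXT column per header field."""
    conn = sqlite3.connect(":memory:")
    cols = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return conn


header, rows = parse_text_table(SAMPLE)
conn = load_into_sqlite(header, rows)
result = conn.execute(
    'SELECT "Report" FROM extracted WHERE "Year" = ?', ("1994",)
).fetchall()
print(result)
```

Real documents rarely align this cleanly. Wrapped cells, missing values, and single-space column gaps would all defeat the split rule used here, which is the kind of implicit structure the research area is concerned with.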
The group consists of Cornell researcher Dean Krafft and Jim Davis
(visiting), as well as a number of graduate and undergraduate students.
Fall 95: The project is not active any longer. - JRD
James Allan et al. Information Agents for Building Hyperlinks.
Proceedings of the 2nd Conference on Information and Knowledge
Management, 1993.