CS 5150
Software Engineering
Fall 2009
Project Suggestion:
Legal Information Institute
Client

Tom Bruce, Director of the Legal Information Institute, Cornell Law School, trb2@cornell.edu.

The Legal Information Institute

Cornell Law School's Legal Information Institute (LII) is a pre-eminent publisher of open access electronic legal information. It accounts for over 20 percent of Cornell's Web traffic, reaches users in more than 200 countries and territories, and receives more than a million page views each day. It is a leader in developing applications that work with legal information and make it more accessible to the public. In previous years, there have been several successful CS 501 projects for the Legal Information Institute.

The Spaeth database of Supreme Court statistics

The LII receives numerous questions about the voting records of Supreme Court justices. These come from a wide range of people, including high-school and college students writing papers, political-science researchers, ordinary citizens, and journalists. For example, the LII has done research projects for the staff of Sixty Minutes, and for a reporter at the Washington Post writing a book on Clarence Thomas.

The main source of answers to such questions is the Spaeth database, a comprehensive database of Supreme Court statistics developed and maintained by political scientists. It is very difficult to understand and use, which is one reason that people come to the LII for answers rather than consulting the Spaeth database directly. Another reason is that the LII is easier to find and is widely recognized as a publisher of Supreme Court opinions.

The purpose of this high-visibility project is to make the information in the Spaeth database easy for an average person to use, and to capture collective wisdom about its contents. Part of the challenge is that its underlying data model is hard to understand, and part is that the database itself is very compactly (some would say cryptically) encoded, in the style of social scientists of perhaps 40 years ago.
Development and capture of user-contributed queries

The project will take a generalized approach to the development and capture of user-contributed queries. Researchers and other data compilers, such as governments, increasingly have the ability (and the desire) to expose large compilations of data to the public via the Internet. Typically the data is made available through dynamic web pages that, on one level, are simply the output of those database queries that the publisher believes the audience might like to make.

A difficulty with this approach is that the public sees the data only in those ways that the publisher can anticipate and hard-wire into prepackaged query-and-display systems. It would be better to build systems that allow users to build and capture their own queries about the underlying datasets, since it is virtually impossible for the publishers of datasets to anticipate all of the audiences for their information. The need for this is particularly urgent at a time when government is about to release large quantities of data to the public as bulk XML.

The project, then, is to build a system that will show an (arbitrary) XML database to an audience that may or may not be knowledgeable about its contents and allow that audience to build queries about the data, store those queries under descriptive labels, and share them with others. The user should not need to understand either XML or any particular query language. In specific terms:
This generalized problem is a major software challenge. Therefore, the target for this project is the development and capture of queries to the Spaeth database of Supreme Court voting records.
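
To make the idea of building, labeling, and sharing queries more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a proposed design: the XML element and attribute names (cases, case, vote, justice, side) are invented for the example and do not reflect how the Spaeth database is actually encoded, and the SavedQuery and QueryLibrary classes are hypothetical. The point is that a user picks an element and some attribute values from a form, and the system, not the user, turns those choices into a path expression and stores it under a descriptive label.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical sample data. These element and attribute names are invented
# for illustration; they are NOT the real Spaeth database encoding.
SAMPLE_XML = """
<cases>
  <case id="c1" term="1995">
    <vote justice="Thomas" side="majority"/>
    <vote justice="Ginsburg" side="dissent"/>
  </case>
  <case id="c2" term="1996">
    <vote justice="Thomas" side="dissent"/>
    <vote justice="Ginsburg" side="majority"/>
  </case>
</cases>
"""


@dataclass
class SavedQuery:
    """A query captured from form-style choices, stored under a label."""
    label: str                      # descriptive label chosen by the user
    element: str                    # element type the user wants to find
    filters: dict = field(default_factory=dict)  # attribute -> required value

    def to_path(self) -> str:
        # Build an ElementTree path such as .//vote[@justice='Thomas'].
        # Assumes values contain no single quotes (a real system must escape).
        predicates = "".join(
            f"[@{name}='{value}']" for name, value in sorted(self.filters.items())
        )
        return f".//{self.element}{predicates}"

    def run(self, root: ET.Element) -> list:
        return root.findall(self.to_path())


class QueryLibrary:
    """An in-memory collection of shared queries, keyed by label."""

    def __init__(self):
        self._queries = {}

    def save(self, query: SavedQuery) -> None:
        self._queries[query.label] = query

    def labels(self) -> list:
        return sorted(self._queries)

    def run(self, label: str, root: ET.Element) -> list:
        return self._queries[label].run(root)


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_XML)
    library = QueryLibrary()
    library.save(SavedQuery("Dissents by Thomas", "vote",
                            {"justice": "Thomas", "side": "dissent"}))
    for label in library.labels():
        print(label, "->", len(library.run(label, root)), "matching votes")
```

In this sketch the stored object is the set of user choices rather than the path string, so the same saved query could later be translated into a different query language without asking the user to rebuild it.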
wya@cs.cornell.edu
Last changed: May, 2009