CS 501
Software Engineering
Spring 2008

Project Suggestion:
Repository for Cornell Research



Client

John M. Saylor, Cornell University Library
http://www.englib.cornell.edu/jms/
email: jms1@cornell.edu
phone: 607-255-4134

Repository for Cornell Research

eCommons@Cornell is a digital repository that is open to anyone affiliated with Cornell University. It is a place to capture, store, index, preserve and redistribute materials in digital formats for educational, scholarly, research or historical purposes (http://ecommons.library.cornell.edu/).

Two projects are proposed with the aim of increasing the use of the repository. Cornell faculty, students, and staff regularly post preprints and postprints of their journal articles and papers on their own or departmental servers, but do not deposit them in eCommons, Cornell's open access repository. They either find depositing too time consuming or do not know enough about their intellectual property agreements with publishers to feel comfortable doing so.

Harvesting from Web sites

The goal of this project is to develop a system to overcome these barriers by semi-automatically harvesting, cataloging, and collecting research publications that are already posted on departmental or individual web sites in the cornell.edu domain.

The system will crawl departmental and local servers looking for .pdf, .ps, and other files. It will:

  • Automatically generate metadata for each file.
  • Determine whether it has been previously submitted to or published in a journal or conference (by searching Google, or a database such as Web of Science, Compendex, etc.).
  • Determine the publisher's agreement for self-archiving, either by looking it up in the Sherpa/Romeo database (http://www.sherpa.ac.uk/romeo.php) or by searching for the journal's statement of the author's rights.
  • Ask authors for permission to archive, either in the open access portion of eCommons or, if the rights do not allow open access, in a restricted portion.
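
The first stage of the crawl described above, finding candidate files on a page, could be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed design: the names `LinkCollector`, `in_domain`, and `find_candidate_documents` are hypothetical, only .pdf and .ps links are considered, and fetching pages, generating metadata, and checking publisher agreements are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: only these two file types are harvested in this sketch.
DOCUMENT_EXTENSIONS = (".pdf", ".ps")
ALLOWED_DOMAIN = "cornell.edu"


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def in_domain(url, domain=ALLOWED_DOMAIN):
    """True if the URL's host is the domain itself or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return host == domain or host.endswith("." + domain)


def find_candidate_documents(html, page_url):
    """Return absolute, de-duplicated document URLs linked from one page."""
    parser = LinkCollector()
    parser.feed(html)
    found = []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        path = urlparse(absolute).path.lower()
        if in_domain(absolute) and path.endswith(DOCUMENT_EXTENSIONS):
            if absolute not in found:
                found.append(absolute)
    return found
```

Given the HTML of a departmental page and that page's URL, `find_candidate_documents` returns the list of in-domain document links that the later steps (metadata generation, publication lookup, rights checking) would then process.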

Drag and drop user interface

The goal of this project is a simple drag and drop interface for researchers to deposit their material into eCommons.

As with the harvesting project, this interface would provide a number of tools that minimize the amount of work that is required of the researcher, including automatic generation of metadata and standard processes for managing copyright permissions.
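
As a rough illustration of what automatic generation of metadata might mean for a dropped file, the sketch below drafts a record from nothing more than the file name and contents. The function `draft_metadata` and its field names are hypothetical and are not taken from eCommons or from DSpace's metadata schema; a real tool would also need to extract authors, abstracts, and dates from the document itself.

```python
import hashlib
import mimetypes
from pathlib import PurePosixPath


def draft_metadata(filename, content, depositor):
    """Build a draft record for a deposited file (illustrative field names)."""
    # Guess a title from the file name; the researcher would confirm or edit it.
    stem = PurePosixPath(filename).stem
    title_guess = stem.replace("_", " ").replace("-", " ").strip()

    # Guess the MIME type from the file extension.
    mime_type, _ = mimetypes.guess_type(filename)

    return {
        "title": title_guess,
        "format": mime_type or "application/octet-stream",
        "size_bytes": len(content),
        # A checksum supports later integrity checks on the stored file.
        "checksum_sha256": hashlib.sha256(content).hexdigest(),
        "depositor": depositor,
    }
```

The point of the sketch is that the researcher is asked only to confirm or correct a pre-filled record rather than to type one from scratch.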

eCommons is currently implemented with the open source DSpace software. These tools would be contributions to the DSpace community. A drag and drop solution could benefit DSpace users worldwide.




William Y. Arms
(wya@cs.cornell.edu)
Last changed: January 10, 2008