CS 5150
Software Engineering
Spring 2009
Project Suggestion:
Harvester for Cornell Research
Client: John M. Saylor, Cornell University Library

Objective
eCommons@Cornell is a digital repository that is open to anyone affiliated with Cornell University. It is a place to capture, store, index, preserve, and redistribute materials in digital formats for educational, scholarly, research, or historical purposes (http://ecommons.library.cornell.edu/). Cornell faculty and researchers often post preprints and postprints of their journal articles and papers on their own or departmental servers but do not deposit them in eCommons, either because it is inconvenient or because they are uncomfortable about their intellectual property rights agreements with publishers. The goal of this project is to develop a system to overcome these barriers by semi-automatically harvesting, cataloging, and collecting research publications that are already posted on departmental or individual web sites in the cornell.edu domain.

Harvesting from Web sites
The system will crawl departmental and local servers looking for .pdf, .ps, and other files. It will:
Wider application
eCommons is implemented with the open-source DSpace software. These tools would be contributions to the DSpace community.
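
As an illustration of the crawling step described under "Harvesting from Web sites", the following is a minimal sketch in Python of how a harvester might pick candidate documents out of one already-fetched HTML page. It is not part of the project specification. The names `LinkCollector`, `in_allowed_domain`, and `find_candidate_documents` are hypothetical, and the extension list covers only the two formats named above (.pdf and .ps). Fetching pages, following links to further pages, and cataloging are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumptions for illustration: only the two extensions named in the
# project description, and the cornell.edu domain it mentions.
DOCUMENT_EXTENSIONS = (".pdf", ".ps")
ALLOWED_DOMAIN = "cornell.edu"


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def in_allowed_domain(url):
    """True if the URL's host is cornell.edu or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return host == ALLOWED_DOMAIN or host.endswith("." + ALLOWED_DOMAIN)


def find_candidate_documents(page_url, html):
    """Return absolute URLs of linked files that look like papers."""
    collector = LinkCollector()
    collector.feed(html)
    candidates = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        path = urlparse(absolute).path.lower()
        if in_allowed_domain(absolute) and path.endswith(DOCUMENT_EXTENSIONS):
            candidates.append(absolute)
    return candidates
```

Matching on the parsed host rather than on a substring of the URL keeps look-alike hosts such as notcornell.edu out of the results. A real harvester would also need to respect robots.txt and inspect file contents, since an extension alone does not identify a research publication.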
William Y. Arms
(wya@cs.cornell.edu)
Last changed: January 6, 2009