CS 501
Software Engineering
Spring 2007

Project Suggestion: Automatic Input to DSpace



Client

John Saylor, Cornell University Library, jms1@cornell.edu.

Automatic Input to DSpace

Cornell faculty and staff regularly post pre- and post-prints of their journal articles and papers on their own or departmental servers, but are reluctant to deposit them in Cornell University Library's Open Access Repository (http://dspace.library.cornell.edu). They are reluctant either because the deposit process is too time-consuming or because they are unsure whether their intellectual property agreements with publishers permit it.

The goal of this project is to develop a system that overcomes these barriers by semi-automatically harvesting, cataloging, and collecting Cornell faculty publications for the Open Access Repository. The system would analyze what material currently exists and where, identify the rights issues, and give authors a simple way to submit their material.

The input is either a crawl of departmental and local servers, looking for .pdf or .ps files, or a submission in which the author drops a document into a Web form.
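The crawl side of the input can be pictured with a minimal sketch, assuming the crawler has already fetched a page's HTML and only needs to pick out links to .pdf or .ps files. The names `PaperLinkParser` and `find_paper_links` are hypothetical and used only for illustration; fetching pages, politeness rules, and following links between pages are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# File extensions the project description names as crawl targets.
PAPER_EXTENSIONS = (".pdf", ".ps")


class PaperLinkParser(HTMLParser):
    """Collect absolute URLs of links that point at .pdf or .ps files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.paper_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                absolute = urljoin(self.base_url, value)
                # Compare only the path, case-insensitively.
                path = urlparse(absolute).path.lower()
                if path.endswith(PAPER_EXTENSIONS):
                    self.paper_urls.append(absolute)


def find_paper_links(html, base_url):
    """Return the paper URLs found in one already-downloaded HTML page."""
    parser = PaperLinkParser(base_url)
    parser.feed(html)
    return parser.paper_urls
```

A crawler built on this would call `find_paper_links` once per downloaded page and hand each resulting URL to the per-paper steps.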

For each paper, the system then initiates the following automatic steps:

  • Generate metadata for each .pdf or .ps file.
  • Determine whether it has been previously submitted to or published in a journal or conference (by searching Google or another database such as Web of Science).
  • Determine the publisher's agreement for self-archiving (by looking it up in the Sherpa/Romeo database, http://www.sherpa.ac.uk/romeo.php, or by searching for the journal's own statement of authors' rights).
  • Build a database of these author-publisher agreements for each journal or publisher, similar to Sherpa.
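One way to picture the per-paper steps is a small record that accumulates what the system learns, plus a local table of self-archiving policies. The sketch below is only an illustration under stated assumptions: `PaperRecord`, `POLICIES`, `lookup_policy`, and `process_paper` are hypothetical names, the policy entries are placeholders rather than real publisher agreements, and the metadata extraction and external searches are assumed to have already supplied the title and journal.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical local table of self-archiving policies, keyed by a
# normalized journal title. The entries are placeholders, not real policies.
POLICIES = {
    "journal of examples": "preprint and postprint allowed",
    "example letters": "preprint only",
}


@dataclass
class PaperRecord:
    """What the system knows about one harvested or submitted paper."""
    source_url: str
    title: str
    journal: Optional[str] = None  # None if no prior publication was found
    policy: str = "unknown"


def normalize(title):
    """Lowercase a journal title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def lookup_policy(journal, policies=POLICIES):
    """Return the recorded policy for a journal, or "unknown"."""
    if journal is None:
        return "unknown"
    return policies.get(normalize(journal), "unknown")


def process_paper(source_url, title, journal=None):
    """Build a record and attach the self-archiving policy, if known."""
    record = PaperRecord(source_url=source_url, title=title, journal=journal)
    record.policy = lookup_policy(journal)
    return record
```

Keeping the policies in a local table mirrors the last step of the list: each agreement found for a journal is stored once and reused for later papers from the same journal.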

Authors whose material is crawled are asked for permission to archive it in either the open access portion of DSpace or, if the rights do not allow that, a restricted portion.
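The choice between the two portions of DSpace can be sketched as a small decision function. This is an assumption-laden illustration: the names `Destination` and `choose_destination` are hypothetical, and the rule that a paper is not archived at all without the author's permission is an assumption drawn from the wording above rather than a stated requirement.

```python
from enum import Enum


class Destination(Enum):
    OPEN_ACCESS = "open access"
    RESTRICTED = "restricted"
    NOT_ARCHIVED = "not archived"


def choose_destination(author_permits, open_access_allowed):
    """Pick where a crawled paper goes.

    author_permits: the author agreed to archiving (assumed prerequisite).
    open_access_allowed: the publisher's agreement permits open access.
    """
    if not author_permits:
        return Destination.NOT_ARCHIVED
    if open_access_allowed:
        return Destination.OPEN_ACCESS
    return Destination.RESTRICTED
```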

Authors who submit their materials via the drag-and-drop Web form are notified of their rights according to the publisher's agreement.

 

 




William Y. Arms
(wya@cs.cornell.edu)
Last changed: January 18, 2007