CS 502 : Computing Methods for Digital Libraries

Spring 2001 -- Monday, Wednesday, Friday 02:30 - 03:20 PM

Professor Herbert Van de Sompel

3 Credit Hours


Assignments

All assignments are to be submitted via e-mail to herbertv@cs.cornell.edu. The subject line of the e-mail MUST read: CS502, Assignment #n

Assignment #1, due 02/17 12:00 PM

Group assignment: Write a paper on the topic "Is the Web (as a whole) a Digital Library?"

For this assignment, you will be organized into groups of approximately 10 students. Each group delivers one co-authored paper. Organize yourselves within your group: create a mailing list to discuss the question, meet in person for a discussion, prepare a draft document for review, and deliver a co-authored paper of at most 6 pages. The groups are listed here.

One person from each group e-mails the resulting paper (HTML format) to herbertv@cs.cornell.edu by 02/17 12:00 PM, with subject line: CS502, Assignment #1.

Grading will be based on the structure of the paper and on the ideas and arguments it introduces (whether they are relevant, convincing, and original), not on command of the English language.

Tips:

Assignment #2, due 03/31 12:00 PM

Group assignment: Creation of composite digital objects

For this assignment, you will be organized into groups of approximately 6 students. Here are the groups.

Delivery: One person from each group sends an e-mail containing 2 PURLs to herbertv@cs.cornell.edu by 03/31 12:00 PM, with subject line: CS502, Assignment #2.

Starting point: Choose two pages of printed text, one in English and one in Spanish. Make sure that the pages are different in nature.

Goal: Create two composite digital objects that contain various digital renderings of this source material. Use only computer-based techniques to create the renderings, i.e. never leave the digital domain once the printed material has been digitized.

Think Kahn-Wilensky when structuring a rendering into a digital object and when structuring the various renderings into a composite digital object. Each composite digital object itself must be an HTML page accessible via a PURL [http://www.purl.org/]. The HTML page contains:

Each composite digital object needs to contain AT LEAST the following renderings:

But, be CREATIVE and try to think of other possible renderings of this source material.

When moving from one generation of digital rendering to another, do NOT correct the errors in the output of a digitization/conversion process. Document the problems that are being introduced by the digitization/conversion as part of the technical metadata.

Assignment #3, due 05/05 12:00 PM

For this assignment, you will be organized into groups of approximately 6 students. Here are the groups.

Goal: The creation of a formal mechanism that allows for the verification of the validity of Open Archives Metadata Harvesting (OAMH) protocol requests.

The current specifications of the OAMH protocol state that an HTTP Status-Code 400 should be returned when the syntax of the protocol request is illegal, i.e.:

An exploration of the repositories that have implemented the OAMH protocol shows that the interpretation of the above rules is very diverse, inconsistent, and not even compliant with the specifications. It would therefore be helpful to have a more formal definition of valid protocol requests. Such a definition should formally specify the family of protocol requests that must return a valid OAMH protocol reply. With such a definition in place, repositories can consistently reply:

This assignment is about establishing and testing such a formal definition for OAMH protocol requests by means of an XML Schema:

For example, the request:

http://an.oa.org/request?verb=ListRecords&from=12-01-1999&until=14-01-1999&set=theset&metadataPrefix=oai_dc

could be rendered into the following XML document:

<?xml version="1.0" encoding="UTF-8" ?>
<oai-request namespace_stuff schemalocation_stuff>
  <ListRecords>
    <from>12-01-1999</from>
    <until>14-01-1999</until>
    <metadataPrefix>oai_dc</metadataPrefix>
    <set>theset</set>
  </ListRecords>
</oai-request>

The conformance of such an instance document with the XML Schema specified by "namespace_stuff schemalocation_stuff" (which is the result of Task 2) can be tested using the XSV Schema validator.
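As a minimal sketch of the rendering step illustrated by the example above, the following Python function turns the query string of a request URL into an element tree shaped like that example. The function name request_to_xml is hypothetical, and the mapping (the verb value becomes the single child of oai-request, and every other argument becomes a child of the verb element) is an assumption inferred from the one example given. The namespace and schema-location attributes are deliberately left out, since they depend on the schema each group writes.

```python
from urllib.parse import urlsplit, parse_qsl
import xml.etree.ElementTree as ET


def request_to_xml(url):
    """Render the query string of a request URL as an XML element tree.

    Hypothetical helper: the element layout is assumed from the
    ListRecords example and is not taken from the protocol specification.
    """
    # Keep blank values so that an argument such as "set=" is not silently lost.
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)

    verbs = [value for key, value in pairs if key == "verb"]
    if len(verbs) != 1 or not verbs[0]:
        raise ValueError("expected exactly one non-empty 'verb' argument")

    root = ET.Element("oai-request")
    verb_element = ET.SubElement(root, verbs[0])

    # Every remaining argument becomes a child element, in URL order.
    # Repeated arguments are kept as repeated elements so that a schema
    # can reject them later.
    for key, value in pairs:
        if key != "verb":
            ET.SubElement(verb_element, key).text = value
    return root


if __name__ == "__main__":
    example = ("http://an.oa.org/request?verb=ListRecords"
               "&from=12-01-1999&until=14-01-1999"
               "&set=theset&metadataPrefix=oai_dc")
    print(ET.tostring(request_to_xml(example), encoding="unicode"))
```

Note that this sketch emits the child elements in the order the arguments appear in the URL, whereas the example document above lists metadataPrefix before set. Whether element order should matter is a design decision for the schema itself.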

Delivery: One person from each group sends an e-mail containing a URL to herbertv@cs.cornell.edu by 05/05 12:00 PM, with subject line: CS502, Assignment #3. Accessible from that URL must be:

