CS/INFO 431
Architecture of Web Information Systems
Spring 2006

Discussion Section and Readings

Readings

The subject of the course is a dynamic area.  Much of the material in the course is the result of recent research and implementation.  Fortunately, almost all of this work is available through papers on the open Web.  Readings are assigned for each week's discussion section as listed in the schedule below.  When a specific URL is not listed with a paper, you should use Google Scholar to find it (where the content is protected by licensing, you will need to do this from within the cornell.edu domain).

Students are expected to approach each week's readings critically.  Are the ideas sound?  What are the alternatives and trade-offs?  How well do the ideas fit into the larger information context?  What are the barriers to success: technical, social, legal, and economic?  How is the content of the readings related to the topics presented in the recent lectures?  Weekly sections are meant to be a forum for discussing these critical reactions, driven by student participation and NOT by instructor or teaching assistant presentations.  The amount of section participation and the degree to which it represents critical evaluation of the readings are important criteria for grading.

Date Topic and Readings
Section 1
1/27
From libraries to the Web: points on a spectrum
Section 2
2/3
Information modeling: the library catalog
Section 3
2/10
Identifiers
Section 4
2/17
Metadata
Section 5
2/24
Rich Information Models
Section 6
3/3
Metadata Harvesting and Digital Library Architecture
  • Lagoze, C. and Van de Sompel, H., The Open Archives Initiative: Building a low-barrier interoperability framework. in Joint Conference on Digital Libraries, (Roanoke, VA, 2001).
  • Shreeves, S., Knutson, E.M., Stvilia, B., Palmer, C.L., Twidale, M.B. and Cole, T.W., Is "Quality" Metadata "Shareable" Metadata? The Implications of Local Metadata Practices for Federated Collections. in ACRL Twelfth National Conference, (Minneapolis, 2005), ALA.
  • Lagoze, C., Krafft, D., Cornwell, T., Dushay, N., Eckstrom, D. and Saylor, J. Metadata aggregation and "automated digital libraries": A retrospective on the NSDL experience. arXiv.org cs.DL/0601125, Cornell University, 2006, http://arxiv.org/abs/cs.DL/0601125.

 

Section 7
3/10
Complex Digital Objects
  • R. Daniel Jr., C. Lagoze, and S. D. Payette, "A Metadata Architecture for Digital Libraries," presented at IEEE Forum on Research and Technology Advances, Santa Barbara, 1998.
  • J. Bekaert, P. Hochstenbach, H. Van de Sompel, "Using MPEG-21 DIDL to Represent Complex Digital Objects in the Los Alamos National Laboratory Digital Library", D-Lib Magazine, 9(11), 2003, http://www.dlib.org/dlib/november03/bekaert/11bekaert.html
  • H. Van de Sompel, M. Nelson, C. Lagoze, and S. Warner, "Resource Harvesting within the OAI-PMH Framework," D-Lib Magazine, vol. 10, 2004.  http://www.dlib.org/dlib/december04/vandesompel/12vandesompel.html
Section 8
3/17
Motivating the Semantic Web
  • Heflin, J.D. Towards the Semantic Web: Knowledge Representation in a Dynamic, Distributed Environment Department of Computer Science, University of Maryland, College Park, MD, 2001. (Chapters 1,2) http://www.cse.lehigh.edu/~heflin/pubs/heflin-thesis-orig.pdf
  • Hendler, J. Agents and the Semantic Web, IEEE Intelligent Systems, March/April 2001
Section 9
3/31
Web Scale Information Analysis
Section 10
4/7
Semantic Web Applications
  • Huynh, D., Mazzocchi, S. and Karger, D., Piggy Bank: Experience the Semantic Web Inside Your Web Browser. in International Semantic Web Conference (ISWC), (2005).
  • Kahan, J., Koivunen, M.-R., Prud'Hommeaux, E., et al., Annotea: An Open RDF Infrastructure for Shared Web Annotations. in WWW10, (Hong Kong, 2001).
  • Karger, D. and Quan, D. What Would It Mean to Blog on the Semantic Web? Lecture Notes in Computer Science, 3298 (October). 214-228.
Section 11
4/14
Trust and Reputation
  • Gyongyi, Z. and Garcia-Molina, H., Web Spam Taxonomy. in First International Workshop on Adversarial Information Retrieval on the Web, (Chiba, Japan, 2005).
  • Lynch, C. A. (2001). "When Documents Deceive: Trust and Provenance as New Factors in Information Retrieval in a Tangled Web." Journal of the American Society for Information Science and Technology 52(1): 12-17, http://www.cs.ucsd.edu/~rik/others/lynch-trust-jasis00.pdf
  • Hirtle, P. B. (2000). Archival Authenticity in a Digital Age. Authenticity in a Digital Environment, Washington, D.C., Council on Library and Information Resources., http://www.clir.org/pubs/reports/pub92/hirtle.html.
Section 12
4/24
Longevity of Digital Information
  • Rosenthal, D.S.H., Robertson, T., Lipkis, T., et al. Requirements for Digital Preservation Systems: A Bottom-Up Approach. D-Lib Magazine, 11 (11).
  • Hunter, J. and Choudhury, S., A Semi-Automated Digital Preservation System based on Semantic Web Services. in JCDL, (Tucson, AZ, 2005), ACM.
  • Rusbridge, C. Excuse Me... Some Digital Preservation Fallacies? Ariadne, February 2006 (46).
Section 13
5/1
Applications and wrap-up
  • H. Van de Sompel, S. Payette, et al., "Rethinking Scholarly Communication: Building the System that Scholars Deserve", D-Lib Magazine, 10(9)
  • Dempsey, L. Libraries and the Long Tail - Some Thoughts about Libraries in the Network Age. D-Lib Magazine, 12 (4).
  • Lagoze, C., Krafft, D.B., Payette, S., et al. What Is a Digital Library Anymore, Anyway? Beyond Search and Access in the NSDL. D-Lib Magazine, 11 (11).

On-line questionnaire

Each week, students will need to complete a short set of questions about the readings.  The questions will be available via CMS each Wednesday evening.  Completed answers must be submitted before the beginning of section each Friday (1:25 PM).  The questions will mainly be short, designed to make sure that the assigned papers have been read.  These questions will be graded.

Reaction Papers

The reaction paper assignments are structured as follows: you should cover at least two closely related papers relevant to the current section of the course.  One of the papers should be from the course syllabus (assigned for the discussion section on which the reaction paper is due or for one of the two preceding sections).  The other should be a related paper that you discover by another method, such as following references in the papers you have read, searching on Google or Google Scholar, using the library gateway, or consulting another information source.  Think of finding this paper as a mini resource discovery exercise.  The beginning of the reaction paper should include citations (with URLs) to the two papers you have chosen.

You should then write approximately 3-4 pages (approximately 1500-2000 words) in which you address the following points:

Reaction papers should not just be summaries of the papers you read; most of your text should be focused on synthesis of the underlying ideas, your own perspective on the papers, and thinking on how the content of the papers relates to the overall content of the course. Reaction papers should be done individually (i.e. not in groups). 

The reaction papers will be graded on a 12 point scale, with points allocated in the following categories:

Papers will be submitted via CMS, and should be in Word (.doc) or PDF format.

 


Carl Lagoze
(lagoze@cs.cornell.edu)
Last changed: April 25, 2006