CS/INFO 431/631: Web Information Systems

Section and Reading Schedule

Topics covered in section will generally match or complement the lecture material.  The table below will list specific section reading assignments as the semester progresses.  Readings for the week will be posted no later than the Monday evening preceding the section.

Date Topic and Readings Notes
1/26 From libraries to the Web: points on a spectrum
2/2 Web Architecture and Information Organization For the Svenonius book you will need to use http://library.cornell.edu/ and search the library catalog.
2/9 Identifiers and Bibliographic Models In the FRBR report, you only need to skim part 4, just to get the idea of attributes.
2/16 Data, Information, Knowledge
  • M. J. Bates, Information and knowledge: an evolutionary framework for information science, Information Research, July, 2005. http://informationr.net/ir/10-4/paper239.html
  • R. Rao, From IR to Search, and Beyond, ACM Queue, 2(3), May 2004. (Licensed by ACM; find through Google Scholar and access from cornell.edu.)
  • M.K. Bergman, The Deep Web: Surfacing Hidden Value, Journal of Electronic Publishing, 7(1), August 2001. (Licensed; find through Google Scholar and access from cornell.edu.) Read up to the text just beyond Figure 2 (stop at the header "Study Objectives").
For the Rao paper, try out some of the search engines linked to at the end of the paper.
2/23 Resource Description, Metadata, Cataloging Reaction Paper 1 due
3/2 Metadata Harvesting
  • Lagoze, C. and Van de Sompel, H., The Open Archives Initiative: Building a low-barrier interoperability framework. in Joint Conference on Digital Libraries, (Roanoke, VA, 2001).
  • Lagoze, C., Krafft, D., Cornwell, T., Dushay, N., Eckstrom, D. and Saylor, J. Metadata aggregation and "automated digital libraries": A retrospective on the NSDL experience. arXiv.org cs.DL/0601125, Cornell University, 2006, http://arxiv.org/abs/cs.DL/0601125.
 
3/9 Personal and Corporate Information
  • R. Mukherjee and J. Mao, Enterprise Search: Tough Stuff, ACM Queue, 2(2), April 2004.  Search in Google Scholar and access from Cornell domain.
  • E. Cutrell, S. Dumais, and J. Teevan, Searching to Eliminate Personal Information Management, Communications of the ACM, 49(1), January 2006. Search in Google Scholar and access from Cornell domain.
  • G. Bell, J. Gemmell, A Digital Life, Scientific American, 296(3), March 2007, Access through Cornell Library Gateway.
 
3/16 TBA  
3/23 Spring Break  
3/30 Web Scale Information Analysis
  • Page, Lawrence; Brin, Sergey; Motwani, Rajeev; Winograd, Terry, The PageRank Citation Ranking: Bringing Order to the Web, 1999 (find on Google Scholar)
  • S. R. Kumar, et al., The web as a graph, presented at Nineteenth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Dallas, 2000, (find on Google Scholar: NOTE, THERE ARE SEVERAL PAPERS WITH SIMILAR NAMES, MAKE SURE YOU READ THE 2000 SIGMOD ONE!)
  • A. Heydon and M. Najork, Mercator: A Scalable, Extensible Web Crawler, World Wide Web, December, 1999, (find on Google Scholar)
 
4/6 Semantic Web Applications
 
4/13 Trust and Reputation
  • Gyongyi, Z. and Garcia-Molina, H., Web Spam Taxonomy. in First International Workshop on Adversarial Information Retrieval on the Web, (Chiba, Japan, 2005).
  • Lynch, C. A. (2001). "When Documents Deceive: Trust and Provenance as New Factors in Information Retrieval in a Tangled Web." Journal of the American Society for Information Science and Technology 52(1): 12-17, http://www.cs.ucsd.edu/~rik/others/lynch-trust-jasis00.pdf
  • Hirtle, P. B. (2000). Archival Authenticity in a Digital Age. Authenticity in a Digital Environment, Washington, D.C., Council on Library and Information Resources., http://www.clir.org/pubs/reports/pub92/hirtle.html.
 
4/20 Longevity of Digital Information
  • Hunter, J., & Choudhury, S. (2006). PANIC – An Integrated Approach to the Preservation of Complex Digital Objects using Semantic Web Services. International Journal on Digital Libraries: Special Issue on Complex Digital Objects, 6(2), 174-183.
  • Rosenthal, D.S.H., Robertson, T., Lipkis, T., Reich, V., & Morabito, S. (2005). Requirements for Digital Preservation Systems: A Bottom-Up Approach. D-Lib Magazine, 11(11).
 
4/27 Scholarly Communication
  • Liu, X., Bollen, J., Nelson, M.L. and Van de Sompel, H. Co-Authorship Networks in the Digital Library Research Community, arXiv, 2006.
  • H. Van de Sompel, S. Payette, J. Erickson, C. Lagoze, and S. Warner, "Rethinking Scholarly Communication: Building the System that Scholars Deserve," D-Lib Magazine, September 2004.
  • J. Fry, "Studying the Scholarly web: How disciplinary culture shapes online representation," Cybermetrics: International Journal of Scientometrics, Informetrics and Bibliometrics, vol. 10, 2006.
 
5/6 TBA  

Reading Guidelines

Readings assigned for sections come from three types of sources:

Students are expected to approach each week's readings critically. Are the ideas sound? What are the alternatives and trade-offs? How well do the ideas fit into the larger information context? What are the barriers to success: technical, social, legal, and economic? How is the content of the readings related to the topics presented in the recent lectures? Weekly sections are meant to be a forum for discussing these critical reactions, driven by student participation and NOT by instructor or teaching assistant presentations. The amount of section participation, and the degree to which it represents critical evaluation of the readings, is an important grading criterion.

On-line questionnaire

Each week, students will need to complete a short set of questions about the readings. The questions will be available via CMS each Wednesday evening. Completed questions must be submitted before the beginning of section each Friday (1:25 PM). The questions will be mainly short, designed to make sure that the assigned papers have been read. These questions will be graded.

Reaction Papers

There are three reaction papers due during the semester.  The tentative reaction paper due dates are February 23, April 16, and May 6 at 11:59PM.  The second reaction paper is optional and the final reaction paper will be the final exam.

Instructions for Reaction Papers 1 and 2

For each reaction paper you should choose a topic covered in the course thus far.  The notion of a "topic" is reasonably fuzzy, but broadly it is something that you can use as a vehicle for framing a discussion about three papers.  Examples of topics are "Libraries in the digital age", "Information Interoperability", "Semantic Web", etc.  You should then choose two of the assigned readings from the course thus far that are related to your topic of choice.  Then choose a third related paper that you discover via another method, such as references in the papers you have read, searching on Google or Google Scholar, the library gateway, or another information source. Think of finding this paper as a mini resource discovery exercise. Make sure to include proper citations to the three papers you have chosen.

You should then write approximately 4-5 pages (approximately 2000-2500 words) in which you address the following points:

A few additional guidelines on the papers are:

The reaction papers will be graded on a 12 point scale, with points allocated in the following categories:

Papers will be submitted via CMS, and should be in Word (doc) or PDF format.

Instructions for Reaction Paper 3 (Final Exam)

To be announced