Lecture Notes and Assigned Readings

Date | Lecture Topic and Handouts | Readings | Assignments
Thurs 8/27 Introduction to Human Language Technologies (.pdf)
Tues 8/31 Text categorization (.pdf) Either (but not both) of:
(1) Fabrizio Sebastiani. Machine Learning in Automated Text Categorization, ACM Computing Surveys, 2002. (.pdf). Read up to and including Section 5.1; then the intro to Section 6, and Section 6.2, OR
(2) Christopher D. Manning, Prabhakar Raghavan and Hinrich Schuetze. Introduction to Information Retrieval. Cambridge University Press. 2008. Read Chapter 13.1, 13.2 (skip 13.2.1); Chapter 14.1, 14.2, 14.3 (skip 14.3.1).

(2) is shorter than (1) but might be too difficult for people without any background in machine learning.
 
Thurs 9/2 Finish slides from Tues  
Tues 9/7 Joachims [slides] (Claire) T. Joachims. Transductive Inference for Text Classification using Support Vector Machines, Proceedings of the International Conference on Machine Learning (ICML), 1999. (You can skim the theorems and Section 4.1 if those parts are difficult.)  
Tues 9/14 Yang and Pedersen [slides] (Claire)
Pang, Lee & V. [slides]. (Lu)
(1) Yang, Y. and Pedersen, J. O. 1997. A comparative study on feature selection in text categorization, Proceedings of ICML-97, 14th International Conference on Machine Learning.

(2) Bo Pang, Lillian Lee, Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment classification using machine learning techniques. EMNLP.
 
Thurs 9/16 Word Sense Disambiguation (.pdf) J&M 19-19.3; 20-20.5  
Tues 9/21 Forman [slides] (Steve)
Lewis and Catlett (Bishan)
(1) Forman, George. Tackling Concept Drift by Temporal Inductive Transfer, Proceedings of SIGIR, 2006.

(2) D. Lewis and J. Catlett. Heterogeneous Uncertainty Sampling for Supervised Learning. ICML, 1994.
 
Thurs 9/23 Mihalcea (Kaylin)
Mihalcea (Ryan)
(1) R. Mihalcea. Instance-based Learning with Automatic Feature Selection Applied to Word Sense Disambiguation. Coling, 2002.

(2) R. Mihalcea. Unsupervised large-vocabulary word sense disambiguation with graph-based algorithms for sequence data labeling. HLT/EMNLP, 2005.
 
Tues 9/28 Akkaya et al. (Yin)
Santamaria et al. (Pracheer)
(1) C. Akkaya, J. Wiebe and R. Mihalcea. Subjectivity Word Sense Disambiguation, EMNLP, 2009.

(2) C. Santamaria, J. Gonzalo, and J. Artiles. Wikipedia as Sense Inventory to Improve Diversity in Web Search Results. ACL, 2010.
 
Thurs 9/30 Information Extraction and Named Entity Identification (.pdf) J&M 22-22.2; 22.4 PROJECT PROPOSAL due FRI 10/1 via CMS.
Describe the problem or application to be addressed; the method or approach you'll employ; what data or corpora you'll use or create; evaluation plans.
Tues 10/5 IE methods (.pdf) and Intro to Relation Extraction (.pdf) J&M 19.4, 20.9, 22.2  
Thurs 10/7

Tues 10/12

Fall Break

RELATED WORK SUMMARY due FRI 10/15 via CMS.
Write up short descriptions of relevant previous work in the context of your project. Focus on organizing the related work into coherent sets rather than just describing one paper after another. In addition, make clear how your work differs from (or in what respects it is the same as) the previous work.
Tues 10/19 Banko and Etzioni (Jeremy) (1) M. Banko and O. Etzioni. The tradeoffs between open and traditional relation extraction. ACL, 2008.  
Thurs 10/21 Pradhan et al. (Scott)
Swier and Stevenson (Wenlei)
(1) S. Pradhan; W. Ward; K. Hacioglu; J. Martin; D. Jurafsky. Semantic Role Labeling Using Different Syntactic Views. ACL, 2005.

(2) R. Swier and S. Stevenson. Unsupervised Semantic Role Labelling. EMNLP, 2004.
PROGRESS REPORT due FRI 10/22 via CMS.
Report progress w.r.t. coding, corpus creation, results, etc. Include a plan (with dates) for how you will complete the project.
Tues 10/26 Connor et al. (David)
Surdeanu et al. (Stephen)
(1) M. Connor, Y. Gertner, C. Fisher and D. Roth. Starting from Scratch in Semantic Role Labeling. ACL, 2010.

(2) Surdeanu et al. Using Predicate-Argument Structures for Information Extraction. ACL, 2003.
 
Thurs 10/28 Summarization J&M 23.3-23.5, 23.7

E. Hovy. Automated Text Summarization. 2005. In R. Mitkov (ed), The Oxford Handbook of Computational Linguistics, pp. 583-598. Oxford: Oxford University Press.
 
Tues 11/2 Choi et al. (Karan)
Choi et al. (Ami)
(1) Y. Choi, C. Cardie, E. Riloff, S. Patwardhan. Identifying Sources of Opinions with Conditional Random Fields and Extraction Patterns. EMNLP, 2005.

(2) Y. Choi, E. Breck, C. Cardie. Joint Extraction of Entities and Relations for Opinion Recognition. EMNLP, 2006.
 
Thurs 11/4 Hassan et al. (Uday)
Haghighi and Vanderwende (Karthik)
(1) A. Hassan, D. Radev, J. Cho, and A. Joshi. Content based recommendation and summarization in the blogosphere. ICWSM, 2009.

(2) A. Haghighi and L. Vanderwende. Exploring Content Models for Multi-Document Summarization. HLT-NAACL, 2009.
INITIAL RESULTS due FRI 11/5 via CMS.
Describe any results you've obtained thus far.
Tues 11/9 Discourse Analysis  
Thurs 11/11 Bengtson and Roth (Ben)
Haghighi and Klein (Rob)
(1) E. Bengtson and D. Roth. Understanding the Value of Features for Coreference Resolution. EMNLP, 2008.

(2) A. Haghighi and D. Klein. Coreference Resolution in a Modular, Entity-Centered Model. HLT-NAACL, 2010.
 
Tues 11/16 Barzilay and Lapata (Shenwei)
Pitler et al. (Robert)
(1) R. Barzilay and M. Lapata. Modeling Local Coherence: An Entity-Based Approach. ACL, 2005.

(2) E. Pitler, A. Louis, and A. Nenkova. Automatic Sense Prediction for Implicit Discourse Relations in Text. ACL, 2009.
 
Thurs 11/18 Sauper et al. (Akshay)
Roth et al. (Ruben)
(1) C. Sauper, A. Haghighi, and R. Barzilay. Incorporating Content Structure into Text Analysis Applications. EMNLP, 2010.

(2) D. Roth, M. Sammons and V. Vydiswaran. A Framework for Entailed Relation Recognition. ACL, 2009.
Tues 11/23 Project presentations  
Thurs 11/25
Thanksgiving Break

 
Tues 11/30 Project presentations INITIAL DRAFT of final paper/report due TUES 11/30 via CMS.
 
Thurs 12/2 Project presentations