Advanced Language Technologies, Fall 2019

Tuesdays and Thursdays 1:25-2:40, Stimson G01 (Zoom link available on request)

This course covers selected advanced topics in natural language processing (NLP) and/or information retrieval, with a conscious attempt to avoid topics covered by other Cornell courses. Hence:

Prerequisites, enrollment, related classes

Prerequisites All of the following: CS 2110 or equivalent programming experience; a course in artificial intelligence or any relevant subfield (e.g., machine learning, NLP, information retrieval, Cornell CS courses numbered 47xx or 67xx); proficiency with using machine learning tools (e.g., fluency with training a classifier and assessing its performance using cross-validation).

Enrollment Enrollment is open on Student Center to PhD and MS students (although those who do not meet the prerequisites should not take this class).

Other students interested in gaining permission to enroll: please contact Prof. Lee after lecture on Tuesday, September 3rd. (Before that date, I won't have enough information on the number of students to be able to make enrollment allowances.) Try to attend the first two lectures if you can, but if you are shopping other courses meeting at the same time, it's OK to miss one or both of the first two CS 6740 lecture times. You will be responsible for making up the material on your own, but some form of notes or slides will be posted.

Auditing is an option for those permitted to enroll: the only requirement is to sign up on Student Center for the "Audit" option as Grade Basis, and there is no coursework or attendance requirement to earn the audit credit. Students already actively engaged in thesis research should thus choose the "Audit" grade basis.

Remote attendance is possible; please contact me for a Zoom link (contact information listed on the "Administrative info" tab).

Related classes See Cornell's NLP course list

Likely topics

  • Formal models of language, parsing complexity: Tree-adjoining grammar, and perhaps also combinatory categorial grammar
  • Dependency parsing: Eisner's algorithm, Maximum-spanning tree
  • Style
  • Implication
  • (De)constructing datasets
  • Evaluation

Administrative info

Course homepage http://www.cs.cornell.edu/courses/cs6740/2019fa. Main site for course info, assignments, readings, lecture references, etc.

CMS page https://cmsx.cs.cornell.edu. Site for submitting assignments, unless otherwise noted. You may find this graphically-oriented guide to common operations useful: see how to replace a prior submission (point 1), how to tell if CMS successfully received your files (point 2), how to form a group (point 4).

Office hours and contact info See Prof. Lee's homepage and scroll to the section on "Contact and availability info".

Coursework

Resources

 

Lectures

#1 Aug 29: Introduction

Assignments/announcements

Class images, links and handouts

#2 Sep 3: Motivation for Tree Adjoining Grammars: introduction to sentential structure

Assignments/announcements

  • Those who wish to enroll but need a PIN: please email Prof. Lee with your name and netID by noon on Thursday if you can (by Tuesday evening is preferable).

Class images, links and handouts

Other references

#3 Sep 5: CFGs and long-distance dependencies; tree substitution grammars as a way to lexicalize CFGs

Assignments/announcements

  • Everyone (including auditors and those not yet enrolled): please complete the CS 6740 "administrative matters" quiz on CMS, https://cmsx.cs.cornell.edu, deadline Mon Sept 9, 11:59pm. Enrollment permissions will be decided in part by the information furnished as quiz answers.
    So, being on CMS does not mean you have been enrolled in the class!
    If you don't see "CS 6740" when you log in to CMS or can't log in, please email Prof. Lee with your name and netID.
  • Reading for today: Sections 3-4.1 of Aravind K. Joshi and Yves Schabes. 1991. Tree-adjoining grammars and lexicalized grammars. University of Pennsylvania Department of Computer and Information Science, Technical Report No. MS-CIS-91-22.
    We're reversing the order of presentation (as is done in Schabes' 1990 Ph.D. thesis, Mathematical and computational aspects of lexicalized grammars).
  • Reading for next week (don't get too hung up on the details):
  • Tentative sketch of first "real" assignment, due sometime between Sep 19 and 24: spend X hours (where I will specify X) implementing a representation of tree-adjoining grammars that allows one to specify a TAG (that is, you should not hard-code a specific TAG) and that, given a partial derivation tree (which you'll need to represent) and an elementary tree, determines whether the elementary tree can legally be substituted or adjoined into the corresponding derived tree. Write a description of your ideas and any challenges you faced. Be prepared to discuss your efforts in class.
    You may not arrive at a really functional implementation; I'm just looking for a good-faith effort.
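
To make the shape of the tentative assignment above more concrete, here is a minimal Python sketch of one possible representation. It is only an illustration under stated assumptions, not the expected solution: the names `Node`, `ElementaryTree`, `kind`, `can_substitute`, and `can_adjoin` are hypothetical, the checks operate directly on nodes of elementary trees rather than on a derivation tree, and adjoining constraints (null, selective, obligatory) are ignored.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    # Hypothetical tag: "internal", "terminal", "subst" (substitution site), or "foot".
    kind: str = "internal"

    def walk(self) -> Iterator["Node"]:
        """Yield this node and all of its descendants, depth-first."""
        yield self
        for child in self.children:
            yield from child.walk()


@dataclass
class ElementaryTree:
    name: str
    root: Node

    def feet(self) -> List[Node]:
        return [n for n in self.root.walk() if n.kind == "foot"]

    def is_initial(self) -> bool:
        # Initial trees have no foot node.
        return not self.feet()

    def is_auxiliary(self) -> bool:
        # Auxiliary trees have exactly one foot node, labeled like the root.
        feet = self.feet()
        return len(feet) == 1 and feet[0].label == self.root.label


def can_substitute(tree: ElementaryTree, site: Node) -> bool:
    """An initial tree may replace a substitution site with a matching label."""
    return (tree.is_initial() and site.kind == "subst"
            and not site.children and site.label == tree.root.label)


def can_adjoin(tree: ElementaryTree, site: Node) -> bool:
    """An auxiliary tree may adjoin at an internal node with a matching label.

    Assumption: adjunction is disallowed at substitution sites, and no
    adjoining constraints are modeled.
    """
    return (tree.is_auxiliary() and site.kind == "internal"
            and site.label == tree.root.label)


# A toy grammar, specified as data rather than hard-coded into the checks.
np_subj = Node("NP", kind="subst")
np_obj = Node("NP", kind="subst")
vp = Node("VP", [Node("V", [Node("likes", kind="terminal")]), np_obj])
alpha_likes = ElementaryTree("alpha_likes", Node("S", [np_subj, vp]))
alpha_john = ElementaryTree(
    "alpha_john", Node("NP", [Node("John", kind="terminal")]))
beta_really = ElementaryTree(
    "beta_really",
    Node("VP", [Node("Adv", [Node("really", kind="terminal")]),
                Node("VP", kind="foot")]))

if __name__ == "__main__":
    print(can_substitute(alpha_john, np_subj))
    print(can_adjoin(beta_really, vp))
    print(can_adjoin(beta_really, np_subj))
```

Tagging each node with a single `kind` field keeps the legality checks short. A fuller version would still need to map a partial derivation tree onto the nodes of the corresponding derived tree before applying checks like these.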

Class images, links and handouts

Lecture references

#4 Sep 10: Tree grammars: tree substitution grammars and tree adjoining grammars

Assignments/announcements

  • Assignment 1 is due September 19 12:00 P.M. (noon), but you can continue resubmitting on CMS (Lillian will set up CMS by the night of September 11th) until noon Monday the 23rd. You should spend a minimum of 10 hours and a maximum of 13 hours coding by the September 19 deadline; you're not obligated to do any more coding after that. Along with a zip file of your code, submit an informal writeup (PDF) describing your design decisions.
    We'll discuss our experiences together in the lecture of Sep 24th.
    Please work by yourselves until the September 19th deadline; after that I'll open up some sort of discussion site to allow for collaboration.

Class images, links and handouts

#5 Sep 12: Tree adjunction

Assignments/announcements

  • No lecture Oct 3.

Class images, links and handouts

Lecture references

#6 Sep 17: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#7 Sep 19: Lecture title

Assignments/announcements

Class images, links and handouts

References

#8 Sep 24: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#9 Sep 26: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#10 Oct 1: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

Oct 3: No class — CIS 20th anniversary celebration
#12 Oct 8: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#13 Oct 10: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

Oct 15: No class — Fall Break
#14 Oct 17: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#15 Oct 22: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#16 Oct 24: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#17 Oct 29: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#18 Oct 31: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#19 Nov 5: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#20 Nov 7: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#21 Nov 12: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#22 Nov 14: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#23 Nov 19: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#24 Nov 21: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#25 Nov 26: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

Nov 28: No class — Thanksgiving Break
#26 Dec 3: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#27 Dec 5: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#28 Dec 10: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

Dec 18, 7:00pm: Final take-home exam due

Code for generating the calendar formatting adapted from Andrew Myers.