Fall 2017 CS6741: Structured Prediction for NLP

Time: Monday and Wednesday, 9:50-11:05am

Room: Gates 416 / Bloomberg 398

Instructor: Yoav Artzi, yoav@cs (office hours by appointment)

Class listing: CS6741

CMS, Piazza, Zoom (lectures), CMT (paper reviewing), AOI (topic voting)

All students are asked to bring their laptops to class. We will use laptops to broadcast content. If you can not bring a laptop, you will need to share a laptop with another student (this should work well for pairs). This applies to students on both campuses.

In this course, we will study various topics in NLP. We will focus on research results, and every few meetings switch topics. In general, most topic discussions will include technical overview, data analysis, a classical result, and recent results. Discussion of results will be done through research papers, and will include a class discussion and short paper reviews. We will use CMT to read and review papers. Each meeting will start with a 10-15min presentation followed by a discussion. In addition, we will dedicate a part of the semester to a deep dive into a single focus topic. This semester the focus topic will be reinforcement learning for NLP. The focused part of the semester will include, in addition to paper reviews and discussions, implementation and analysis of core algorithms.

We will use All Our Ideas to vote on topics. We will select topics based on votes and the general agenda of the course. There is no guarantee that the top-voted topic will be the next to be discussed (but it is very likely). Please vote a lot. The more you vote, the better the ranking is. The password for the website will be given during the first lecture.

We will continuously update this page.

Possible Topics (not exhaustive)

Tagging (e.g., part-of-speech, named-entities)
Dependency parsing
Reading comprehension
Semantic parsing (i.e., mapping language to formal representations)
Word embeddings and distributional semantics
Question answering
Discourse parsing
Chat bots
Constituency parsing
Language modeling
Machine translation
Semantic role labeling
Textual entailment
Sentiment analysis
Co-reference resolution
Lexical semantics
Vision+language (e.g., VQA, caption generation)
Information extraction
Time and event extraction
Math word problems
Generation
Dialog
Paraphrasing
Emergence of language in artificial agents
Ethics and biases in NLP models
Recurrent network models
Interactive NLP systems
Topic models
Grounded language generation
Language grounding
Graph-based representations in NLP
Your favorite topic - send it to us!

Schedule

Date	Topic	Readings	Presenter	Data	Optional Readings and Others
Aug 23	Introduction				Intro Questionnaire
Aug 28	Ethics	Caliskan et al. 2017	Max	word2vec [2]	Fast Company Bolukbasi et al. 2016
Aug 30	Ethics	Zhao et al. 2017	Andrew	ImSitu	Wired
Sep 4	No class - Labor day
Sep 6	No class - EMNLP
Sep 12	No class - EMNLP
Sep 14	No class - campus dedication
Sep 18	Guest talk: Felix Hill (Deep Mind)
Sep 20	Recurrent architectures	Linzen et al. 2016, Kuncoro et al. 2017	Alane (Linzen et al. 2016), Ryan (Kuncoro et al. 2017)
Sep 25	Recurrent architectures	Vaswani et al. 2017	Howard
Sep 27	Semantic parsing	Matuszek et al. 2011	Dipendra	GenX
Oct 2	Semantic parsing	Krishnamurthy et al. 2017	Skyler	WikiTableQuestions	Project abstracts due
Oct 4	Semantic parsing	Padmakumar et al. 2017	Valts	Experiment Logs
Oct 9	No class - holiday
Oct 11	Language+Vision	Kitaev and Klein 2017	Eyvind	Data
Oct 16	Language+Vision	Goyal et al. 2017, Agrawal et al. 2017	Trishala	VQA2
Oct 18	Grounded generation	FitzGerald et al. 2013	Zexi	GenX
Oct 23	Project proposal presentations				Project presentations due Oct 22
Oct 25	Project proposal presentations
Oct 30	Grounded generation	Mao et al. 2016	Ishaan	Google RefExp
Nov 1	Grounded generation	Wiseman et al. 2017	Esin	Boxscore
Nov 6	RL	Harrison et al. 2017	Yoav		Kaplan et al. 2017, Krening et al. 2016 Frogger
Nov 8	RL	Guu et al. 2017	Dipendra	SCONE
Nov 13	RL	Peng et al. 2017	Alane and Valts	Frames
Nov 15	RL	Fang et al. 2017	Skyler
Nov 20	RL	Nguyen et al. 2017	Ryan	Data
Nov 27	Project presentations, Class 9:50-12:00				Project presentation due Nov 26
Nov 29	No class				Project reports due Dec 12

Procedurals

Project: There are two project options: research and survey. If you choose the research option, you will do a research project (can be your own research if relevant – it probably is!). The survey option will require to write a survey paper for selected area in NLP. Both options will include: (a) proposal presentation, (b) final presentation, and © final report submission.

Auditing: Auditing is allowed and encouraged with instructor permission. It requires attending all classes, submitting reviews, presenting papers, and participating in the discussion. Auditing does not require completing the project or doing any of the project related presentations. The goal is to allow interested students to join while maintaining a lively and productive discussion group. If you want to audit, please email the instructor as soon as possible.

Repeat students: Students that already took this class in the past are not required to do the project part of the class.

Grading: The grade will include paper reviews, participation, project, and an intro questionnaire.

Participation of non-PhD students: If you are a master student or an advanced undergraduate student, and you wish to participate in the class, please email the instructor. Cornell Tech master students, please follow the application instructions emailed to you.

Policies are subject to change. If something is not clear, please contact the course staff.

Paper Reviewing Guidelines

Each paper review will require a short summary of the paper and the actual review. Some questions you may use to guide your review are (many others are valid too):

Did you like the paper? Did you find it interesting? Be honest! And explain your stand!
What are the most important things you learned from the paper? Why are they important?
Do the lessons learned generalize beyond the specific task? Do they promote our understanding of language? Do they contribute towards building an important system or application?
Is the experimental setup satisfying? Any experiments missing? Any obvious or important baseline missing? Is the ablation analysis sufficient?
If a theoretical analysis is included, do you find it satisfying? If none is included, is it missing?
Is the problem/approach well motivated?
Are you convinced by the results? Why?
Is the writing clear? Is the paper well structured?

Since this is not a real conference review, please also write what you learned form this paper and why, in your opinion, it was a good choice for reading (or why it was a bad choice). Reviews are due at 8pm the day before class.

Paper Presentation and Discussion Guidelines

Each meeting, if readings are discussed, one student will present the papers for 10-15 minutes. The presentation can use slides or can be just verbal. You should use data examples, if data is available to the topic, to illustrate your point. We will then go around the room and each student will contribute to the discussion.

Some suggested discussion question (not a comprehensive list):

Why is this problem/task important?
Why is this problem/task challenging?
What are the hard cases?
What are the easy cases?
Can you think of a simple baseline? How well will it perform?
Why are the models discussed in class and readings appropriate?
Do these models make assumptions that hurt performance? How much do these assumptions hurt?
Is there an upper bound on performance?
What assumptions are built into the empirical work? Do they introduce any limitation into the findings?

Data Analysis Guidelines

Pick at least 2-3 examples to discuss during your presentation in class. Examples should be prepared to display on screen. We will share your screen as necessary. Pick the examples to illustrate various aspects of the paper and task. The questions you should think about include (but not limited to):

Some suggested discussion question (not a comprehensive list):

What about the assumptions built into the annotation scheme? Any of them arbitrary?
Find an example that is particularly fascinating. Why is it interesting?

References

Recommended: Michael Collins, Notes on Statistical NLP (on Michael’s website)
Recommended: D. Jurafsky & James H. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Prentice Hall, Second Edition, 2009. (J&M)
Recommended: Y. Goldberg, Neural Network Methods in Natural Language Processing, 2017. (available online within Cornell)
Optional: C.D. Manning & H. Schuetze, Foundations of Statistical Natural Language Processing, Cambridge: MIT Press, 1999. (M&S) (available online, free within the Cornell network)
Optional: P. Koehn, Neural Machine Translation (book chapter draft)

Short and Incomplete List of NLP Pointers

NLP Conferences and Journals

The main publication venues are ACL, NACCL, EMNLP, TACL, EACL, CoNLL, and CL. All the paper from these publications can be found in the ACL Anthology. In addition, NLP publications often appear in ML and AI conferences, including ICML, NIPS, ICLR, AAAI, IJCAI. A calendar of NLP events is available here, and ACL sponsored events are listed here.

Corpora and Other Data

Tagging

Part-of-speech Tags

Both parsing corpora below (PTB and UD) contain POS tags. Each parse tree contains POS tags for all leaf nodes. You can view a sample of the PTB in NLTK:

>> import nltk
>> print ' '.join(map(lambda x: '/'.join(x), nltk.corpus.treebank.tagged_sents()[0]))
Pierre/NNP Vinken/NNP ,/, 61/CD years/NNS old/JJ ,/, will/MD join/VB the/DT board/NN as/IN a/DT nonexecutive/JJ director/NN Nov./NNP 29/CD ./.
>> print ' '.join(map(lambda x: '/'.join(x), nltk.corpus.treebank.tagged_sents(tagset='universal')[0]))
Pierre/NOUN Vinken/NOUN ,/. 61/NUM years/NOUN old/ADJ ,/. will/VERB join/VERB the/DET board/NOUN as/ADP a/DET nonexecutive/ADJ director/NOUN Nov./NOUN 29/NUM ./.

The universal tag set is described here. The PTB tag set is described here.

Named Entity Recognition Data

The CoNLL 2002 shared task is available in NLTK:

>> import nltk
>> len(nltk.corpus.conll2002.iob_sents())
35651
>> len(nltk.corpus.conll2002.iob_words())
678377
>> print ' '.join(map(lambda x: x[0] + '/' + x[2], nltk.corpus.conll2002.iob_sents()[0]))
Sao/B-LOC Paulo/I-LOC (/O Brasil/B-LOC )/O ,/O 23/O may/O (/O EFECOM/B-ORG )/O ./O

CoNLL 2002 is annotated with the IOB annotation scheme and multiple entity types.

NYT Recipe Data

This is another example of tagging. The task is explained here, and the data release is described here.

Dependency Parsing

The Universal Dependencies (UD) project is publicly available online. The website includes statistics for all annotated languages. You can easily download v1.3 from here. UD files follow the simple CoNLL-U format.

Constituency Parsing

The Penn Treebank is available from the LDC You will find tgrep useful for quickly searching the corpus for patterns. NLTK can also be used to load parse trees. A few more browsers are available online.

Machine Translation

The WMT shared task from 2016 is a good source for newswire bi-text.

Textual Entailment

TE has been studied extensively for more than a decade now. Recently, SNLI has been receiving significant attention.

Reading Comprehension

Semantic Parsing

We will look at three data sets commonly used for semantic parsing:

GeoQuery: A natural language interface to a small US geography database. The original data is available here, and the original query language is described here. The data with lambda calculus logical forms is available here.
ATIS: A natural language interface for a flights database. The data is available from the LDC.
Navi: Instructional language for robot navigation. The original data is described here, but we recommend using the data here.

Word Analogy

MSR dataset: 8,000 analogy questions.
GOOGLE dataset: 19,544 analogy questions.
SEMEVAL dataset: 79 distinct relation types.

Question Answering

VQA
Paralex

Vision and Language

Online Demos, Systems, and Tools

If you encounter an interesting demo or system not listed here, please email the course instructor.

SystemT Information Extraction Framework (Online Course)
CoreNLP: POS tagging, NER, co-references, dependency parsing
word2vecpride: word2vec in Pride and Prejudice
word2vec Playground
UWTime
Parsey McParseface
Cornell SPF
UIUC NLP Demos
ExplosionAI Demos (NER, dependency parsing, word embeddings)

Deep Learning frameworks and tools: