Natural Language Processing


Spring 2016, CS 5740
Time: Tuesday and Thursday, 9:30-10:45am
Room: Grizzly
Instructor: Yoav Artzi (office hours: Tue, 3-4pm, Baron)
Teaching Assistant: Dipendra Misra (office hours: Fri, 3-4pm, Baron)
Contact: Piazza Discussion Group [join here]
Assignments and Submissions: CMS


This course constitutes an introduction to natural language processing (NLP), the goal of which is to enable computers to use human languages as input, output, or both. NLP is at the heart of many of today's most exciting technological achievements, including machine translation, automatic conversational assistants and Internet search. Possible topics include summarization, machine translation, sentiment analysis and information extraction as well as methods for handling the underlying phenomena (e.g., syntactic analysis, word sense disambiguation, and discourse analysis).

Lectures

Schedule and topics are subject to change.

Textbooks and Readings

Course Procedures

All policies are subject to change.

Grading: Your grade will be determined by the assignments (85%) and participation (15%). The assignment portion of the grade will be divided equally among five programming assignments. Participation covers the classroom, the discussion board, and Slack, and participation in any of these avenues counts toward the grade. Of course, we would love for you to participate in all of them, but it is your choice where you feel most comfortable. Participation includes asking questions, posting articles, starting discussions, and helping fellow students in the course forums.
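As a rough illustration of the weighting above, the sketch below combines five assignment scores and a participation score into a final grade. It assumes every score is on a 0-100 scale; the function name and the way participation is scored are hypothetical and not part of the course policy.

```python
# Weights taken from the grading policy; everything else is an assumption.
ASSIGNMENT_WEIGHT = 0.85
PARTICIPATION_WEIGHT = 0.15
NUM_ASSIGNMENTS = 5


def course_grade(assignment_scores, participation_score):
    """Combine scores (assumed 0-100) into a final grade on the same scale."""
    if len(assignment_scores) != NUM_ASSIGNMENTS:
        raise ValueError(f"expected {NUM_ASSIGNMENTS} assignment scores")
    # The five assignments are weighted equally, so a plain average suffices.
    assignment_average = sum(assignment_scores) / NUM_ASSIGNMENTS
    return (ASSIGNMENT_WEIGHT * assignment_average
            + PARTICIPATION_WEIGHT * participation_score)
```

Under these weights, each assignment contributes 85% / 5 = 17% of the final grade.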

Collaboration: All assignments will be done in groups of two. Groups will be determined randomly by the course instructor and shuffled after each assignment, so you will have a different partner for every assignment. If the class has an odd number of students, one group might include three students.

Programming Languages and External Code: The recommended language for the assignments is Java. For most assignments, we will provide extensive support code in Java only and encourage you to use it. However, you may choose to implement your assignments in Java or Python, and you may decide not to use the support code. If you use Python, naturally, you will not use the support code. If you choose not to use the support code: (a) you will have to implement everything from scratch so we can run it with a vanilla Python/Java installation with no third-party packages installed, (b) you must document your entire code to make it readable to us (documentation is also required when using the support code, but only for the parts you write), (c) the input and output must be identical to those of the support code implementation, (d) the evaluation output must be identical as well. Regardless of the language, you may not use NLTK, OpenNLP, Mallet, Stanford CoreNLP or any other ML or NLP framework when implementing the assignment. The goal is to implement algorithms, not to simply use an off-the-shelf solution.

Late Submissions: If there's a good reason why you can't submit on time, please email the course instructor to ask for an extension, with a detailed description of the reason. Decisions will be made case by case, and previous requests by the group members will be taken into account. Don't wait until the last minute with such requests! In general, you should be able to finish the assignments on time, but we understand that special circumstances and emergencies do occur, even if rarely. Submissions past the deadline without pre-authorization, but within four days, will still be graded, but at a weight of 70% of the full grade. That is, the maximum grade will be 70/100. After four days, late submissions will not be considered.
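The late penalty can be sketched as a small function. This is only an illustration under stated assumptions: lateness is measured in (possibly fractional) days past the deadline, a granted extension waives the penalty entirely, and a submission that is "not considered" receives zero. The function and parameter names are hypothetical.

```python
# Values taken from the late-submission policy; the rest is assumed.
LATE_WEIGHT = 0.70
LATE_WINDOW_DAYS = 4


def late_adjusted_score(raw_score, days_late, extension_granted=False):
    """Return the score (assumed 0-100) after applying the late policy."""
    if extension_granted or days_late <= 0:
        # On time, or covered by a pre-authorized extension (assumption).
        return float(raw_score)
    if days_late <= LATE_WINDOW_DAYS:
        # Graded at 70% weight, so the maximum becomes 70/100.
        return LATE_WEIGHT * raw_score
    # Past the four-day window: not considered (assumed to score zero).
    return 0.0
```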

Prerequisites: CS 2110 or equivalent, and CS 4780, CS 4786, or CS 5785 with a grade of B or above. Auditing does not count. If you did not complete any of these classes, or your grade is below B, enrollment requires instructor permission. For any other questions regarding enrollment, show up in person on the first day of class. Personal questions will be addressed following the lecture.

Auditing: A limited number of auditing slots are available for the class on a first-come, first-served basis. Auditing has no requirements or prerequisites. Contact Sarah to enroll as an auditor.