CS6740/IS6300: Advanced Language Technologies
Thinking about taking this class? The info "above the horizontal line" on this page is meant to help you out!
(Evolving) thoughts on content
See the 1st-day slides
Since there may still be people "shopping" courses, here also is a publicly-viewable draft of the 2nd-day slides
This is a graduate-level class on advanced technologies for the computational treatment of information in human-language form. The learning outcomes are that on completing this course, students should be able to:
- Describe fundamental trends in modern natural-language processing (NLP)
- Summarize and critically analyze current NLP research papers
Given NLP's current state of flux, the instructor shares these learning goals as well! The course will thus be very reading- and discussion-based, where we are all comfortable talking together about what we don't understand and teaching each other what we do understand; and we will make some content decisions on the fly, as a small group. This has implications for:
Intended audience and enrollment restrictions:
- Only Ph.D. and M.S. students may enroll.
(To be clear, this means this course is not open to MEngs, MPSs, or undergraduates.)
No students may audit.
- If you are looking for a lecture-based course giving an instructor-driven, high-level, structured overview of NLP, CS6740 is probably not the right class for you. Consider instead these alternate spring 2023 courses: CS4300 Language and Information or CS4744 Computational Linguistics I.
- Additional prerequisites:
(1) CS 2110 or equivalent. (2) A course in artificial intelligence or any relevant subfield (e.g., NLP, information retrieval, machine learning, Cornell CS courses numbered 47xx or 67xx). (3) Proficiency with using machine learning tools (e.g., comfort with assessing a classifier's performance using cross-validation).
- Strongly recommended personal "outlook": willingness to engage and sometimes struggle with papers outside of one's comfort zone; willingness to collaboratively work through the technical content of a paper line-by-line; curiosity to independently explore literature outside one's area of expertise.
Workload and assessment
- In-class presentations (exact number depends on number of students enrolled and the difficulty of the papers we tackle). These may involve meeting with the instructor beforehand.
- Participation in discussion, either during class meetings or offline
- Midterm paper that reviews and critically analyzes the class material
- Final paper that reviews and critically analyzes the class material
- For days where another student is presenting: all non-presenting students are expected to prepare for class by at least skimming the abstract and intro of the paper(s) to be presented
To receive an A: in your presentation, demonstrate active intellectual engagement with the papers you are required to present, beyond simply summarizing them; participate meaningfully in at least 85% of the discussions; in the midterm paper, demonstrate active intellectual engagement with the material in the course so far, including thoughtful synthesis of common trends and issues across multiple papers; in the final paper, do so across at least two subfields.
To receive a B: in your presentation, provide accurate summaries that cover almost all important points, but not too many non-key points, of the papers you are required to present; participate meaningfully in at least 75% of the discussions; in the midterm paper, provide accurate summaries that cover almost all important points, but not too many non-key points, and that describe a common trend or issue across multiple papers in an organized fashion; in the final paper, do so across at least two subfields.
To receive a C: in your presentation, for the papers you are required to present, provide accurate summaries that go beyond essentially reiterating what is in the introduction and the abstract; participate meaningfully in at least 50% of the discussions; in the midterm and final papers, provide accurate summaries of the material in the course so far that go beyond essentially reiterating what is in the introduction and the abstract.
To receive a D: provide accurate summaries of the papers you are required to present; participate meaningfully in at least 25% of the discussions; in the midterm paper, provide accurate summaries of the papers you presented and of one paper that you did not present; in the final paper, provide accurate summaries of the papers you presented and of two papers that you did not present.
To receive a grade of F: do not meet the requirements for a D.
It is expected that most or all students will be able to achieve an A. We do not anticipate awarding A+s, given the nature of the grading rubric above. (Focus on your own research, not trying to optimize your grade in this course.)
- Use of text generation/editing systems: For each component of the workload, the vast majority of the intellectual work must be originated by you, not by text generation systems. It is OK to use aids for writing fluency, but note that writing fluency is not part of the assessment rubrics above anyway.
- Example of something that is allowed: you write the initial draft(s), review its contents and double-check with the original paper. You then use some form of text generation system to proofread and improve the flow. You do not use the system’s output to add extra content.
- Example of something that is definitely not allowed: You essentially use a text generation system to generate an early draft, even if you later post-edit and correct the output.
- Example of something that is OK but requires special treatment: You start with the procedure in the first example above (the allowed one). But the system output includes good points that you hadn’t thought of before, or makes you realize that a point you had made isn’t quite right.
- You may include the new material and/or make appropriate edits, but you should mention what specific system(s) you used and what changes you made based on it.
- Attendance: Please attend all class meetings that you are reasonably able to.
If attendance isn’t a reasonable option for a given class meeting, please contact the instructor ahead of time, if possible, for planning purposes.
- Illness is always a valid reason to not attend and is not held against participation accounting.
- Deadlines: We do not have slip days, and there is no "you can submit late for a small penalty": you need to hit the deadlines. But if there are extenuating circumstances, please email the instructor and we can talk. (Still submit what you have before the deadline, so we have an indication of your progress at that point.)
- SDS accommodations: The instructor(s) have online access to SDS letters regarding accommodations for exams and other course matters, and will honor these accommodations. As recommended by the SDS office, we do ask that for each deadline, you let the instructor know beforehand in a timely fashion whether you wish to apply your accommodations.
- Academic integrity
Claiming the work of others as your own is intellectual fraud and a violation of academic integrity. To avoid this, always track and credit your sources appropriately.
Each student in this course is expected to abide by the Cornell University Code of Academic Integrity. The Dean of the Faculty’s page has more information on the Code and related procedures: https://theuniversityfaculty.cornell.edu/dean/academic-integrity/