Prerequisites All of the following: CS 2110 or equivalent programming experience (Python encouraged); a course in artificial intelligence or any relevant subfield (e.g., NLP, information retrieval, machine learning, Cornell CS courses numbered 47xx or 67xx); proficiency with using machine learning tools (e.g., fluency at training an SVM, comfort with assessing a classifier’s performance using cross-validation)
Enrollment Limited to [[PhD and [CS MS] students] who meet the prerequisites]. If you are interested in taking the class but do not belong to these categories, come to the first day of class, when enrollment will be discussed. Auditing (either officially or unofficially) is not permitted.
Piazza page Course announcements and Q&A/discussion site. Social interaction and all that, you know. (Access code provided on first day of classes.)
Overview of course schedule. Details subject to change. Full schedule is maintained on the main course webpage.
Pedagogical purpose
Course overview
A1: Pilot empirical study for a research idea based on provided datasets and readings.
#2 - #3
Get-to-know-you exercises to get everyone familiar and comfortable with each other. A1 related discussions.
How to form research questions and quickly test their feasibility.
#4 - #7
Lecture topics related to the A1 startup projects: Conversational Structure, Linguistic Cues, Conversation-specific Phenomena.
Case studies to explore some topics and research styles we find interesting.
Next block of meetings
Discussion of proposed projects based on starter projects and on topical readings
Practice with fast research-idea generation. Feedback as to what proposals are most interesting, most feasible, etc.
Discussion of student project proposals, based on the readings for that class meeting. Each class meeting thus involves everyone reading at least one of the two assigned papers and posting a new research proposal based on the reading to Piazza.
Thoughtfulness and creativity are most important, but take feasibility into account.
Next block of meetings
Lectures on, potentially, linguistic socialization, conversational failure, moderation, influence, persuasion, diffusion, discourse structure, advanced language modeling
Familiarity with foundational material: concepts and methodology.
Potentially some assignments based on the lectures.
Remainder of the course
Activities related to course projects
Development of a "full-blown" research project (although time restrictions may limit ambitions). For our purposes, "interesting" is more important than "thorough".
Some time in December (to be determined by the registrar): final project writeup due
Grading Of most interest are productive research-oriented discussion participation (in class and on Piazza), interesting research proposals and pilot studies, and a good-faith final research project.
Academic Integrity Academic and scientific integrity compels one to properly attribute to others any work, ideas, or phrasing that one did not create oneself. To do otherwise is fraud.
We emphasize certain points here. In this class, talking to and helping others is strongly encouraged. You may also, with attribution, use the code from other sources. The easiest rule of thumb is, acknowledge the work and contributions and ideas and words and wordings of others. Do not copy or slightly reword portions of papers, Wikipedia articles, textbooks, other students' work, Stack Overflow answers, something you heard from a talk or a conversation or saw on the Internet, or anything else, really, without acknowledging your sources. See and for more information and useful examples.
This is not to say that you can receive course credit for work that is not your own (e.g., taking someone else's report and putting your name at the top, next to the other person(s)' names). However, violations of academic integrity (e.g., fraud) undergo the academic-integrity hearing process on top of any grade penalties imposed, whereas not following the rules of the assignment only risks grade penalties.
ACL anthology of all conferences, journals and workshops published under the aegis of the Association for Computational Linguistics; ACM digital library proceedings publication archive for WWW; AAAI proceedings archive for ICWSM
Bryan, Christopher J., Gregory M. Walton, Todd Rogers, and Carol S. Dweck. 2011. Motivating voter turnout by invoking the self. Proceedings of the National Academy of Sciences 108 (31): 12653-12656.
Chong, Dennis and James N. Druckman. 2007. Framing theory. Annual Review of Political Science 10:103–26.
Hopkins, Daniel J. 2017. The exaggerated life of death panels? The limited but real influence of elite rhetoric in the 2009–2010 health care debate. Political Behavior.
[official link]
["ungated" version]
Taraborelli, Dario and Giovanni Luca Ciampaglia. Beyond notability. Collective deliberation on content inclusion in Wikipedia. Second international workshop on quality in techno-social systems, pp. 122-125. [alt link]
Related quote: "There is no such thing as conversation. There are intersecting monologues, that's all".
Rebecca West's short story, "There is no conversation".
UBC BC3 Blog Corpus: 7000 blog conversations with user-labeled comments from 6 popular websites (Slashdot, Macrumors, AndroidCentral, Dailykos, BusinessInsider, TSN). Slashdot includes "Funny" tags.
CORPS: corpus of political speeches tagged with specific audience reactions, such as APPLAUSE or LAUGHTER.
Intelligence Square Debate Dataset: a collection of public debates with metadata (audience voting results pre- and post-debate, and audience reaction markers).
Sep 17: Social aspects of linguistic coordination
Upcoming deadlines: A1 Part E and presentations due next week
Class images, links and handouts
Lecture references
Danescu-Niculescu-Mizil, Cristian, Lillian Lee, Bo Pang, and Jon Kleinberg.
Echoes of power: Language effects and power differences in social interaction.
WWW, pp. 699--708.
[ACM link]
[paper "homepage" (paper, slides, data, etc.)]
Oct 8, Oct 10, Oct 17: Project-inspiring discussions based on readings (A2): Moderation Bias, Longitudinal Development, Polarization, Code Switching.
Demszky, Dorottya, Nikhil Garg, Rob Voigt, James Zou, Jesse Shapiro, Matthew Gentzkow, and Dan Jurafsky. 2019. “Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings.”
Luu, Kelvin, Chenhao Tan, and Noah A Smith. 2019. “Measuring Online Debaters’ Persuasive Skill from Text over Time.”
Shen, Qinlan, and Carolyn Rose. 2019. “The Discourse of Online Content Moderation: Investigating Polarized User Responses to Changes in Reddit’s Quarantine Policy.”
Yoder, Michael, Shruti Rijhwani, Carolyn Rosé, and Lori Levin. 2017. “Code-Switching as a Social Act: The Case of Arabic Wikipedia Talk Pages.”
Additional related references on Piazza
Oct 21: From hypothesis to research: Second case study (Socialization)
Lim, Kenneth (who took this class!).
fightin-words 1.0.4.
Compliant with scikit-learn and distributed via PyPI; borrows (with acknowledgment)
from Jack's version.
F. Jelinek, R.L. Mercer and S. Roukos. Principles of Lexical Language Modeling for Speech Recognition. Advances in Speech Signal Processing, S. Furui and J. Sondhi, Eds. M. Dekker Publishers, New York, NY 1991. Pp.651-700
Gale, William A. and Kenneth W. Church. 1994. What's wrong with adding one. Corpus-based Research Into Language: In Honour of Jan Aarts, pp. 189--200.
Code for generating the calendar formatting adapted from the original versions created by Andrew Myers.
Portions of the content of this website and course were created by collaboration between Cristian Danescu-Niculescu-Mizil and Lillian Lee over multiple runnings of this course.