More and more of life is now manifested online, and many of the digital traces left by human activity are increasingly recorded in natural-language format. This research-oriented course examines the opportunities for natural language processing to contribute to the analysis and facilitation of socially embedded processes. Possible topics include analysis of online conversations, learning social-network structure, analysis of text in political or legal domains, and review aggregation systems.


Prerequisites, enrollment, related classes

Prerequisites: All of the following: CS 2110 or equivalent programming experience; a course in artificial intelligence or any relevant subfield (e.g., NLP, information retrieval, machine learning, Cornell CS courses numbered 47xx or 67xx); proficiency with machine learning tools (e.g., fluency at training an SVM, comfort with assessing a classifier’s performance using cross-validation).

Enrollment Limited to [[PhD and [CS MS] students] who meet the prerequisites]. Auditing (either officially or unofficially) is not permitted.

Related classes: see Cornell's NLP course list, plus GOVT 6461, Public Opinion [the 2012 syllabus; time, location, some material, and paper coverage differ in fall 2017], and COMM 6750, Research methods for social networks and social media.

The homepage for the previous running of CS6742 may also be useful. Here is the list of all prior runnings: 2016 fall :: 2015 fall :: 2014 fall :: 2013 fall :: 2011 spring

Administrative info

CMS page http://cmsx.csuglab.cornell.edu. Site for submitting assignments, unless otherwise noted. You may find this graphically-oriented guide to common operations useful: see how to replace a prior submission (point 1), how to tell if CMS successfully received your files (point 2), how to form a group (point 4).

Course discussion site https://blogs.cornell.edu/nlpsoc2017fa (access restricted to enrolled students). Course announcements and Q&A/discussion site. Social interaction and all that, you know.

Office hours and contact info See Prof. Lee's homepage and scroll to the section on Contact and availability info.

Grading Of most interest are productive research-oriented discussion participation (in class and/or on the course discussion site), interesting research proposals and pilot studies, and a good-faith final research project.

Academic Integrity Academic and scientific integrity compels one to properly attribute to others any work, ideas, or phrasing that one did not create oneself. To do otherwise is fraud.

Certain points deserve emphasis here. In this class, talking to and helping others is strongly encouraged. You may also, with attribution, use code from other sources. The easiest rule of thumb is, acknowledge the work and contributions and ideas and words and wordings of others. Do not copy or slightly reword portions of papers, Wikipedia articles, textbooks, other students' work, Stack Overflow answers, something you heard from a talk or a conversation or saw on the Internet, or anything else, really, without acknowledging your sources. See "Acknowledging the Work of Others" in The Essential Guide to Academic Integrity at Cornell and http://www.theuniversityfaculty.cornell.edu/AcadInteg/ for more information and useful examples.

This is not to say that you can receive course credit for work that is not your own — e.g., taking someone else's report and putting your name at the top, next to the other person(s)' names. However, violations of academic integrity (e.g., fraud) undergo the academic-integrity hearing process on top of any grade penalties imposed, whereas not following the rules of the assignment “only” risks grade penalties.

Overall course structure

Lecture #1
Agenda: Course overview
Assignments: A1 released: pilot empirical study for a research idea based on the given readings.

Lectures #2 - #4
Agenda: Lectures on topics related to the A1 readings
Pedagogical purpose: Case studies to explore some interesting topics and research styles. Get-to-know-you exercises to get everyone familiar and comfortable with each other.

Next block of meetings
Agenda: Discussion of proposed projects based on the readings
Pedagogical purpose: Practice with fast research-idea generation. Feedback as to what proposals are most interesting, most feasible, etc.
Assignments: Discussion of student project proposals, based on the readings for that class meeting. For each class meeting, everyone reads at least one of the two assigned papers and posts a new research proposal based on the reading to the course discussion site. Thoughtfulness and creativity are most important, but take feasibility into account.

Next block of meetings
Agenda: Lectures on, potentially, linguistic coordination, linguistic adaptation, influence, persuasion, diffusion, discourse structure, advanced language modeling.
Pedagogical purpose: Foundational material
Assignments: Potentially some assignments based on the lectures.

Remainder of the course
Agenda: Activities related to course projects
Pedagogical purpose: Development of a "full-blown" research project (although time restrictions may limit ambitions). For the purposes of this course, "interesting" is more important than "thorough".

Resources

 

Lectures

#1 Aug 22: Introduction

Assignments/announcements

  • Assignment A1: Pilot empirical research study. Note the first deadline (of several) on Friday Aug. 25.

Class images, links and handouts

Lecture references

#2 Aug 24: A1 inspiration: Overview of conversations

Class images, links and handouts

Gespraechsgemetzel
Image: photo of entry 106 of Ben Schott, Schottenfreude: German Words for the Human Condition (2013)

Lecture references

Other references

#3 Aug 29: More A1 inspiration: discussion and persuasion

Assignments/announcements

  • First time in the new room (Gates 344 breakout room)

Class images, links and handouts

Wondermark cartoon
Image credit: David Malki !, In which Debate is debated, Feb 21st, 2014.

Lecture references

#4 Aug 31: Linguistic coordination

Assignments/announcements

  • Upcoming deadlines (default: 5pm unless otherwise noted): Friday Sept. 1, 2:30pm; Monday Sept. 4

Class images, links and handouts

Lecture references

#5 Sep 5: Real-time measurement of coordination; A1 check-ins

Assignments/announcements

  • Life can be easier:
    • View discussion-site comments in reverse-chronological order by clicking on the speech balloon in the top bar
    • Use Cornell's Passkey to access restricted content in your browser. (So I will stop posting Cornell-access-specific URLs.)
  • Remember what we talked about sharing on the course discussion site!

References

#6 Sep 7: Appointments (see email for signup link)
#7 Sep 12: A1 presentations
#8 Sep 14: News, influence and information propagation, part 1

Assignments/announcements

Class images, links and handouts


Image source: David Malki ! Wondermark 1209: Talk and Awe

Lecture references

Other references

#9 Sep 19: News, influence and information propagation, part 2

Assignments/announcements

Class images, links and handouts

  • ICWSM 2011 Spinn3r dataset: "386 million blog posts, news articles, classifieds, forum posts and social media content between January 13th and February 14th"

Lecture references

#10 Sep 21: Proposals discussion (A2)

Assignments/announcements

The readings

Class images, links and handouts


Image source: Dorothy Gambrell, Cat and Girl: Steal This Cat and Girl

Lecture references

Other references

#11 Sep 26: Words across space, community, and time

Assignments/announcements

Class images, links and handouts

Lecture references

Other references

#12 Sep 28: Proposals discussion (A3)

Assignments/announcements

A5, the final-project proposal assignment, has been released. Note the multiple phases and due dates.

The readings

Class images, links and handouts

Image source: English Language & Usage Stack Exchange. Click through for some interesting answers!

Lecture references (thanks to everyone for these pointers!)

Other references

#13 Oct 3: (Misc.) topics and power

Assignments/announcements

In-class reminder: A5, the final-project proposal assignment, has been released. Note the multiple phases and due dates.

Class images, links and handouts

Lecture references

#14 Oct 5: Proposals discussion (A4)

Assignments/announcements

The readings

Lecture references

Other references

Oct 10: No class — Fall Break
#15 Oct 12: Optional project-proposal appointments

Assignments/announcements

#16 Oct 17: What makes two sub-languages different?

Assignments/announcements

Class images, links and handouts

Image source: http://www.keepcalm-o-matic.co.uk/p/keep-calm-and-never-tell-me-the-odds-6/.

Lecture references

#17 Oct 19: How different are two language models?

Assignments/announcements

  • Reminder: Phase 3 of A5 is due on Monday; sign up beforehand for, and attend, a mandatory feasibility-check appointment on Tuesday.

Class images, links and handouts

Lecture references

#18 Oct 24: Feasibility-check appointments

Assignments/announcements

  • Only come to class during your scheduled appointment; see Phase 3 of A5.
#19 Oct 26: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#20 Oct 31: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#21 Nov 2: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#22 Nov 7: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#23 Nov 9: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#24 Nov 14: (Possibly no lecture)

Assignments/announcements

Class images, links and handouts

Lecture references

#25 Nov 16: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

#26 Nov 21: Lecture title

Assignments/announcements

Class images, links and handouts

Lecture references

Nov 23: No class — Thanksgiving Break
#27 Nov 28: (probably in-class presentations)

Assignments/announcements

Class images, links and handouts

References

#28 Nov 30: (probably in-class presentations)

Assignments/announcements

Class images, links and handouts

Lecture references

Dec 11, 4:30pm: Final project writeup due

Code for generating the calendar formatting adapted from Andrew Myers. Portions of the content of this website and course were created by collaboration between Cristian Danescu-Niculescu-Mizil and Lillian Lee over multiple runnings of this course.