More and more of life is now manifested online, and many of the digital traces left by human activity are recorded in natural-language format. This research-oriented course examines the opportunities for natural language processing to contribute to the analysis and facilitation of socially embedded processes. Possible topics include analysis of online conversations, learning social-network structure, analysis of text in political or legal domains, and review aggregation systems. CDNM's web page


Prerequisites, course selection, enrollment

Prerequisites All of the following: CS 2110 or equivalent programming experience; a course in artificial intelligence or any relevant subfield (e.g., NLP, information retrieval, machine learning); proficiency with using machine learning tools (e.g., fluency at training an SVM, knowledge of how to assess a classifier’s performance using cross-validation)

Enrollment CS/IS PhD students may enroll online. Other students interested in adding the course, (wel)come to the first day of class. Enrollment questions will be addressed then, when we have a better sense of what the demand is and how many CS/IS PhD students are interested in taking the class.

Choosing among NLP courses: How do I know which one is right for me?

In 2016-2017, we are blessed with a plethora of NLP-related offerings!

At the graduate level:

For undergraduate courses on offer, consult the Cornell NLP course list.

For more information before classes begin The webpage of the previous running (Fall 2015) of this course gives a general idea of what the course will be like.

Administrative info and overall course structure

Course homepage http://www.cs.cornell.edu/courses/cs6742/2016fa. Main site for course info, assignments, readings, lecture references, etc.; updated frequently.

CMS page http://cms.csuglab.cornell.edu. Site for submitting assignments, unless otherwise noted.

Piazza page http://piazza.com/cornell/Fall2016/cs6742 Course announcements and Q&A/discussion site. Social interaction and all that, you know.

Contacting the instructor

Overview of course schedule. Details subject to change. Full schedule is maintained on the main course webpage.

Each entry below lists the lecture(s), then the agenda, pedagogical purpose, and assignments where applicable.

#1
Agenda: Course overview
Assignments: Pilot empirical study for a research idea based on readings provided.

#2 - #4
Agenda: Lecture topics related to the A1 readings: Online reviews: individual expression, community dynamics; Online asynchronous conversations.
Pedagogical purpose: Case studies to explore some potentially interesting topics and research styles. Get-to-know-you exercises to get everyone familiar and comfortable with each other.

Next 6 meetings, not counting presentations or discussions
Agenda: Lectures on, potentially, linguistic coordination, linguistic adaptation, influence, persuasion, diffusion, discourse structure, advanced language modeling
Pedagogical purpose: Foundational material
Assignments: Potentially some assignments based on the lectures.

Next large block of meetings
Agenda: Discussion of proposed projects based on the readings
Pedagogical purpose: Practice with fast research-idea generation. Feedback as to what proposals are most interesting, most feasible, etc.
Assignments: Discussion of student project proposals, based on the readings for that class meeting. Each class meeting thus involves everyone reading at least one of the two assigned papers and posting a new research proposal based on the reading to Piazza. Thoughtfulness and creativity are most important, but take feasibility into account.

Remainder of the course
Agenda: Activities related to course projects
Pedagogical purpose: Development of a "full-blown" research project (although time restrictions may limit ambitions). For our purposes, "interesting" is more important than "thorough".

Some time in December (to be determined by the registrar): final project writeup due

Grading Of most interest are productive research-oriented discussion participation (in class and on Piazza), interesting research proposals and pilot studies, and a good-faith final research project.

Academic Integrity Academic and scientific integrity compels one to properly attribute to others any work, ideas, or phrasing that one did not create oneself. To do otherwise is fraud.

We emphasize certain points here. In this class, talking to and helping others is strongly encouraged. You may also, with attribution, use code from other sources. The easiest rule of thumb is: acknowledge the work and contributions and ideas and words and wordings of others. Do not copy or slightly reword portions of papers, Wikipedia articles, textbooks, other students' work, Stack Overflow answers, something you heard from a talk or a conversation or saw on the Internet, or anything else, really, without acknowledging your sources. See http://www.cs.cornell.edu/courses/cs6742/2011sp/handouts/ack-others.pdf and http://www.theuniversityfaculty.cornell.edu/AcadInteg/ for more information and useful examples.

This is not to say that you can receive course credit for work that is not your own (e.g., taking someone else's report and putting your name at the top, next to the other person(s)' names). However, violations of academic integrity (e.g., fraud) undergo the academic-integrity hearing process on top of any grade penalties imposed, whereas not following the rules of the assignment only risks grade penalties.

Resources

 

Lectures

Note that assignments will remain visible even when details are hidden.
#1 Aug 23: Course overview: scope, course goals, course design
  • Details will appear here before each lecture.
Assignments/announcements:
  • Assignment A1 released
  • Student-information assignment released: see handout

Class images, links and handouts

Datasets

References


Code for generating the calendar formatting adapted from the original versions created by Andrew Myers.