Course homepage Main site for course info, assignments, readings, lecture references, etc.; updated frequently.
Course CMS page Site for submitting assignments, unless otherwise noted.
Course Piazza page Course announcements and Q&A/discussion site. Social interaction and all that, you know.
Instructor Professor Lillian Lee. For contact info, see
Time and place Tuesdays and Thursdays, 10:10-11:25, Hollister 401 (since this room has reconfigurable seating) Gates Hall 344 breakout room (quietly enter through 344, since students are working there, and go to the room on the right).
This page last modified Fri September 19, 2014 4:40 PM.

Brief course description More and more of life is now manifested online, and many of the digital traces that are left by human activity are increasingly recorded in natural-language format. This research-oriented course examines the opportunities for natural language processing to contribute to the analysis and facilitation of socially embedded processes. Possible topics include sentiment analysis, learning social-network structure, analysis of text in political or legal domains, review aggregation systems, analysis of online conversations, and text categorization with respect to psychological categories.

Prerequisites As previously announced in the 2014-2015 Courses of Study, enrollment is limited to PhD students except by permission of instructor. August 14 addition: given the number of PhD students who have registered for credit, permission will not be granted to non-PhD students, and auditing will not be allowed. Required background: CS 2110 or equivalent programming experience, and at least one course in artificial intelligence or any relevant subfield (e.g., NLP, information retrieval, machine learning).

Related courses In Fall 2014, there's CS4744 Computational linguistics, CS6783 Machine learning theory, CS6788/INFO 6150 Advanced topic modeling, ECE 5960 Graphical models, IS 6320 Games, economic behavior, and the Internet. In Spring 2015, there's CS 4740 Natural language processing, and new IS professor Cristian Danescu-Niculescu-Mizil may be offering a course quite similar to CS6742.

Informative links


Quick links: overview | reviews, helpfulness, social interaction | what do conversations "look" like? | discourse

Lecture | Date | Agenda and references | Assignments and other handouts
#1 Aug 26

Course overview: scope, course goals, course design

The school of Athens - people talking and reading

Image source: Some people are speaking to each other; some are reading and perhaps being influenced by that text; some are writing text, perhaps hoping to have an effect on others; some texts are being read by several people simultaneously.

Scan of lecture notes

Images and webpages displayed in class:


Bryan, Christopher J, Gregory M Walton, Todd Rogers, and Carol S Dweck. 2 August 2011. Motivating voter turnout by invoking the self. Proceedings of the National Academy of Sciences 108 (31): 12653-12656.

Chong, Dennis and James N. Druckman. 2007. Framing theory. Annual Review of Political Science 10:103--126.

Assignment 1 (A1) officially released
#2 28

To what extent is there social interaction on review sites?

Image source: Dorothy Gambrell, Cat and Girl: Permission policy here.

Scan of lecture notes

Images and webpages displayed in class:


Danescu-Niculescu-Mizil, Cristian, Robert West, Dan Jurafsky, Jure Leskovec, and Christopher Potts. 2013. No country for old members: User lifecycle and linguistic change in online communities. Proceedings of WWW, pp. 307--318.

Gilbert, Eric and Karrie Karahalios. 2010. Understanding deja reviewers. Proceedings of CSCW, pp. 225--228. [ACM link]

Michael, Loizos and Jahna Otterbacher. 2014. Write like I write: Herding in the language of online reviews. Proceedings of ICWSM.

Mimno, David. Data carpentry. 2014.

Pinch, Trevor and Filip Kesler. 2011. How Aunt Ammy gets her free lunch: A study of the top-thousand customer reviewers at

#3 Sep 2

Review "quality" and "helpfulness": a lens for studying social influence

Image source: Randall Munroe, xkcd (click on image for original link). Expletive obscured in this presentation.

Scan of lecture notes

Images and handouts from class

References on lecture topics

Cheng, Justin, Cristian Danescu-Niculescu-Mizil, and Jure Leskovec. 2014. How community feedback shapes user behavior. Proceedings of ICWSM.

Danescu-Niculescu-Mizil, Cristian, Gueorgi Kossinets, Jon Kleinberg, and Lillian Lee. 2009. How opinions are received by online communities: A case study on helpfulness votes. Proceedings of WWW, pp. 141--150. [alt link]

Ghose, Anindya and Panagiotis Ipeirotis. 2011. Estimating the helpfulness and economic impact of product reviews: Mining text and reviewer characteristics. IEEE Transactions on Knowledge and Data Engineering 23(10): 1498--1512. Official link can be found through Worldcat, e.g., here.

Muchnik, Lev, Sinan Aral, and Sean Taylor. 2013. Social influence bias: A randomized experiment. Science 341.

Otterbacher, Jahna. 2009. 'Helpfulness' in online communities: a measure of message quality. Proceedings of CHI, 955-964.

Sipos, Ruben, Arpita Ghosh, and Thorsten Joachims. 2014. Was this review helpful to you? It depends! Context and voting patterns in online content. Proceedings of WWW.

Wang, R.Y. and D.M. Strong. 1996. Beyond accuracy: What data quality means to data consumers. Journal of Management Information Systems 12(4): 5--34.

Representative additional references on "unconventional" text classification, by popular demand

Davidov, Dmitry, Oren Tsur, and Ari Rappoport. 2010. Semi-supervised recognition of sarcastic sentences in Twitter and Amazon. Proceedings of the Fourteenth Conference on Computational Natural Language Learning, pp. 107--116.

Kiddon, Chloé and Yuriy Brun. 2011. That's what she said: Double entendre identification. Proceedings of the ACL (short papers), pp. 89--94.

Li, Jiwei, Myle Ott, Claire Cardie, and Eduard Hovy. 2014. Towards a general rule for identifying deceptive opinion spam. Proceedings of the ACL. The paper showing a learned classifier outperforming humans on Tripadvisor-style reviews is Ott, M, Y Choi, C Cardie, and J T Hancock. 2011. Finding deceptive opinion spam by any stretch of the imagination. Proceedings of the ACL, pp. 309--319.

Mihalcea, Rada and Carlo Strapparava. 2006. Learning to laugh (automatically): Computational models for humor recognition. Computational Intelligence 22(2).

#4 4

What do conversations "look" like?

Scan of lecture notes

Aside: email corpora


Backstrom, Lars, Jon Kleinberg, Lillian Lee, and Cristian Danescu-Niculescu-Mizil. 2013. Characterizing and curating conversation threads: Expansion, focus, volume, re-entry. Proceedings of WSDM, pp. 13–22. [alt link]

Elsner, Micha and Eugene Charniak. September 2010. Disentangling chat. Computational Linguistics 36(3): 389-409. [data and code]

Gonzalez-Bailon, Sandra, Andreas Kaltenbrunner, and Rafael E Banchs. 2010. The structure of political discussion networks: A model for the analysis of online deliberation. Journal of Information Technology 25(2): 230-243.

Kumar, Ravi, Mohammad Mahdian, and Mary McGlohon. 2010. Dynamics of conversations. Proceedings of KDD, pp. 553--562.

Nguyen, Viet-An, Jordan Boyd-Graber, Philip Resnik, Deborah A Cai, Jennifer E Midberry, and Yuanxin Wang. 2014. Modeling topic control to detect influence in conversations using nonparametric topic models. Machine Learning 95:381--421. [alt link]. [The talk slides we looked at in class]

Prabhakaran, Vinodkumar, Ashima Arora, and Owen Rambow. 2014. Power of confidence: How poll scores impact topic dynamics in political debates. ACL joint workshop on social dynamics and personal attributes.

Prabhakaran, Vinodkumar and Owen Rambow. 2014. Predicting power relations between participants in written dialog from a single thread. Proceedings of the ACL (short papers).

Seo, Jangwon, W. Bruce Croft, and David A. Smith. 2009. Online community search using thread structure. Proceedings of CIKM, pp. 1907--1910.

Siersdorfer, Stefan, Sergiu Chelaru, Jose San Pedro, Ismail Sengor Altingovde, and Wolfgang Nejdl. July 2014. Analyzing and mining comments and comment ratings on the social web. ACM Trans. Web 8 (3): 17:1-17:39. [alt link]

Wang, Yi-Chia, Mahesh Joshi, and Carolyn Penstein Rosé. 2008. Investigating the effect of discussion forum interface affordances on patterns of conversational interactions. Proceedings of CSCW, pp. 555–558.

#5 9

Checkpoints of A1 projects; Discourse phenomena: clues regarding structure

Image source: "The image is one for which Picasso did a number of variations in Paris during the autumn–winter of 1912; in each version, a tall bottle and goblet are set out on a small round table."

Scan of lecture notes and the handout

References related to the A1 project discussions


References from discourse lecture

Grice, H.P. 1975. Logic and Conversation. In Syntax and semantics 3: Speech Acts, pp. 41-58.

Jurafsky, Dan, and Martin, James H. 2009. Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition. Second edition. Chapter 21 covers discourse.

Moser, Megan and Johanna Moore. 1996. Toward a synthesis of two accounts of discourse structure. Computational Linguistics 22(3):409--419.

Rogers, Todd and Michael I Norton. June 2011. The artful dodger: Answering the wrong question the right way. Journal of Experimental Psychology: Applied 17 (2).

References for the examples on the handout:

Jordan Boyd-Graber Google+ post

Allen, James. 1995. Natural Language Understanding. Benjamin/Cummings Pub Co. Second ed.

Hirst, Graeme. 1981. Anaphora in Natural Language Understanding: A Survey. Lecture Notes in Computer Science. Springer, Berlin.

Sidner, Candace Lee. 1979. Towards a computational theory of definite anaphora comprehension in English discourse. MIT AITR-537.

Wilks, Yorick. 1975. An intelligent analyzer and understander of English. Communications of the ACM 18 (5): 264-274.


#6 11

Attention, intentions, and discourse structure: the Grosz and Sidner theory

Scan of lecture notes

Grosz, Barbara J., and Sidner, Candace L. 1986. Attention, intentions, and the structure of discourse. Computational Linguistics 12(3): 175-204.

Mann, William C., and Thompson, Sandra A. 1988. Rhetorical structure theory: Toward a functional theory of text organization. Text: Interdisciplinary Journal for the Study of Discourse 8, no. 3: 243-281.

Pinker, Steven and the Royal Society for the Encouragement of Arts, Manufactures and Commerce (RSA) Animate, posted to YouTube on Feb 10, 2011. Language as a Window into Human Nature

A2 out (deadline subsequently extended to Sept. 22)
#7 16 A1 presentations, part one  
#8 18 A1 presentations, part two  
#9 23 discuss A2  
#10 25    
#11 30    
#12 Oct 2    
#13 7    
#14 9    
Oct 14 Fall Break
#15 16    
#16 21    
#17 23    
#18 28    
#19 30    
#20 Nov 4    
#21 6    
#22 11    
#23 13    
#24 18    
#25 20    
#26 25    
Nov 27 Thanksgiving Break
#27 Dec 2    
#28 4    
Final-project due date, as determined by the registrar: December 11 at 4:30 pm

Code for generating the calendar above and css was (barely) adapted from the original versions created by Andrew Myers.