"jhessel@"+".".join(["cs", "cornell", "edu"])
--- Office hours: M 4:30-6:30pm Gates G17
PhD TA Xanda Schofield "xanda@"+".".join(["cs", "cornell", "edu"])
--- Office hours: W 2:30-4:30pm Gates G21
Good Python programming skills and familiarity with IPython Notebooks, of which we’ll make extensive use.
Date | Lecture | Agenda | Assignments | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Thu, Jan 28, 2016 |
#1 |
Lecture topics: Intro: Dimensions of Information Systems; Conversational Behavior and Social information. Related material: NPR story: Before The Internet, Librarians Would 'Answer Everything' — And Still Do. References: Cristian Danescu-Niculescu-Mizil, Lillian Lee, Bo Pang and Jon Kleinberg. Echoes of power: Language effects and power differences in social interaction. Cristian Danescu-Niculescu-Mizil, Michael Gamon and Susan Dumais. Mark my words! Linguistic style accommodation in social media. Proceedings of WWW, 2011. Kate G. Niederhoffer and James W. Pennebaker. Linguistic Style Matching in Social Interaction. Journal of Language and Social Psychology 2002 21: 337. |
Assignment 1 out [Description, ZIP]. First part due on Thursday, Feb 4 at noon. Second part due on Thursday, Feb 11 at noon. You might want this pickle |
||||||||
Tue, Feb 2, 2016 |
#2 |
Lecture topics: Text similarity measures: Minimum Edit Distance; Edit Distance worksheet (includes a sketch of the Wagner-Fischer algorithm we used in class). Related material: Readings: J&M Chapter 3.11 |
|||||||||
Th, Feb 4, 2016 |
#3 |
No lecture |
Second part of Assignment 1 due on Thursday Feb 11 at noon. You might want this pickle |
||||||||
Tu, Feb 9, 2016 |
#4 |
Basic text processing concepts: Sentence Splitting, Word Tokenization, Types, Tokens; Text similarity measures: Overlap, Jaccard similarity; Classic (ad hoc) information retrieval systems; Vector space model: binary representation. In-class demo: Proto Information Retrieval System: IPython notebook and html; Vector space model cheatsheet (useful to keep track of notation). Related material: Readings: J&M Chapters 3.8 and 23.1.1 |
|||||||||
Th, Feb 11, 2016 |
#5 |
Lecture topics: Vector Space Model: geometric intuition; Cosine similarity; Inverse document frequency (IDF); TF-IDF weighting; Pivot length normalization. In-class demo: (continued and updated) IPython notebook and html. Readings: MRS Chapters 6.2, 6.3, 6.4.1 and 6.4.4 |
|||||||||
Th, Feb 18, 2016 |
#6 |
Lecture topics: Assignment 1 discussion; Inverted Index; Posting merge algorithm; Boolean search. In-class demo: (continued and updated) IPython notebook and html. Related material: Readings: MRS Chapter 1 |
Assignment 2 out [Description, ZIP]. Due: Wednesday, March 2, 11:59pm |
||||||||
Tu, Feb 23, 2016 |
#7 |
Lecture topics: Efficient cosine similarity scoring using the inverted index; Fast cosine retrieval worksheet (includes a sketch of the algorithm using the inverted index). In-class demo: (continued and updated) IPython notebook and html. Figures: before and after optimizing retrieval with inverted indexes (one query on a collection of 40,000 reality TV utterances). Readings: MRS Chapter 6.3.3 |
|||||||||
Th, Feb 25, 2016 |
#8 |
Lecture topics: Evaluation of ranked retrieval systems: Precision@k, Precision-recall curve; Search at Facebook (guest speaker Ves Stoyanov); Thinking about evaluation metrics worksheet. In-class demo: IPython notebook and html. Readings: MRS Chapter 8 |
|||||||||
Tu, Mar 1, 2016 |
#9 |
Lecture topics: Evaluation of ranked retrieval systems: Mean Average Precision, Pooling, Kappa statistic; Relevance feedback. In-class demo: IPython notebook and html. Related material: Readings: MRS Chapters 8, 9.1 |
|||||||||
Th, Mar 3, 2016 |
#10 |
Lecture topics: Rocchio's method for query rewriting, Pseudo-relevance feedback; Query expansion, Co-occurrence matrix; Query update using relevance feedback worksheet (includes the Rocchio query update rule). Readings: MRS Chapter 9 |
Assignment 3 out [Description, ZIP]. Due: Wednesday, March 9, 11:59pm. Midterm date: March 15, during class time |
||||||||
Tu, Mar 8, 2016 |
#11 |
Lecture topics: Term-document matrix recap, Pointwise Mutual Information. Readings: MRS Chapter 9 |
|||||||||
Th, Mar 10, 2016 |
#12 |
Lecture topics: Wrapping up ad hoc IR |
|||||||||
Tu, Mar 15, 2016 |
#13 |
Lecture topics: In-class midterm |
|||||||||
Th, Mar 17, 2016 |
#14 |
Lecture topics: Midterm discussion; Project discussion |
Due date 1 (Piazza): Monday, March 21 at midnight. Due date 2 (CMS): Thursday, March 23 at midnight |
||||||||
Tu, Mar 22, 2016 |
#15 |
Lecture topics: Text Mining |
|||||||||
Th, Mar 24, 2016 |
#16 |
Lecture topics: Practical text mining (by Xanda Schofield) |
|||||||||
Tu, Apr 5, 2016 |
#17 |
Lecture topics: Text mining, Naive Bayes, generative models |
|||||||||
Th, Apr 7, 2016 |
#18 |
Lecture topics: One-on-one project meetings |
|||||||||
Tu, Apr 12, 2016 |
#19 |
Lecture topics: Feature selection, Conversational Features, Ordinal ranking |
|||||||||
Th, Apr 14, 2016 |
#20 |
Lecture topics: Practical unsupervised learning on textual data: SVD (by Jack Hessel). In-class demo: IPython notebooks: Data exploration, LSI |
|||||||||
Tu, Apr 19, 2016 |
#21 |
Lecture topics: Practical unsupervised learning on textual data: Latent semantic indexing and topic modeling (by Jack Hessel). In-class demo: IPython notebooks: Kickstarter success prediction. Related material: Indexing by latent semantic analysis. Deerwester, Dumais and Harshman 1990 |
|||||||||
Th, Apr 21, 2016 |
#22 |
Lecture topics: Opinions and Trust: Link Analysis, Hubs and Authorities, Spectral Analysis. Related material: NetworkX Python package for link analysis. Reading: |
|||||||||
Tu, Apr 26, 2016 |
#23 |
Project Prototype Madness (project links posted on Piazza): fun, fun, fun! |
|||||||||
Th, Apr 28, 2016 |
#24 |
Opinions and Trust: Sentiment Analysis, Lexicon Expansion. Related material: |
On Piazza, April 23 |
||||||||
Tu, May 3, 2016 |
#25 |
Project presentations |
|||||||||
Th, May 5, 2016 |
#26 |
Project presentations |
|||||||||
Tu, May 10, 2016 |
#27 |
Lecture topics: Opinions and Trust: Review Helpfulness, Social Aspects of Helpfulness Evaluation, Deception Analysis; Course wrap-up. Related material: Fin. |