The woman who wrote from Phoenix
after my reading there

to tell me they were all still talking about it

just wrote again
to tell me that they had stopped.

            — “Feedback”, Billy Collins, Horoscopes for the Dead

Judge not, lest ye be judged.  


Links are to paper “home pages” providing, when available, the paper itself in various formats, abstract, BibTeX entry, talk slides, code, data, informal explanations, etc.

For papers covered by ACM copyright, what is provided is my/my co-authors' version of the work, posted by permission of the ACM for your personal use, and not for redistribution or commercial use.

One could imagine idly speculating about "what is the citation statistic that makes people like oneself look the best?" (If you have never wondered this, you are better than I, Gunga Din.) One proposal: median citation count (taken over papers).
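For the curious, here is a minimal sketch of that proposal, assuming nothing more than a list of per-paper citation counts. The function name and the counts are hypothetical, made up purely for illustration, and are not anyone's real numbers.

```python
from statistics import mean, median


def median_citation_count(counts):
    """Return the median of per-paper citation counts (taken over papers)."""
    return median(counts)


# Hypothetical citation counts, one per paper (invented for illustration).
counts = [3, 8, 12, 15, 40, 2500]

# The median looks at the middle of the sorted list, so one runaway hit
# barely moves it; the mean, by contrast, is dominated by that one paper.
print("median:", median_citation_count(counts))
print("mean:  ", mean(counts))
```

Because the median depends only on the middle of the sorted list, it favors authors with a consistently cited body of work over those with a single blockbuster.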

Subjects, in sort order for the table below (subjects may not display in Internet Explorer 7 and below, but do in IE 8): general-audience papers | | analysis | | , inc. simplification | | , inc. paraphrasing and summarization | | | | reviews/pedagogy

Incidentally, my Erdős number is ≤ 3: L. Lee → N. Tishby → N. Linial → P. Erdős. And my Molotov cocktail number is ≤ 4: L. Lee → J. Kleinberg → C. Rackoff → CR's father-in-law → V. Molotov.

Pub year & venue
Authors & title
2017 WWW Liye Fu, Lillian Lee, and Cristian Danescu-Niculescu-Mizil
(C. D-N-M is the corresponding senior author)

When confidence and competence collide: Effects on online decision-making discussions
2017 WWW Jack Hessel, Lillian Lee, and David Mimno
Cats and captions vs. user characteristics and the clock: A time-controlled analysis of multimodal content
2016 NLP+journalism IJCAI wksp
(best paper award)
Liye Fu, Cristian Danescu-Niculescu-Mizil, and Lillian Lee
Tie-breaker: Using language models to quantify gender bias in sports journalism
2016 WWW Chenhao Tan, Vlad Niculae, Cristian Danescu-Niculescu-Mizil, and Lillian Lee
Winning arguments: Interaction dynamics and persuasion strategies in good-faith online discussions.
(Analyzes the Change My View subreddit)
2016 WWW Isabel Kloumann, Chenhao Tan, Jon Kleinberg, and Lillian Lee
Internet collaboration on extremely difficult problems: Research versus Olympiad questions on the Polymath site.
2016 ICWSM Jack Hessel, Chenhao Tan, and Lillian Lee
Science, AskScience, and BadScience: On the coexistence of highly related communities
2016 presentation at Text as Data/arxiv 1612.06391 Chenhao Tan and Lillian Lee
Talk it up or play it down? (Un)expected correlations between (de-)emphasis and recurrence of discussion points in consequential U.S. economic policy meetings.
2015 WWW Chenhao Tan and Lillian Lee
All Who Wander: On the Prevalence and Characteristics of Multi-community Engagement
2015 NIPS soc+info networks workshop (essentially unrefereed) Jack Hessel, Alexandra Schofield, Lillian Lee, and David Mimno
What do Vegans do in their Spare Time? Latent Interest Detection in Multi-Community Networks
2014 ACL Chenhao Tan, Lillian Lee and Bo Pang
The effect of wording on message propagation: Topic- and author-controlled natural experiments on Twitter
2014 ACL short Chenhao Tan and Lillian Lee
A corpus of sentence-level revisions in academic writing: A step towards understanding statement strength in communication
2013 WSDM (plenary; acceptance rate: 19%) Lars Backstrom, Jon Kleinberg, Lillian Lee, Cristian Danescu-Niculescu-Mizil
Characterizing and curating conversation threads: Expansion, focus, volume, re-entry
2012 ACL Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon Kleinberg, Lillian Lee
You had me at hello: How phrasing affects memorability
2012 WWW Cristian Danescu-Niculescu-Mizil, Lillian Lee, Bo Pang, Jon Kleinberg
Echoes of power: Language effects and power differences in social interaction
2012 Extra-propositional aspects of meaning wksp Eunsol Choi, Chenhao Tan, Lillian Lee, Cristian Danescu-Niculescu-Mizil, Jennifer Spindel
Hedge detection as a lens on framing in the GMO debates: A position paper
2011 KDD poster
(aggregate oral + poster acceptance rate: 17.5%)
Chenhao Tan, Lillian Lee, Jie Tang, Long Jiang, Ming Zhou, Ping Li
User-level sentiment analysis incorporating social networks
2011 Cognitive Modeling wksp Cristian Danescu-Niculescu-Mizil, Lillian Lee
Chameleons in imagined conversations: A new approach to understanding coordination of linguistic style in dialogs.
2010 ACL short Cristian Danescu-Niculescu-Mizil, Lillian Lee
Don't 'have a clue'? Unsupervised co-learning of downward-entailing operators
2010 NAACL short Mark Yatskar, Bo Pang, Cristian Danescu-Niculescu-Mizil, Lillian Lee
For the sake of simplicity: Unsupervised extraction of lexical simplifications from Wikipedia
2010 TOIS/2005 SIGIR Oren Kurland, Lillian Lee
PageRank without hyperlinks: Structural re-ranking using links induced by language models.
reviews/pedagogy 2010 CRA-E report Lillian Lee
CS Department curriculum reform: "Vectors"
2009 WWW Cristian Danescu-Niculescu-Mizil, Gueorgi Kossinets, Jon Kleinberg, Lillian Lee
How opinions are received by online communities: A case study on helpfulness votes.
2009 NAACL Cristian Danescu-Niculescu-Mizil, Lillian Lee, Rick Ducott
Without a ‘doubt’? Unsupervised discovery of downward-entailing operators.
2009 TOIS/2004 SIGIR Oren Kurland, Lillian Lee
Corpus structure, language models, and ad hoc information retrieval.
2008 book Bo Pang, Lillian Lee
Opinion mining and sentiment analysis.
2008 COLING poster Bo Pang, Lillian Lee
Using very simple statistics for review search: An exploration.
2008 COLING poster Mohit Bansal, Claire Cardie, Lillian Lee
The power of negative thinking: Exploiting label disagreement in the min-cut classification framework.
reviews/pedagogy 2008 AAAI educ. symp. Eric Breck, David Easley, K.-Y. Daisy Fan, Jon Kleinberg, Lillian Lee, Jennifer Wofford, Ramin Zabih
A new start: Innovative introductory AI-centered courses at Cornell.
2007 SIGIR poster Lillian Lee
IDF revisited: A simple new derivation within the Robertson-Spärck Jones probabilistic model
2006 SIGIR Oren Kurland, Lillian Lee
Respect my authority! HITS without hyperlinks, utilizing cluster-based language models.
2006 EMNLP Matt Thomas, Bo Pang, Lillian Lee
Get out the vote: Determining support or opposition from Congressional floor-debate transcripts.
2005 ACL Bo Pang, Lillian Lee
Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales.
2005 SIGIR Oren Kurland, Lillian Lee, Carmel Domshlak
Better than the real thing? Iterative pseudo-query processing using cluster-based language models.
2004 NAACL
(best paper award)
Regina Barzilay, Lillian Lee
Catching the drift: Probabilistic content models, with applications to generation and summarization.
2004 ACL Bo Pang, Lillian Lee
A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts.
2004 Nat'l Academies Lillian Lee
"I'm sorry Dave, I'm afraid I can't do that": Linguistics, statistics, and natural language processing circa 2001.
2004 IBM summit Lillian Lee
A matter of opinion: Sentiment analysis and business intelligence (position paper).
2003 NAACL Regina Barzilay, Lillian Lee
Learning to paraphrase: An unsupervised approach using multiple-sequence alignment.
2002 JACM/1997 ACL Lillian Lee
Fast Context-Free Grammar Parsing Requires Fast Boolean Matrix Multiplication.
2002 EMNLP Bo Pang, Lillian Lee, Shivakumar Vaithyanathan
Thumbs up? Sentiment classification using machine learning techniques.
2002 EMNLP
(nominated for best paper)
Regina Barzilay, Lillian Lee
Bootstrapping lexical choice via multiple-sequence alignment.
reviews/pedagogy 2002 TeachNLP wksp Lillian Lee
A non-programming introduction to computer science via NLP, IR, and AI
2001 SIGIR
(nominated for best paper)
Rie Kubota Ando, Lillian Lee
Iterative residual rescaling: An analysis and generalization of LSI.
2001 AISTATS Lillian Lee
On the effectiveness of the skew divergence for statistical language analysis
2000 NAACL/2003 JNLE Rie Kubota Ando, Lillian Lee
Mostly-unsupervised statistical segmentation of Japanese: Applications to kanji
reviews/pedagogy 2000 CL review Lillian Lee
[review of] Foundations of Statistical Natural Language Processing by Christopher D. Manning and Hinrich Schütze
1999 ACL Lillian Lee
Measures of Distributional Similarity
1999 ACL Lillian Lee, Fernando Pereira
Distributional similarity models: Clustering vs. nearest neighbors
1999 MLJ Ido Dagan, Lillian Lee, Fernando Pereira
Similarity-based models of word cooccurrence probabilities
1997 ACL Ido Dagan, Lillian Lee, Fernando Pereira
Similarity-based methods for word sense disambiguation
1997 thesis Lillian Lee
Similarity-based approaches to natural language processing
1996 techrpt Lillian Lee
Learning of context-free languages: A survey of the literature.
1994 ACL Ido Dagan, Fernando Pereira, Lillian Lee
Similarity-based estimation of word cooccurrence probabilities
1993 ACL Fernando Pereira, Naftali Tishby, Lillian Lee
Distributional clustering of English words

[*]It should be pointed out that Latour continues: “However, stacking masses of reference is not enough to become strong if you are confronted with a bold opponent. On the contrary, it might be a source of weakness. If you explicitly point out the papers you attach yourself to, it is then possible for the reader — if there still are any readers — to trace each reference and to probe its degree of attachment to your claim.” (emphasis added)

The work described in the publications above was supported in part by the National Science Foundation under several grants (to see which grants supported a particular paper, please consult the acknowledgments of that publication). Any opinions, findings, and conclusions or recommendations expressed above are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
