Dean E. Murphy: ... your volume of work is rather modest. Why do you not write more?
Wisława Szymborska: ... You see, I also have this wastebasket.

            — L.A. Times interview, translated from the Polish by Ela Kasprzycka.
                Thanks to Reanna Esmail, Cornell reference assistant, for tracking down the source.


The woman who wrote from Phoenix
after my reading there

to tell me they were all still talking about it

just wrote again
to tell me that they had stopped.

            — “Feedback”, Billy Collins, Horoscopes for the Dead


Links are to paper “home pages” providing, when available, the paper itself in various formats, abstract, BibTeX entry, talk slides, code, data, informal explanations, etc. For papers covered by ACM copyright, what is provided is my/my co-authors' version of the work, posted by permission of the ACM for your personal use, and not for redistribution or commercial use.

One could imagine idly speculating, "What is the citation statistic that makes people like oneself look the best?" (If you have never wondered this, you are a better person than I am.) One proposal: median citation count per paper; ACM, notably, reports the average citations per article.

Subjects, in sort order for table below: | analysis | | | | | , inc. simplification| | , inc. paraphrasing and summarization | | | general-audience papers | reviews/pedagogy

Incidentally, my Erdős number is ≤ 3: L. Lee → N. Tishby → N. Linial → P. Erdős. And my Molotov cocktail number is ≤ 4: L. Lee → J. Kleinberg → C. Rackoff → CR's father-in-law → V. Molotov.

Pub year & venue
Authors & title
Arguably lesser-known favorites
2023 ACL
(best paper award)
Jack Hessel, Ana Marasović, Jena D. Hwang, Lillian Lee, Jeff Da, Rowan Zellers, Robert Mankoff, Yejin Choi
Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest
[ACL Anthology, github, pre-conference 6-min talk, best-paper award talk (Jack starts speaking at about 15:06)]
2022 SIGHUM Workshop Ana Smith, Lillian Lee
War and Pieces: Comparing Perspectives About World War I and II Across Wikipedia Language Communities
2021 ACL Tianze Shi, Lillian Lee
Transition-based bubble parsing: Improvements on coordination structure prediction
2021 NAACL Tianze Shi, Ozan Irsoy, Igor Malioutov, Lillian Lee
Learning syntax from naturally-occurring bracketings
2021 IWPT Shared Task
(Top system overall and top on 16 of 17 languages)
Tianze Shi, Lillian Lee
TGIF: Tree-Graph Integrated-Format Parser for Enhanced UD with Two-Stage Generic- to Individual-Language Finetuning
2021 SocialNLP Karen Zhou, Ana Smith, Lillian Lee
Assessing cognitive linguistic influences in the assignment of blame
2020 EMNLP Jack Hessel and Lillian Lee
Does my multimodal model learn cross-modal interactions? It's harder to tell than you might think!
2020 Findings of ACL: EMNLP Tianze Shi, Chen Zhao, Jordan Boyd-Graber, Hal Daumé, and Lillian Lee
On the potential of lexico-logical alignments for semantic parsing to SQL queries
2020 ACL Tianze Shi, Lillian Lee
Extracting headless MWEs from dependency parse trees: parsing, tagging, and joint modeling approaches
[ACL anthology site, including talk video]
2019 CSCW Kumar Bhargav Srinivasan, Cristian Danescu-Niculescu-Mizil, Lillian Lee, Chenhao Tan
Content removal as a moderation strategy: Compliance and other outcomes in the ChangeMyView community
2019 EMNLP Jack Hessel, Lillian Lee, and David Mimno
Unsupervised discovery of multimodal links in multi-image, multi-sentence documents
2019 NAACL Jack Hessel, Lillian Lee
Something's brewing! Early prediction of controversy-causing posts from discussion features
2018 EMNLP Tianze Shi, Lillian Lee
Valency-augmented dependency parsing
[ACL anthology site, including talk video, although due to travel issues the talk was presented by Xiang Yu]
2018 ACL Carlos Gómez-Rodríguez, Tianze Shi, Lillian Lee
Global transition-based non-projective dependency parsing
2018 NAACL Jack Hessel, David Mimno, Lillian Lee
Quantifying the visual concreteness of words and topics in multimodal datasets
2018 NAACL short Tianze Shi, Carlos Gómez-Rodríguez, Lillian Lee
Improving coverage and runtime complexity for exact inference in non-projective transition-based dependency parsers
2017 EMNLP Tianze Shi, Liang Huang, Lillian Lee
Fast(er) exact decoding and global training for transition-based dependency parsing via a minimal feature set
[ACL anthology site, including talk video]
2017 WWW Liye Fu, Lillian Lee, Cristian Danescu-Niculescu-Mizil
(C. D-N-M is the corresponding senior author)
When confidence and competence collide: Effects on online decision-making discussions
2017 WWW Jack Hessel, Lillian Lee, David Mimno
Cats and captions vs. creators and the clock: Comparing multimodal content to context in predicting relative popularity
2016 NLP+journalism IJCAI wksp
(best paper award)
Liye Fu, Cristian Danescu-Niculescu-Mizil, Lillian Lee
Tie-breaker: Using language models to quantify gender bias in sports journalism
2016 WWW Chenhao Tan, Vlad Niculae, Cristian Danescu-Niculescu-Mizil, Lillian Lee
Winning arguments: Interaction dynamics and persuasion strategies in good-faith online discussions.
(Analyzes the Change My View subreddit)
2016 WWW Isabel Kloumann, Chenhao Tan, Jon Kleinberg, Lillian Lee
Internet collaboration on extremely difficult problems: Research versus Olympiad questions on the Polymath site.
2016 ICWSM Jack Hessel, Chenhao Tan, Lillian Lee
Science, AskScience, and BadScience: On the coexistence of highly related communities
2016 presentation at Text as Data/arXiv 1612.06391 Chenhao Tan, Lillian Lee
Talk it up or play it down? (Un)expected correlations between (de-)emphasis and recurrence of discussion points in consequential U.S. economic policy meetings.
2015 WWW Chenhao Tan, Lillian Lee
All Who Wander: On the Prevalence and Characteristics of Multi-community Engagement
2015 NIPS soc+info networks workshop Jack Hessel, Alexandra Schofield, Lillian Lee, David Mimno
What do Vegans do in their Spare Time? Latent Interest Detection in Multi-Community Networks
2014 ACL Chenhao Tan, Lillian Lee, Bo Pang
The effect of wording on message propagation: Topic- and author-controlled natural experiments on Twitter
2014 ACL short Chenhao Tan, Lillian Lee
A corpus of sentence-level revisions in academic writing: A step towards understanding statement strength in communication
2013 WSDM Lars Backstrom, Jon Kleinberg, Lillian Lee, Cristian Danescu-Niculescu-Mizil
Characterizing and curating conversation threads: Expansion, focus, volume, re-entry
2012 ACL Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon Kleinberg, Lillian Lee
You had me at hello: How phrasing affects memorability
2012 WWW Cristian Danescu-Niculescu-Mizil, Lillian Lee, Bo Pang, Jon Kleinberg
Echoes of power: Language effects and power differences in social interaction
2012 Extra-propositional aspects of meaning wksp Eunsol Choi, Chenhao Tan, Lillian Lee, Cristian Danescu-Niculescu-Mizil, Jennifer Spindel
Hedge detection as a lens on framing in the GMO debates: A position paper
2011 KDD poster Chenhao Tan, Lillian Lee, Jie Tang, Long Jiang, Ming Zhou, Ping Li
User-level sentiment analysis incorporating social networks
2011 Cognitive Modeling wksp Cristian Danescu-Niculescu-Mizil, Lillian Lee
Chameleons in imagined conversations: A new approach to understanding coordination of linguistic style in dialogs.
2010 ACL short Cristian Danescu-Niculescu-Mizil, Lillian Lee
Don't 'have a clue'? Unsupervised co-learning of downward-entailing operators
2010 NAACL short Mark Yatskar, Bo Pang, Cristian Danescu-Niculescu-Mizil, Lillian Lee
For the sake of simplicity: Unsupervised extraction of lexical simplifications from Wikipedia
2010 TOIS/2005 SIGIR Oren Kurland, Lillian Lee
PageRank without hyperlinks: Structural re-ranking using links induced by language models.
reviews/pedagogy 2010 CRA-E report Lillian Lee
CS Department curriculum reform: "Vectors"
2009 WWW Cristian Danescu-Niculescu-Mizil, Gueorgi Kossinets, Jon Kleinberg, Lillian Lee
How opinions are received by online communities: A case study on helpfulness votes.
2009 NAACL Cristian Danescu-Niculescu-Mizil, Lillian Lee, Rick Ducott
Without a ‘doubt’? Unsupervised discovery of downward-entailing operators.
2009 TOIS/2004 SIGIR Oren Kurland, Lillian Lee
Corpus structure, language models, and ad hoc information retrieval.
2008 book Bo Pang, Lillian Lee
Opinion mining and sentiment analysis.
2008 COLING poster Bo Pang, Lillian Lee
Using very simple statistics for review search: An exploration.
2008 COLING poster Mohit Bansal, Claire Cardie, Lillian Lee
The power of negative thinking: Exploiting label disagreement in the min-cut classification framework.
reviews/pedagogy 2008 AAAI educ. symp. Eric Breck, David Easley, K.-Y. Daisy Fan, Jon Kleinberg, Lillian Lee, Jennifer Wofford, Ramin Zabih
A new start: Innovative introductory AI-centered courses at Cornell.
2007 SIGIR poster Lillian Lee
IDF revisited: A simple new derivation within the Robertson-Spärck Jones probabilistic model
2006 SIGIR Oren Kurland, Lillian Lee
Respect my authority! HITS without hyperlinks, utilizing cluster-based language models.
2006 EMNLP (Listed by Paper Digest in Feb 2021 as the #2 most influential paper of that conference) Matt Thomas, Bo Pang, Lillian Lee
Get out the vote: Determining support or opposition from Congressional floor-debate transcripts.
2005 ACL (Listed by Paper Digest in Feb 2021 as the #3 most influential paper of that conference) Bo Pang, Lillian Lee
Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales.
2005 SIGIR Oren Kurland, Lillian Lee, Carmel Domshlak
Better than the real thing? Iterative pseudo-query processing using cluster-based language models.
2004 NAACL
(Best paper award. Listed by Paper Digest in Feb 2021 as the #4 most influential paper of that conference)
Regina Barzilay, Lillian Lee
Catching the drift: Probabilistic content models, with applications to generation and summarization.
2004 ACL (Listed by Paper Digest in Feb 2021 as the #1 most influential paper of that conference) Bo Pang, Lillian Lee
A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts.
2004 Nat'l Academies Lillian Lee
"I'm sorry Dave, I'm afraid I can't do that": Linguistics, statistics, and natural language processing circa 2001.
2004 IBM summit Lillian Lee
A matter of opinion: Sentiment analysis and business intelligence (position paper).
2003 NAACL (Listed by Paper Digest in Feb 2021 as the #5 most influential paper of that conference) Regina Barzilay, Lillian Lee
Learning to paraphrase: An unsupervised approach using multiple-sequence alignment.
2002 JACM/1997 ACL Lillian Lee
Fast Context-Free Grammar Parsing Requires Fast Boolean Matrix Multiplication.
2002 EMNLP (ACL test-of-time award for all *ACL 2002-2012 conferences. Listed by Paper Digest in Feb 2021 as the #1 most influential paper of that conference) Bo Pang, Lillian Lee, Shivakumar Vaithyanathan
Thumbs up? Sentiment classification using machine learning techniques.
2002 EMNLP
(nominated for best paper)
Regina Barzilay, Lillian Lee
Bootstrapping lexical choice via multiple-sequence alignment.
reviews/pedagogy 2002 TeachNLP wksp Lillian Lee
A non-programming introduction to computer science via NLP, IR, and AI
2001 SIGIR
(nominated for best paper)
Rie Kubota Ando, Lillian Lee
Iterative residual rescaling: An analysis and generalization of LSI.
2001 AISTATS Lillian Lee
On the effectiveness of the skew divergence for statistical language analysis
2000 NAACL/2003 JNLE Rie Kubota Ando, Lillian Lee
Mostly-unsupervised statistical segmentation of Japanese: Applications to kanji
reviews/pedagogy 2000 CL review Lillian Lee
[review of] Foundations of Statistical Natural Language Processing by Christopher D. Manning and Hinrich Schütze
1999 ACL Lillian Lee
Measures of Distributional Similarity
1999 ACL Lillian Lee, Fernando Pereira
Distributional similarity models: Clustering vs. nearest neighbors
1999 MLJ Ido Dagan, Lillian Lee, Fernando Pereira
Similarity-based models of word cooccurrence probabilities
1997 ACL Ido Dagan, Lillian Lee, Fernando Pereira
Similarity-based methods for word sense disambiguation
1997 thesis Lillian Lee
Similarity-based approaches to natural language processing
1996 techrpt Lillian Lee
Learning of context-free languages: A survey of the literature.
1994 ACL Ido Dagan, Fernando Pereira, Lillian Lee
Similarity-based estimation of word cooccurrence probabilities
1993 ACL Fernando Pereira, Naftali Tishby, Lillian Lee
Distributional clustering of English words

[*] It should be pointed out that Latour continues: “However, stacking masses of reference is not enough to become strong if you are confronted with a bold opponent. On the contrary, it might be a source of weakness. If you explicitly point out the papers you attach yourself to, it is then possible for the reader — if there still are any readers — to trace each reference and to probe its degree of attachment to your claim.” (emphasis added)

The work described in the publications above was supported in part by the National Science Foundation under several grants (to see which grants supported a particular paper, please consult the acknowledgments of that publication). Any opinions, findings, and conclusions or recommendations expressed above are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Lillian Lee's home page
Lillian Lee's research summary
Cornell Natural Language Processing (NLP) group.