Here are papers using our Movie Review Data, listed in roughly chronological order by publication date.
  1. Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. Proceedings of EMNLP, 2002. Introduced polarity dataset v0.9.
  2. Kushal Dave, Steve Lawrence, and David M. Pennock. Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews. Proceedings of WWW, 2003.
  3. Stephen D. Durbin, J. Neal Richter, and Doug Warner. A System for Affective Rating of Texts. Proceedings of the KDD Workshop on Operational Text Classification Systems (OTC-3), 2003.
  4. Fuchun Peng. Language Independent Text Learning with Statistical n-gram Language Models. PhD thesis, University of Waterloo, 2003.
  5. Franco Salvetti, Stephen Lewis, and Christoph Reichenbach. Impact of Lexical Filtering on Overall Opinion Polarity Identification. Proceedings of the AAAI Symposium on Exploring Attitude and Affect in Text: Theories and Applications, 2004.
  6. Sreenivasa P. Sista and S. H. Srinivasan. Polarized Lexicon for Review Classification. Proceedings of the International Conference on Machine Learning; Models, Technologies & Applications, 2004.
  7. Philip Beineke, Trevor Hastie, and Shivakumar Vaithyanathan. The Sentimental Factor: Improving Review Classification via Human-Provided Information. Proceedings of the ACL, 2004.
  8. Bo Pang and Lillian Lee. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. Proceedings of the ACL, 2004. Introduced polarity dataset v2.0, subjectivity dataset v1.0.
  9. Tony Mullen and Nigel Collier. Sentiment Analysis using Support Vector Machines with Diverse Information Sources. Proceedings of EMNLP, 2004.
  10. Evgeniy Gabrilovich and Shaul Markovitch. Text Categorization with Many Redundant Features: Using Aggressive Feature Selection to Make SVMs Competitive with C4.5. The 21st International Conference on Machine Learning (ICML), pp. 321-328, Banff, Alberta, Canada, July 2004.
  11. Edoardo M. Airoldi, William W. Cohen, Stephen E. Fienberg. Bayesian methods for frequent terms in text: Models of contagion and the Delta square statistic. Proceedings of the CSNA & INTERFACE Annual Meetings (2005)
  12. Pimwadee Chaovalit and Lina Zhou. Movie Review Mining: a Comparison between Supervised and Unsupervised Classification Approaches. Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS), 2005.
  13. Casey Whitelaw, Navendu Garg, and Shlomo Argamon. Using Appraisal Taxonomies for Sentiment Analysis. Second Midwest Computational Linguistics Colloquium (MCLC 2005).
  14. Alistair Kennedy and Diana Inkpen. Sentiment Classification of Movie and Product Reviews Using Contextual Valence Shifters. Workshop on the Analysis of Informal and Formal Information Exchange during Negotiations (FINEXIN 2005).
  15. Shotaro Matsumoto, Hiroya Takamura, and Manabu Okumura. Sentiment Classification using Word Sub-Sequences and Dependency Sub-Trees. Proceedings of PAKDD, 2005.
  16. Jun Suzuki. Kernels for structured data in natural language processing. PhD Thesis, Nara Institute of Science and Technology, 2005.
  17. Jonathon Read. Using Emoticons to Reduce Dependency in Machine Learning Techniques for Sentiment Classification. ACL Student Research Workshop, 2005.
  18. Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. Proceedings of the ACL, 2005. Introduced scale dataset v1.0 and a positive/negative sentence collection.
  19. Anthony Aue and Michael Gamon. Customizing Sentiment Classifiers to New Domains: a Case Study. Proceedings of RANLP, 2005.
  20. Alekh Agarwal and Pushpak Bhattacharyya. Sentiment analysis: a new approach for effective use of linguistic knowledge and exploiting similarities in a set of documents to be classified. ICON 2005.
  21. Alekh Agarwal and Pushpak Bhattacharyya. Augmenting Wordnet with Polarity Information on Adjectives. 3rd Global Wordnet Conference, 2006.
  22. Edoardo M. Airoldi, Xue Bai, Rema Padman. Markov blankets and meta-heuristic search: Sentiment extraction from unstructured text. Lecture Notes in Computer Science. vol. 3932 (2006)
  23. Andrew B. Goldberg and Xiaojin Zhu. Seeing stars when there aren't many stars: Graph-based semi-supervised learning for sentiment categorization. HLT-NAACL 2006 Workshop on Textgraphs: Graph-based Algorithms for Natural Language Processing.
  24. Yi Mao and Guy Lebanon. Sequential models for sentiment prediction. ICML Workshop on Learning in Structured Output Spaces, 2006.
  25. Vincent Ng, Sajib Dasgupta, and S. M. Niaz Arifin. Examining the role of linguistic knowledge sources in the automatic identification and classification of reviews. Proceedings of the COLING/ACL Poster Sessions, 2006.
  26. Ellen Riloff, Siddharth Patwardhan, and Janyce Wiebe. Feature subsumption for opinion analysis. Proceedings of EMNLP, 2006.
  27. Arnd Christian König and Eric Brill. Reducing the Human Overhead in Text Categorization. Proceedings of KDD, 2006.
  28. Hui Yang, Luo Si, and Jamie Callan. Knowledge Transfer and Opinion Detection in the TREC2006 Blog Track. Text REtrieval Conference 2006 (TREC2006), Gaithersburg, MD, Nov 14-17 2006.
  29. Shlomo Argamon, Casey Whitelaw, Paul Chase, Shushant Dhawle, Sobhan Raj Hota, Navendu Garg, and Shlomo Levitan. Stylistic text classification using functional lexical features. JASIST, to appear.
  30. Yi Mao and Guy Lebanon. Isotonic Conditional Random Fields and Local Sentiment Flow. NIPS 2007.
  31. Surajit Chaudhuri, Kenneth Church, Arnd Christian König, Liying Sui. Heavy-Tailed Distributions and Multi-Keyword Queries. SIGIR 2007.
  32. Xiaojin Zhu and Andrew B. Goldberg. Kernel regression with order preferences. AAAI 2007.
  33. Kenneth Bloom, Navendu Garg, and Shlomo Argamon. Extracting Appraisal Expressions. NAACL HLT 2007, pp. 308-315.
  34. Deanna Osman, John Yearwood, and Peter Vamplew. Using corpus analysis to inform research into opinion detection in blogs. Australasian data mining conference, 2007.
  35. Kimitaka Tsutsumi, Kazutaka Shimada, and Tsutomu Endo. Movie Review Classification Based on a Multiple Classifier. PACLIC, pp. 481-488, 2007.
  36. Erik Boiy, Pieter Hens, Koen Deschacht, and Marie-Francine Moens. Automatic Sentiment Analysis in On-line Text. Openness in Digital Publishing: Awareness, Discovery and Access - Proceedings of the 11th International Conference on Electronic Publishing (ELPUB), 2007.
  37. Stephan Raaijmakers. Sentiment Classification with Interpolated Information Diffusion Kernels. Data Mining and Audience Intelligence for Advertising (ADKDD) 2007.
  38. Omar F. Zaidan, Jason Eisner, and Christine Piatko. Using “Annotator Rationales” to Improve Machine Learning for Text Categorization. NAACL HLT 2007.
  39. David Blei and Jon McAuliffe. Supervised topic models. NIPS 2007.
  40. Ben Allison. Sentiment Detection Using Lexically-Based Classifiers. Proceedings of TSD '08.
  41. Jake Bartlett and Russ Albright. Coming to a Theater Near You! Sentiment Classification Techniques Using SAS Text Miner. SAS Global Forum 2008.
  42. Bo Chen, Hui He, Jun Guo. Constructing Maximum Entropy Language Models for Movie Review Subjectivity Analysis. Journal of Computer Science and Technology 23(2): 231-239, 2008.
  43. Fangzhong Su and Katja Markert. From Words to Senses: a Case Study in Subjectivity Recognition. COLING 2008.
Data used in demos or distributed software:
  1. Bob Carpenter. Sentiment classification tutorial. http://alias-i.com/lingpipe/demos/tutorial/sentiment/read-me.html, 2005.
  2. Steven Bird, Ewan Klein, and Edward Loper. Natural Language Processing in Python. Forthcoming book (as of summer 2008)