Here are papers using our Movie Review Data, listed in roughly chronological order (will eventually be alphabetized within year).
  1. Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. Proceedings of EMNLP, 2002. Introduced polarity dataset v0.9.
  2. Kushal Dave, Steve Lawrence, and David M. Pennock. Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews. Proceedings of WWW, 2003.
  3. Stephen D. Durbin, J. Neal Richter, and Doug Warner. A System for Affective Rating of Texts. Proceedings of the KDD Workshop on Operational Text Classification Systems (OTC-3), 2003.
  4. Fuchun Peng. Language Independent Text Learning with Statistical n-gram Language Models. PhD thesis, University of Waterloo, 2003.
  5. Franco Salvetti, Stephen Lewis, and Christoph Reichenbach. Impact of Lexical Filtering on Overall Opinion Polarity Identification. Proceedings of the AAAI Symposium on Exploring Attitude and Affect in Text: Theories and Applications, 2004.
  6. Sreenivasa P. Sista and S. H. Srinivasan. Polarized Lexicon for Review Classification. Proceedings of the International Conference on Machine Learning; Models, Technologies & Applications, 2004.
  7. Philip Beineke, Trevor Hastie, and Shivakumar Vaithyanathan. The Sentimental Factor: Improving Review Classification via Human-Provided Information. Proceedings of the ACL, 2004.
  8. Bo Pang and Lillian Lee. A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. Proceedings of the ACL, 2004. Introduced polarity dataset v2.0, subjectivity dataset v1.0.
  9. Tony Mullen and Nigel Collier. Sentiment Analysis using Support Vector Machines with Diverse Information Sources. Proceedings of EMNLP, 2004.
  10. Evgeniy Gabrilovich and Shaul Markovitch. Text Categorization with Many Redundant Features: Using Aggressive Feature Selection to Make SVMs Competitive with C4.5. ICML 2004.
  11. Edoardo M. Airoldi, William W. Cohen, Stephen E. Fienberg. Bayesian methods for frequent terms in text: Models of contagion and the Delta square statistic. Proceedings of the CSNA & INTERFACE Annual Meetings (2005).
  12. Alekh Agarwal and Pushpak Bhattacharyya. Sentiment analysis: a new approach for effective use of linguistic knowledge and exploiting similarities in a set of documents to be classified. ICON 2005.
  13. Anthony Aue and Michael Gamon. Customizing Sentiment Classifiers to New Domains: a Case Study. Proceedings of RANLP, 2005.
  14. Pimwadee Chaovalit and Lina Zhou. Movie Review Mining: a Comparison between Supervised and Unsupervised Classification Approaches. Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS), 2005.
  15. Jeremy Fletcher and Jon Patrick. Evaluating the Utility of Appraisal Hierarchies as a Method for Sentiment Classification. Proceedings of the Australasian Language Technology Workshop, 2005.
  16. Alistair Kennedy and Diana Inkpen. Sentiment Classification of Movie and Product Reviews Using Contextual Valence Shifters. Workshop on the Analysis of Informal and Formal Information Exchange during Negotiations (FINEXIN 2005).
  17. Shotaro Matsumoto, Hiroya Takamura, and Manabu Okumura. Sentiment Classification using Word Sub-Sequences and Dependency Sub-Tree. Proceedings of PAKDD, 2005.
  18. Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. Proceedings of the ACL, 2005. Introduced scale dataset v1.0 and a positive/negative sentence collection.
  19. Jonathon Read. Using Emoticons to Reduce Dependency in Machine Learning Techniques for Sentiment Classification. ACL Student Research Workshop, 2005.
  20. Jun Suzuki. Kernels for structured data in natural language processing. PhD Thesis, Nara Institute of Science and Technology, 2005.
  21. Casey Whitelaw, Navendu Garg, and Shlomo Argamon. Using Appraisal Taxonomies for Sentiment Analysis. Second Midwest Computational Linguistic Colloquium (MCLC 2005).
  22. Alekh Agarwal and Pushpak Bhattacharyya. Augmenting Wordnet with Polarity Information on Adjectives. 3rd Global Wordnet Conference, 2006.
  23. Edoardo M. Airoldi, Xue Bai, Rema Padman. Markov blankets and meta-heuristic search: Sentiment extraction from unstructured text. Lecture Notes in Computer Science, vol. 3932 (2006).
  24. Andrew B. Goldberg and Xiaojin Zhu. Seeing stars when there aren't many stars: Graph-based semi-supervised learning for sentiment categorization. HLT-NAACL 2006 Workshop on Textgraphs: Graph-based Algorithms for Natural Language Processing.
  25. Yi Mao and Guy Lebanon. Sequential models for sentiment prediction. ICML Workshop on Learning in Structured Output Spaces, 2006.
  26. Vincent Ng, Sajib Dasgupta, and S. M. Niaz Arifin. Examining the role of linguistic knowledge sources in the automatic identification and classification of reviews. Proceedings of the COLING/ACL Poster Sessions, 2006.
  27. Ellen Riloff, Siddharth Patwardhan, and Janyce Wiebe. Feature subsumption for opinion analysis. Proceedings of EMNLP, 2006.
  28. Arnd Christian König and Eric Brill. Reducing the Human Overhead in Text Categorization. Proceedings of KDD, 2006.
  29. Hui Yang, Luo Si, and Jamie Callan. Knowledge Transfer and Opinion Detection in the TREC2006 Blog Track. Text REtrieval Conference 2006 (TREC2006), Gaithersburg, MD, Nov 14-17 2006.
  30. Mostafa Al Masum Shaikh, Helmut Prendinger and Mitsuru Ishizuka. An Analytical Approach to Assess Sentiment of Text. In Proc. (CD-ROM) Int'l Conf. on Computer and Information Technology (ICCIT 2007).
  31. Mostafa Al Masum Shaikh, Helmut Prendinger and Mitsuru Ishizuka. Assessing Sentiment of Text by Semantic Dependency and Contextual Valence Analysis. ACII 2007 (LNCS 4738).
  32. Shlomo Argamon, Casey Whitelaw, Paul Chase, Shushant Dhawle, Sobhan Raj Hota, Navendu Garg, and Shlomo Levitan. Stylistic text classification using functional lexical features. JASIST 58, 2007.
  33. David Blei and Jon McAuliffe. Supervised topic models. NIPS 2007.
  34. Kenneth Bloom, Navendu Garg, and Shlomo Argamon. Extracting Appraisal Expressions. NAACL HLT 2007, pp. 308--315.
  35. Erik Boiy, Pieter Hens, Koen Deschacht, Marie-Francine Moens. Automatic Sentiment Analysis in On-line Text. Openness in Digital Publishing: Awareness, Discovery and Access - Proceedings of the 11th International Conference on Electronic Publishing (ELPUB), 2007.
  36. Surajit Chaudhuri, Kenneth Church, Arnd Christian König, Liying Sui. Heavy-Tailed Distributions and Multi-Keyword Queries. SIGIR 2007.
  37. Shoushan Li, Chengqing Zong, and Xia Wang. Sentiment Classification through Combining Classifiers with Multiple Feature Sets. Natural Language Processing and Knowledge Engineering, 2007.
  38. Yi Mao and Guy Lebanon. Isotonic Conditional Random Fields and Local Sentiment Flow. NIPS 2007.
  39. Deanna Osman, John Yearwood, and Peter Vamplew. Using corpus analysis to inform research into opinion detection in blogs. Australasian data mining conference, 2007.
  40. Stephan Raaijmakers. Sentiment Classification with Interpolated Information Diffusion Kernels. Data Mining and Audience Intelligence for Advertising (ADKDD) 2007.
  41. Kimitaka Tsutsumi, Kazutaka Shimada, and Tsutomu Endo. Movie Review Classification Based on a Multiple Classifier. PACLIC, pp. 481-488, 2007.
  42. Kiduk Yang, Ning Yu, and Hui Zhang. WIDIT in TREC 2007 Blog Track: combining lexicon-based methods to detect opinionated blogs. TREC 2007.
  43. Omar F. Zaidan, Jason Eisner, and Christine Piatko. Using “Annotator Rationales” to Improve Machine Learning for Text Categorization. NAACL HLT 2007.
  44. Qi Zhang, Bingqing Wang, Lide Wu, Xuanjing Huang. FDU at TREC 2007: opinion retrieval of Blog Track. TREC 2007.
  45. Xiaojin Zhu and Andrew B. Goldberg. Kernel regression with order preferences. AAAI 2007.
  46. Ritesh Agarwal, T. V. Prabhakar, Sugato Chakrabarty. "I know what you feel": analyzing the role of conjunctions in automatic sentiment analysis. Proceedings of the 6th international conference on advances in natural language processing, 2008.
  47. Ben Allison. Sentiment Detection Using Lexically-Based Classifiers. Proceedings of TSD 2008.
  48. Alina Andreevskaia and Sabine Bergler. When specialists and generalists work together: Overcoming domain dependence in sentiment tagging. ACL 2008.
  49. Jake Bartlett and Russ Albright. Coming to a Theater Near You! Sentiment Classification Techniques Using SAS Text Miner. SAS Global Forum 2008.
  50. Bo Chen, Hui He, Jun Guo. Constructing Maximum Entropy Language Models for Movie Review Subjectivity Analysis. Journal of Computer Science and Technology 23(2): 231--239, 2008.
  51. Harb et al. Web Opinion Mining: How to extract opinions from blogs? International Conference on Soft Computing as Transdisciplinary Science and Technology, 2008.
  52. Daisuke Ikeda, Hiroya Takamura, Lev-Arie Ratinov, Manabu Okumura. Learning to Shift the Polarity of Words for Sentiment Classification. IJCNLP 2008.
  53. D. Li, A. Laurent, M. Roche and P. Poncelet. Extraction of Opposite Sentiments in Classified Free Format Text Reviews. In Proceedings of the 19th International Conference on Database and Expert Systems Applications (DEXA 08).
  54. Vikas Sindhwani and Prem Melville. Document-Word Co-Regularization for Semi-supervised Sentiment Analysis. Extended version of a paper at ICDM 2008.
  55. Fangzhong Su and Katja Markert. From Words to Senses: a Case Study in Subjectivity Recognition. COLING 2008.
  56. Pu Wang and Carlotta Domeniconi. Building semantic kernels for text classification using Wikipedia. KDD 2008.
  57. Omar F. Zaidan and Jason Eisner. Modeling Annotators: A Generative Approach to Learning from Annotator Rationales. EMNLP 2008.
  58. Yi Zhang, Arun Surendran, John Platt, and Mukund Narasimhan. Learning from multi-topic web documents for contextual advertisement. KDD 2008.
  59. Yi Mao and Guy Lebanon. Domain Knowledge Uncertainty and Probabilistic Parameter Constraints. Proc. of the 25th Conference on Uncertainty in Artificial Intelligence (UAI), 2009.
  60. Yi Mao and Guy Lebanon. Generalized isotonic conditional random fields. Machine Learning, 2009.
Data used in demos or distributed software:
  1. Bob Carpenter. Sentiment classification tutorial. http://alias-i.com/lingpipe/demos/tutorial/sentiment/read-me.html, 2005.
  2. Steven Bird, Ewan Klein, and Edward Loper. Natural Language Processing in Python. Forthcoming book (as of summer 2008)