Movie Review Data

This page distributes movie-review data for use in sentiment-analysis experiments. It provides collections of movie-review documents labeled by overall sentiment polarity (positive or negative) or subjective rating (e.g., "two and a half stars"), and sentences labeled by subjectivity status (subjective or objective) or polarity. These data sets were introduced in the following papers:

If you have results to report on these corpora, please send email to Bo Pang and/or Lillian Lee so we can add you to our list of other papers using this data. Thanks!
Rationale: we're (admittedly haphazardly and only occasionally) maintaining that list to facilitate comparison of results.



Please cite the version number of the dataset you used in any publications, in order to facilitate comparison of results. Thank you.

Sentiment polarity datasets

Sentiment scale datasets

Subjectivity datasets


The creation of this website is based upon work supported in part by the National Science Foundation (NSF) under grant no. ITR/IM IIS-0081334, IIS-0329064, CCR-0122581, and BES-0329549; SRI International under subcontract no. 03-000211 on their project funded by the Department of the Interior, National Business Center; a Cornell Graduate Fellowship in Cognitive Studies; and by an Alfred P. Sloan Research Fellowship. Any opinions, findings, and conclusions or recommendations expressed above are those of the authors and do not necessarily reflect the views of the National Science Foundation or Sloan Foundation and should not be interpreted as representing the official policies, either expressed or implied, of any sponsoring institution, the U.S. government or any other entity.

If you have any questions or comments regarding this site, please send email to Bo Pang or Lillian Lee.


