Thumbs up? Sentiment classification using machine learning techniques
Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan
Proceedings of EMNLP, pp. 79--86, 2002

We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three machine learning methods we employed (Naive Bayes, maximum entropy classification, and support vector machines) do not perform as well on sentiment classification as on traditional topic-based categorization. We conclude by examining factors that make the sentiment classification problem more challenging.
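As a rough illustration of the first of the three methods named above, the following is a minimal sketch of a Naive Bayes sentiment classifier over bag-of-words features. It is not the authors' implementation: the toy reviews, the function names `train_naive_bayes` and `classify`, and the choice of add-one smoothing are all assumptions made for this example.

```python
import math
from collections import Counter

def train_naive_bayes(docs):
    """Collect per-class document counts and word counts.

    docs: list of (tokens, label) pairs.
    Returns (label_counts, word_counts, vocab).
    """
    label_counts = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in label_counts}
    for tokens, label in docs:
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return label_counts, word_counts, vocab

def classify(tokens, model):
    """Return the label with the highest log posterior for the tokens."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior: fraction of training documents with this label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in tokens:
            if w in vocab:  # words unseen in training are skipped
                # Add-one smoothing (an assumption of this sketch).
                score += math.log(
                    (word_counts[label][w] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data, invented for illustration; not from the paper's corpus.
TOY_REVIEWS = [
    ("a brilliant and moving film".split(), "pos"),
    ("great acting and a great story".split(), "pos"),
    ("a dull and boring plot".split(), "neg"),
    ("terrible acting and a boring script".split(), "neg"),
]

model = train_naive_bayes(TOY_REVIEWS)
print(classify("a great and moving story".split(), model))
print(classify("a dull and boring script".split(), model))
```

With only four training sentences this says nothing about real accuracy; it only shows the shape of the computation, namely a class prior combined with smoothed per-word likelihoods.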

@inproceedings{Pang+Lee+Vaithyanathan:02a,
  author = {Bo Pang and Lillian Lee and Shivakumar Vaithyanathan},
  title = {Thumbs up? Sentiment classification using machine learning techniques},
  year = {2002},
  pages = {79--86},
  booktitle = {Proceedings of EMNLP}
}

This paper is based upon work supported in part by the National Science Foundation under ITR/IM grant IIS-0081334. Any opinions, findings, and conclusions or recommendations expressed above are those of the authors and do not necessarily reflect the views of the National Science Foundation.
