The dataset distributed on this webpage consists of 1346 hand-annotated documents drawn from the top 20 webpages retrieved by the Yahoo! search engine in response to each of 69 real, publicly available user queries. The annotations indicate whether each document is subjective or objective. Please see the README or the paper cited below for more details.


This data was introduced in Bo Pang and Lillian Lee, Using very simple statistics for review search: An exploration, Proceedings of COLING: Companion volume: Posters, pp. 73–76, 2008.

Data download

ss_data.tar.gz (23 MB, tar.gz format), including ssdata.README.1.0.txt, September 2008.
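
For readers who want to look inside the archive before unpacking it, the following is a minimal sketch using Python's standard tarfile module. It assumes ss_data.tar.gz has already been downloaded to the current directory. The helper names list_archive and find_readme are hypothetical, and nothing is assumed about the archive's internal layout beyond the README filename given above.

```python
import tarfile
from pathlib import Path


def list_archive(archive_path):
    """Return the member names stored in a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


def find_readme(member_names, readme_name="ssdata.README.1.0.txt"):
    """Return the member paths whose final component equals readme_name."""
    return [name for name in member_names if Path(name).name == readme_name]


if __name__ == "__main__":
    # Assumption: the archive sits in the current working directory.
    archive = Path("ss_data.tar.gz")
    if archive.exists():
        names = list_archive(archive)
        print(f"{len(names)} members in {archive}")
        print("README candidates:", find_readme(names))
    else:
        print(f"{archive} not found; download it first")
```

Listing the members first makes it possible to locate the README and check the directory layout without writing anything to disk.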
