Better than the real thing? Iterative pseudo-query processing using cluster-based language models
Oren Kurland, Lillian Lee, and Carmel Domshlak
Proceedings of SIGIR, pp. 19--26, 2005

We present a novel approach to pseudo-feedback-based ad hoc retrieval that uses language models induced from both documents and clusters. First, we treat the pseudo-feedback documents produced in response to the original query as a set of pseudo-queries that themselves can serve as input to the retrieval process. Observing that the documents returned in response to the pseudo-queries can then act as pseudo-queries for subsequent rounds, we arrive at a formulation of pseudo-query-based retrieval as an iterative process. Experiments show that several concrete instantiations of this idea, when applied in conjunction with techniques designed to heighten precision, yield performance results rivaling those of a number of previously-proposed algorithms, including the standard language-modeling approach. The use of cluster-based language models is a key contributing factor to our algorithms' success.
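The iterative formulation described above can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes Jelinek-Mercer-smoothed unigram document models, a length-normalized log-likelihood score, and a simple sum over pseudo-queries as the aggregation rule. All function names and parameters (`build_models`, `top_k`, `lam`, `k`, `rounds`) are hypothetical. The cluster-based language models and the precision-heightening techniques that the abstract credits for the results are deliberately omitted.

```python
import math
from collections import Counter


def build_models(docs, lam=0.5):
    """Return one smoothed unigram model per document (assumes non-empty docs)."""
    all_tokens = [w for toks in docs.values() for w in toks]
    total = len(all_tokens)
    background = {w: c / total for w, c in Counter(all_tokens).items()}

    def make(tokens):
        counts = Counter(tokens)
        n = len(tokens)
        # Jelinek-Mercer smoothing (an assumed choice); words outside the
        # corpus vocabulary get probability 0.
        return lambda w: (1 - lam) * counts[w] / n + lam * background.get(w, 0.0)

    return {doc_id: make(toks) for doc_id, toks in docs.items()}


def avg_log_likelihood(tokens, model):
    """Length-normalized log-likelihood of a token sequence under a model."""
    probs = [model(w) for w in tokens]
    probs = [p for p in probs if p > 0.0]  # skip out-of-vocabulary words
    if not probs:
        return float("-inf")
    return sum(math.log(p) for p in probs) / len(probs)


def top_k(queries, models, k):
    """Rank documents by their summed score over all queries; keep the top k."""
    def score(doc_id):
        return sum(avg_log_likelihood(q, models[doc_id]) for q in queries)
    return sorted(models, key=score, reverse=True)[:k]


def iterative_pseudo_query_retrieval(query, docs, k=2, rounds=2):
    """Retrieve for the query, then repeatedly reuse the results as pseudo-queries."""
    models = build_models(docs)
    ranked = top_k([query], models, k)  # initial retrieval with the real query
    for _ in range(rounds):
        pseudo_queries = [docs[d] for d in ranked]  # retrieved docs act as queries
        ranked = top_k(pseudo_queries, models, k)
    return ranked


if __name__ == "__main__":
    toy_docs = {
        "d1": "language models for retrieval".split(),
        "d2": "cluster language models retrieval feedback".split(),
        "d3": "cooking pasta with tomato sauce".split(),
        "d4": "tomato soup recipe".split(),
    }
    print(iterative_pseudo_query_retrieval(["language", "retrieval"], toy_docs))
```

Each round replaces the current query set with the documents just retrieved, which is the sense in which retrieval becomes an iterative process. Without some mechanism for preserving precision, repeated rounds of this kind can drift away from the original information need.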

@inproceedings{Kurland+Lee+Domshlak:05a,
  author = {Oren Kurland and Lillian Lee and Carmel Domshlak},
  title = {Better than the real thing? Iterative pseudo-query processing using cluster-based language models},
  year = {2005},
  pages = {19--26},
  booktitle = {Proceedings of SIGIR}
}

This material is based upon work supported in part by the U.S. National Science Foundation (NSF) under grant nos. IIS-0329064 and CCR-0122581; SRI International under subcontract no. 03-000211 on their project funded by the Department of the Interior's National Business Center; and by an Alfred P. Sloan Research Fellowship. Any opinions, findings, and conclusions or recommendations expressed are those of the authors and do not necessarily reflect the views or official policies, either expressed or implied, of any sponsoring institutions, the U.S. government, or any other entity.