NSF-Project IIS-0905467
Cornell University
Department of Computer Science
The goal of the project is to harness the information contained in users' interactions with information systems (e.g., query reformulations, clicks, dwell time) to train those systems to better serve their users' information needs. The key challenge lies in properly interpreting this implicit feedback and collecting it in a way that provides valid training data. Moving beyond existing passive data collection methods, the project draws on multi-armed bandit algorithms, experiment design, and machine learning to actively collect implicit feedback data. Developing these interactive experimentation methods goes hand in hand with developing machine learning algorithms that can use the resulting training data and with empirical evaluations that validate the models of user behavior assumed by the algorithms.
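To make the idea of actively collecting implicit feedback concrete, the following is a minimal sketch of a generic epsilon-greedy multi-armed bandit that chooses which of several candidate rankers to show and learns from simulated clicks. It is an illustration only, not the method of any publication listed on this page: the function name `epsilon_greedy_rankers`, the `click_prob` values, and the simulated user are all hypothetical assumptions.

```python
import random


def epsilon_greedy_rankers(click_prob, rounds=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy selection among candidate rankers.

    click_prob[i] is a hypothetical probability (hidden from the learner)
    that a simulated user clicks when shown results from ranker i.
    Returns per-ranker counts of how often each was shown and clicked.
    """
    rng = random.Random(seed)
    n = len(click_prob)
    shows = [0] * n
    clicks = [0] * n
    for _ in range(rounds):
        if rng.random() < epsilon or 0 in shows:
            # Explore: pick a ranker uniformly at random. This is also
            # done until every ranker has been shown at least once.
            arm = rng.randrange(n)
        else:
            # Exploit: pick the ranker with the best observed click rate.
            arm = max(range(n), key=lambda i: clicks[i] / shows[i])
        shows[arm] += 1
        # Implicit feedback: the simulated user clicks or does not.
        if rng.random() < click_prob[arm]:
            clicks[arm] += 1
    return shows, clicks


if __name__ == "__main__":
    shows, clicks = epsilon_greedy_rankers([0.2, 0.5, 0.3])
    for i, (s, c) in enumerate(zip(shows, clicks)):
        print(f"ranker {i}: shown {s} times, clicked {c} times")
```

Unlike passive logging, the learner here decides what to show next based on the feedback gathered so far, so most traffic is directed to the ranker that appears best while a small fraction is still spent exploring the alternatives.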
This research will improve retrieval quality for important applications like intranet search and desktop search. Additionally, the project will provide an operational full-text search engine for the Physics E-Print ArXiv and potentially other digital libraries, thus forming a test-bed for the research while also providing a valuable service and dissemination tool to the academic community beyond computer science. Through an REU Supplement, the project provides motivating research opportunities for undergraduates and international exchange students, and the PIs will incorporate relevant material into the undergraduate and graduate curricula. Finally, the project will provide easy-to-use software that enables research and teaching, via this project website.
| [Yue/Joachims/09a] | Yisong Yue, T. Joachims, Interactively Optimizing Information Retrieval Systems as a Dueling Bandits Problem, Proceedings of the International Conference on Machine Learning (ICML), 2009. [PDF] [BibTeX] |
| [Yue/etal/09a] | Yisong Yue, J. Broder, R. Kleinberg, T. Joachims, The K-armed Dueling Bandits Problem, Proceedings of the Conference on Learning Theory (COLT), 2009. [PDF] [BibTeX] |
| [Radlinski/etal/08b] | F. Radlinski, M. Kurup, T. Joachims, How Does Clickthrough Data Reflect Retrieval Quality?, Proceedings of the ACM Conference on Information and Knowledge Management (CIKM), 2008. [PDF] [BibTeX] |
| [Yue/Joachims/08a] | Yisong Yue, T. Joachims, Predicting Diverse Subsets Using Structural SVMs, Proceedings of the International Conference on Machine Learning (ICML), 2008. [PDF] [BibTeX] [Software] |
| [Radlinski/etal/08a] | F. Radlinski, R. Kleinberg, T. Joachims, Learning Diverse Rankings with Multi-Armed Bandits, Proceedings of the International Conference on Machine Learning (ICML), 2008. [PDF] [BibTeX] |
| [Radlinski/Joachims/07a] | F. Radlinski, T. Joachims, Active Exploration for Learning Rankings from Clickthrough Data, Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), ACM, 2007. [PDF] [BibTeX] |
| [Joachims/Radlinski/07a] | T. Joachims, F. Radlinski, Search Engines that Learn from Implicit Feedback, IEEE Computer, Vol. 40, No. 8, August 2007. [IEEE Digital Library] [BibTeX] [Software] |
This material is based upon work supported by the National Science Foundation under Award IIS-0905467. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).