IDF revisited: A simple new derivation within the Robertson-Spärck Jones probabilistic model.
Lillian Lee.
Proceedings of SIGIR, pp. 751–752, 2007. Poster paper.

Abstract: There have been a number of prior attempts to theoretically justify the effectiveness of the inverse document frequency (IDF). Those that take as their starting point Robertson and Spärck Jones's probabilistic model are based on strong or complex assumptions. We show that a more intuitively plausible assumption suffices. Moreover, the new assumption, while conceptually very simple, provides a solution to an estimation problem that had been deemed intractable by Robertson and Walker (1997).

Paper formats: ps, pdf

Poster “slides”: ps, pdf

BibTeX entry:


@InProceedings{Lee:07a,
   author={Lillian Lee},
   title={{IDF} revisited: A simple new derivation within the {R}obertson-{S}p{\"a}rck {J}ones probabilistic model},
   booktitle={Proceedings of SIGIR},
   year={2007},
   pages={751--752},
   annote={Poster paper}
}

