On the Effectiveness of the Skew Divergence for Statistical Language Analysis
Lillian Lee.
Artificial Intelligence and Statistics 2001, pp. 65--72, 2001

Abstract: Estimating word co-occurrence probabilities is a problem underlying many applications in statistical natural language processing. Distance-weighted (or similarity-weighted) averaging has been shown to be a promising approach to the analysis of novel co-occurrences. Many measures of distributional similarity have been proposed for use in the distance-weighted averaging framework; here, we empirically study their stability properties, finding that similarity-based estimation appears to make more efficient use of more reliable portions of the training data. We also investigate properties of the skew divergence, a weighted version of the Kullback-Leibler (KL) divergence; our results indicate that the skew divergence yields better results than the KL divergence even when the KL divergence is applied to more sophisticated probability estimates.
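
To make the relationship between the two measures concrete, here is a minimal Python sketch. The abstract does not spell out the formula, so the definition used here, s_alpha(q, r) = D(r || alpha*q + (1 - alpha)*r), is an assumption based on how the skew divergence is commonly stated. The function names, the dict-based representation of distributions, and the default alpha = 0.99 are illustrative choices, not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """KL divergence D(p || q); p and q are dicts mapping outcome -> probability."""
    total = 0.0
    for outcome, p_val in p.items():
        if p_val == 0.0:
            continue  # terms with p(y) = 0 contribute nothing
        q_val = q.get(outcome, 0.0)
        if q_val == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_val * math.log(p_val / q_val)
    return total


def skew_divergence(q, r, alpha=0.99):
    """Assumed definition: s_alpha(q, r) = D(r || alpha*q + (1 - alpha)*r)."""
    outcomes = set(q) | set(r)
    # Mixing a little of r into q keeps the second argument nonzero wherever r is nonzero.
    mixed = {o: alpha * q.get(o, 0.0) + (1.0 - alpha) * r.get(o, 0.0) for o in outcomes}
    return kl_divergence(r, mixed)


# Toy co-occurrence distributions (hypothetical data, for illustration only).
q = {"a": 0.5, "b": 0.5}
r = {"a": 0.5, "c": 0.5}

kl_value = kl_divergence(r, q)            # infinite: q assigns zero mass to "c"
skew_value = skew_divergence(q, r, 0.99)  # finite under the same conditions
```

Under this assumed definition, the plain KL divergence is infinite whenever the second distribution assigns zero probability to an outcome the first one supports, which is common with sparse co-occurrence counts. The skewed version stays finite for any alpha below 1 and reduces to the KL divergence at alpha = 1.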

Paper formats: ps, pdf

Data: http://www.cs.cornell.edu/home/llee/data/sim.html

BibTeX entry:


@InProceedings{Lee:01a,
  author =       {Lillian Lee},
  title =        {On the Effectiveness of the Skew Divergence for Statistical Language Analysis},
  booktitle =    {Artificial Intelligence and Statistics 2001},
  pages =        {65--72},
  year =         2001
}

