Bootstrapping Coreference Classifiers with Multiple Machine Learning Algorithms

Vincent Ng and Claire Cardie.
2003 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2003.

Abstract

Successful application of multi-view co-training algorithms relies on the ability to factor the available features into views that are compatible and uncorrelated. This can potentially preclude their use on problems such as coreference resolution that lack an obvious feature split. To bootstrap coreference classifiers, we propose and evaluate a single-view weakly supervised algorithm that relies on two different learning algorithms in lieu of the two different views required by co-training. In addition, we investigate a method for ranking unlabeled instances to be fed back into the bootstrapping loop as labeled data, aiming to alleviate the problem of performance deterioration that is commonly observed in the course of bootstrapping.
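
The single-view bootstrapping loop described above can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation. The two learners (a nearest-centroid classifier and a 1-nearest-neighbour classifier), the distance-margin confidence score used for ranking, and the names `bootstrap`, `rounds`, and `per_round` are all assumptions made for the example; the abstract does not specify them. Both learners see the same feature vectors, and in each round each learner labels the unlabeled instances it is most confident about and hands them to the other learner as training data.

```python
import math


def _rank_labels(distance_by_label):
    """Return (closest label, margin-based confidence in [0, 1])."""
    ranked = sorted((d, label) for label, d in distance_by_label.items())
    (d0, best), (d1, _) = ranked[0], ranked[1]
    return best, (d1 - d0) / (d1 + d0 + 1e-12)


class CentroidLearner:
    """Hypothetical learner 1: nearest-centroid classifier."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            pts = [x for x, l in zip(X, y) if l == label]
            self.centroids[label] = [sum(col) / len(pts) for col in zip(*pts)]
        return self

    def predict_with_confidence(self, x):
        return _rank_labels({l: math.dist(x, c) for l, c in self.centroids.items()})


class NearestNeighborLearner:
    """Hypothetical learner 2: 1-nearest-neighbour classifier."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict_with_confidence(self, x):
        nearest = {}
        for xi, yi in zip(self.X, self.y):
            nearest[yi] = min(math.dist(x, xi), nearest.get(yi, float("inf")))
        return _rank_labels(nearest)


def bootstrap(labeled, unlabeled, learners, rounds=5, per_round=2):
    """Single-view bootstrapping with two learners over the same features.

    labeled   -- list of (feature_tuple, label) pairs (must cover two classes)
    unlabeled -- list of feature tuples
    """
    # Each learner keeps its own training set, seeded with the same labeled data.
    train = [list(labeled) for _ in learners]
    pool = list(unlabeled)

    def fit_all():
        return [
            learner.fit([x for x, _ in t], [y for _, y in t])
            for learner, t in zip(learners, train)
        ]

    for _ in range(rounds):
        if not pool:
            break
        models = fit_all()
        for i, model in enumerate(models):
            if not pool:
                break
            # Rank the remaining unlabeled instances by this learner's confidence.
            scored = [(model.predict_with_confidence(x), x) for x in pool]
            scored.sort(key=lambda s: s[0][1], reverse=True)
            other = (i + 1) % len(models)
            # Feed the top-ranked instances to the other learner as labeled data.
            for (label, _), x in scored[:per_round]:
                train[other].append((x, label))
                pool.remove(x)
    return fit_all()
```

Adding only the highest-ranked instances in each round is one simple way to limit the noisy labels that tend to degrade performance as bootstrapping proceeds; other ranking criteria could be substituted in the `scored.sort` step.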