Many problems in data mining and machine learning, in both supervised and unsupervised learning, depend crucially on the choice of an appropriate distance or similarity measure. The appropriateness of such a measure can ultimately dictate the success or failure of the learning algorithm, but its choice is highly problem- and application-dependent. As a result, there have been several recent data-driven approaches that attempt to learn distance measures.
In this talk, I will present a new approach to metric and kernel learning using the Log-Determinant divergence. The Log-Determinant divergence has previously been used in statistics, where it is called Stein's loss, and in numerical optimization, where it has been used to show superlinear convergence of the well-known BFGS quasi-Newton method. Our metric learning approach has the following desirable properties: (a) the metric learning problem is equivalent to a kernel learning problem, (b) the method can generalize to unseen data points, (c) the method can improve upon an input metric that may be provided by an application expert, and (d) the algorithm does not require any expensive eigenvector computation or semi-definite programming. I will present results on semi-supervised clustering, nearest neighbor error reporting for software programs, and image classification.
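For readers unfamiliar with the divergence, the standard definition for n x n symmetric positive definite matrices A and A0 is D_ld(A, A0) = tr(A A0^{-1}) - log det(A A0^{-1}) - n. The following is a minimal sketch of that formula together with the squared Mahalanobis distance a learned matrix A would induce. It is not the learning algorithm from the talk; the function names `logdet_divergence` and `mahalanobis_sq` are hypothetical and chosen only for illustration.

```python
import numpy as np


def logdet_divergence(A, A0):
    """LogDet divergence tr(A A0^{-1}) - log det(A A0^{-1}) - n.

    Assumes A and A0 are symmetric positive definite matrices of equal size.
    """
    n = A.shape[0]
    # A0^{-1} A has the same trace and determinant as A A0^{-1},
    # so a linear solve avoids forming an explicit inverse.
    M = np.linalg.solve(A0, A)
    sign, logdet = np.linalg.slogdet(M)
    if sign <= 0:
        raise ValueError("inputs must be positive definite")
    return float(np.trace(M) - logdet - n)


def mahalanobis_sq(x, y, A):
    """Squared Mahalanobis distance (x - y)^T A (x - y)."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(d @ A @ d)


# Illustrative values only: an identity "input metric" and a rescaled candidate.
A0 = np.eye(2)
A = 2.0 * np.eye(2)
divergence = logdet_divergence(A, A0)
distance = mahalanobis_sq([1.0, 0.0], [0.0, 0.0], A)
```

The divergence is zero exactly when A equals A0 and grows as A moves away from it, which is what makes it a natural way to keep a learned metric close to an expert-supplied one, as in property (c).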
This is joint work with Jason Davis, Prateek Jain, Brian Kulis and Suvrit Sra.