The objective in Generalized Multiple Kernel Learning (GMKL) is to jointly learn both kernel and SVM parameters. The optimizer proposed here can learn any combination of base kernels subject to any regularization. The code currently handles sums (linear combinations) and products (non-linear combinations) of kernels subject to L1 and p-norm regularization, but it can easily be extended to handle other MKL formulations. The package also implements (1) the highly optimized SMO-MKL (Vishwanathan et al. 2010), a specialized optimizer for linear combinations of kernels, and (2) the original projected gradient descent (PGD) based GMKL algorithm of Varma and Babu (2009), so that our tables can be reproduced. The code currently supports binary classification and regression.
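
To make the two kernel combinations named above concrete, here is a minimal Python sketch. It is not taken from the package (which is written in C++). The per-feature RBF base kernel, the function names, and the exact forms of the two regularizers are assumptions chosen for illustration only.

```python
import math

def rbf_feature_kernel(x, z, gamma=1.0):
    """Hypothetical base kernel on one feature: exp(-gamma * (x - z)^2)."""
    return math.exp(-gamma * (x - z) ** 2)

def sum_kernel(x, z, d):
    """Linear combination: K(x, z) = sum_k d[k] * K_k(x[k], z[k]), d[k] >= 0."""
    return sum(dk * rbf_feature_kernel(xk, zk) for dk, xk, zk in zip(d, x, z))

def product_kernel(x, z, d):
    """Non-linear combination: K(x, z) = prod_k exp(-d[k] * (x[k] - z[k])^2)."""
    return math.exp(-sum(dk * (xk - zk) ** 2 for dk, xk, zk in zip(d, x, z)))

def l1_regularizer(d):
    """Illustrative L1 penalty on the kernel weights."""
    return sum(abs(dk) for dk in d)

def p_norm_regularizer(d, p=2.0):
    """Illustrative p-norm penalty on the kernel weights."""
    return sum(abs(dk) ** p for dk in d) ** (1.0 / p)
```

In both cases the weights d are the kernel parameters that GMKL learns together with the SVM parameters. Which base kernels and regularizer forms the package actually uses is documented in its README.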

The following is efficient code for training GMKL using a wrapper method. The optimizer is general purpose and can handle any MKL formulation. The code is built on top of the LibSVM code base and therefore has very similar usage. Please go through the included README file for detailed usage instructions, as well as the LibSVM FAQ and COPYRIGHT.
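
The wrapper idea mentioned above can be sketched as an outer loop over the kernel weights that calls an inner SVM solver at every step. The Python sketch below is only an illustration under stated assumptions: it uses a plain projected gradient step with a fixed step size (not the package's SPG optimizer), a toy bias-free coordinate-ascent solver in place of LibSVM, a sum of precomputed kernels, and an L1 regularizer sigma * sum(d). All function and parameter names are hypothetical.

```python
def combine(kernels, d):
    """Weighted sum of precomputed kernel matrices: K = sum_k d[k] * K_k."""
    n = len(kernels[0])
    return [[sum(dk * Kk[i][j] for dk, Kk in zip(d, kernels))
             for j in range(n)] for i in range(n)]

def solve_svm_dual(K, y, C=1.0, sweeps=50):
    """Toy inner solver: coordinate ascent on the bias-free SVM dual
    max_alpha sum(alpha) - 0.5 * alpha^T Y K Y alpha, 0 <= alpha <= C."""
    n = len(y)
    alpha = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            if K[i][i] <= 0.0:
                continue  # no curvature along this coordinate; skip it
            # Partial derivative of the dual objective w.r.t. alpha[i].
            g = 1.0 - y[i] * sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            # Exact coordinate maximizer, clipped to the box [0, C].
            alpha[i] = min(C, max(0.0, alpha[i] + g / K[i][i]))
    return alpha

def wrapper_mkl(kernels, y, sigma=0.1, step=0.05, iters=20, C=1.0):
    """Outer loop: fix d, solve the SVM, then take a projected gradient step on d."""
    n = len(y)
    d = [1.0] * len(kernels)
    for _ in range(iters):
        alpha = solve_svm_dual(combine(kernels, d), y, C)
        v = [a * yi for a, yi in zip(alpha, y)]
        # Gradient w.r.t. d[k]: sigma - 0.5 * alpha^T Y K_k Y alpha.
        grad = [sigma - 0.5 * sum(v[i] * Kk[i][j] * v[j]
                                  for i in range(n) for j in range(n))
                for Kk in kernels]
        # Project back onto the constraint d >= 0.
        d = [max(0.0, dk - step * gk) for dk, gk in zip(d, grad)]
    return d
```

Each outer iteration costs one full SVM solve, which is why the number of inner SVM evaluations matters so much for the running time of a wrapper method.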

Why should you use SPG?

Our code is highly scalable: it scales up to a million kernels (size 160GB) on Sonar and to more than half a million training points (with 5 kernels, size 6.3TB) on Covertype. On Covertype, only 0.19% of the SVs were cached (14GB RAM), and SPG converged in 64 hours with only 26 inner SVM evaluations, of which the first took 44 hours. SPG is also 10 times faster than specialized MKL solvers like Shogun and more than 100 times faster than SimpleMKL on many datasets. SPG can also be used to learn non-linear combinations of kernels on large-scale datasets (e.g. Cod-RNA, with more than 50k training points), which presents a highly non-convex and challenging optimization problem. SPG and PGD are the only two algorithms that can learn non-linear combinations of kernels, and SPG is more than 1000 times faster than PGD on many datasets.

Download source code    

The code is written in C++ and should compile on 32- and 64-bit Windows and Linux machines. This code is made available as is for non-commercial research purposes. Please contact Ashesh Jain [ashesh [at] cs.cornell.edu], Manik Varma [manik [at] microsoft.com] and S. V. N. Vishwanathan [vishy [at] stat.purdue.edu] if you have any questions or feedback.

