Code for binary SVM classification with kernels accompanying the paper 
'Training Structural SVMs with Kernels Using Sampled Cuts' (24 Sep 2008)
------------------------------------------------------------------------
(implemented by Chun-Nam Yu using some of the code in SVM^light by Thorsten Joachims, with SFMT Mersenne Twister RNG library, and the Mosek optimization software)

INSTALLATION
1. This code uses the SFMT random number generator library [1] and the Mosek optimization software [2]. The SFMT library is already included in the tarball but the Mosek optimization software has to be downloaded and installed separately. This code was developed using SFMT version 1.3.3 and Mosek version 5. 
2. Extract the files in the tarball to create the directory svm_sampled_cuts. Then extract the SFMT library tarball SFMT-src-1.3.3.tar.gz inside the directory svm_sampled_cuts as well.
3. Download and install Mosek from www.mosek.com following the instructions on their website (they provide an evaluation license and also a special free license for students).
4. After Mosek has been installed, modify the Makefile in svm_sampled_cuts by replacing 'your_mosek_directory' with the path where you installed Mosek.
5. Type 'make' in the directory svm_sampled_cuts and make sure that the compilation is successful.
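
As a rough illustration, steps 2, 4 and 5 above could be scripted as follows. This is only a sketch: the helper name patch_makefile and the default value of MOSEK_DIR are made up for this example, and the script assumes the package tarball has already been extracted so that svm_sampled_cuts exists in the current directory.

```shell
#!/bin/sh
# Hypothetical install location; replace with your own Mosek directory.
MOSEK_DIR="${MOSEK_DIR:-$HOME/mosek_install}"

# Step 4: substitute the 'your_mosek_directory' placeholder in a Makefile,
# keeping a backup of the original as <file>.bak.
# Assumes the path contains no '|', '&' or backslash characters.
patch_makefile() {
    sed -i.bak "s|your_mosek_directory|$2|g" "$1"
}

# Steps 2, 4 and 5, run only if the package directory is present.
if [ -d svm_sampled_cuts ]; then
    (
        cd svm_sampled_cuts || exit 1
        tar -xzf SFMT-src-1.3.3.tar.gz       # step 2: unpack the SFMT library
        patch_makefile Makefile "$MOSEK_DIR" # step 4: point the Makefile at Mosek
        make                                 # step 5: build
    )
fi
```

The subshell keeps the 'cd' from affecting the calling shell, and the .bak copy makes it easy to restore the original Makefile if the substitution was wrong.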


TRAINING
1. To use the uniform sampling (constant time) algorithm for training a binary SVM, type:
   ./svm_uniform_sampling [options] TRAINING_FILE TEST_FILE MODEL_FILE
   To use the importance sampling (linear time) algorithm for training a binary SVM, type:
   ./svm_importance_sampling [options] TRAINING_FILE TEST_FILE MODEL_FILE

The formats of the training input file and the model output file are the same as those of SVM^light. For details of the above algorithms, please refer to [3].

2. Before running the program you need to set the environment variable LD_LIBRARY_PATH to include the Mosek library path 'your_mosek_directory/mosek/5/tools/platform/linux32x86/bin/' for the 32-bit Linux version or 'your_mosek_directory/mosek/5/tools/platform/linux64x86/bin/' for the 64-bit Linux version. You also need to set MOSEKLM_LICENSE_FILE to point to your Mosek license file.

3. The following command line options are available during training:
   -c : specify regularization constant
   -e : specify stopping criterion epsilon
   -s : specify sample size
   -t : specify kernel type (0=linear, 1=polynomial, 2=rbf)
   -g : specify kernel width for rbf kernel
   -d : specify the degree for polynomial kernel
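
The steps above can be put together roughly as in the sketch below, which sets the two environment variables, writes a toy training file in SVM^light format, and runs the uniform sampling trainer with an RBF kernel. Everything specific here is an assumption made for illustration: the value of MOSEK_DIR, the license file location, the file names, the option values (which are not recommended settings), the space between each option and its value, and the use of the same format for the test file. Real data sets would be far larger than four examples.

```shell
#!/bin/sh
# Placeholder: replace with the directory where Mosek is installed.
MOSEK_DIR="${MOSEK_DIR:-your_mosek_directory}"

# Library path from step 2 (64-bit Linux; use linux32x86 on 32-bit systems).
MOSEK_LIB="$MOSEK_DIR/mosek/5/tools/platform/linux64x86/bin"
export LD_LIBRARY_PATH="$MOSEK_LIB${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Assumed license location; point this at your actual license file.
export MOSEKLM_LICENSE_FILE="$MOSEK_DIR/mosek/5/licenses/mosek.lic"

# Toy training file in SVM^light format: <label> <feature>:<value> ...
cat > toy_train.dat <<'EOF'
+1 1:0.9 2:0.8
+1 1:0.7 2:1.0
-1 1:-0.8 2:-0.6
-1 1:-1.0 2:-0.9
EOF
# For this sketch the test file is simply a copy of the training file.
cp toy_train.dat toy_test.dat

# Illustrative values only: C=1, epsilon=0.1, sample size 100,
# RBF kernel (-t 2) with width 0.5.
if [ -x ./svm_uniform_sampling ]; then
    ./svm_uniform_sampling -c 1 -e 0.1 -s 100 -t 2 -g 0.5 \
        toy_train.dat toy_test.dat toy_model
else
    echo "svm_uniform_sampling not found; run 'make' first" >&2
fi
```

To try the importance sampling algorithm instead, the same invocation should work with ./svm_importance_sampling in place of ./svm_uniform_sampling.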

   
CLASSIFICATION 
1. The code for classification is the same as that of SVM^light. To classify a test set, type:
   ./svm_classify TEST_FILE MODEL_FILE


CONTACT
If you have any suggestions for the program or bugs to report, you can email Chun-Nam Yu at cnyu@cs.cornell.edu. Any feedback is very welcome.


REFERENCES
[1] M. Saito and M. Matsumoto: SIMD-oriented Fast Mersenne Twister: a 128-bit Pseudorandom Number Generator, Monte Carlo and Quasi-Monte Carlo Methods 2006 [http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html]
[2] The Mosek Optimization Software [www.mosek.com]
[3] C.-N. J. Yu and T. Joachims: Training Structural SVMs with Kernels Using Sampled Cuts, KDD 2008
