## SVM^{struct}: Support Vector Machine for Complex Outputs

Author: Thorsten Joachims <thorsten@joachims.org>
Version: 3.00

*SVM ^{struct}* is a Support Vector Machine (SVM) algorithm for predicting multivariate
or structured outputs. It performs supervised learning by approximating a mapping
`h: X → Y` using labeled training examples `(x_1,y_1), ..., (x_n,y_n)`. Unlike
regular SVMs, which consider only univariate predictions as in classification and
regression, *SVM ^{struct}* can predict complex output objects such as trees,
sequences, or sets.
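As a concrete illustration of such a structured training pair, consider part-of-speech tagging: the pattern is a word sequence and the label is the aligned tag sequence. The struct names below are illustrative assumptions only; the actual `PATTERN`/`LABEL` typedefs live in `svm_struct_api_types.h` and differ for each instance.

```c
/* Toy illustration of one structured training example (x_i, y_i) for
 * part-of-speech tagging.  These struct names are assumptions for
 * illustration, not the actual typedefs from svm_struct_api_types.h. */

#define SEQ_LEN 4

typedef struct { int length; const char *words[SEQ_LEN]; } pattern;
typedef struct { int length; const char *tags[SEQ_LEN];  } label;

/* One labeled example: x = a sentence, y = its tag sequence. */
static const pattern x1 = { SEQ_LEN, { "the", "dog", "chased", "cats" } };
static const label   y1 = { SEQ_LEN, { "DT",  "NN",  "VBD",    "NNS"  } };
```

The key point is that the output `y1` is itself a structured object aligned with the input, not a single class label.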

The 1-slack cutting-plane algorithm implemented in *SVM ^{struct}*
V3.00 uses a new but equivalent formulation of the structural SVM quadratic
program and is several orders of magnitude faster than prior methods. The
algorithm is described in a forthcoming paper. The n-slack algorithm of
earlier versions remains available via the `-w 1` option.
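To give an intuition for the 1-slack formulation, here is a minimal sketch of my own (a toy construction, not code from the package): instead of one slack variable and one constraint per example, each iteration finds the most violated output `ybar_i` for every example and averages them into a single joint cutting plane `w . (1/n) sum_i [Psi(x_i,y_i) - Psi(x_i,ybar_i)] >= (1/n) sum_i loss(y_i,ybar_i) - xi`. The toy setting assumes binary labels `y` in `{-1,+1}`, `Psi(x,y) = (y/2)*x`, and 0/1 loss.

```c
/* Sketch of the joint-constraint (1-slack) separation step on a toy
 * binary problem.  This is an illustrative assumption, not the actual
 * SVM^struct code. */

#define DIM 2

static double dot(const double *a, const double *b) {
    double s = 0.0;
    for (int i = 0; i < DIM; i++) s += a[i] * b[i];
    return s;
}

/* Loss-augmented argmax over {-1,+1} under margin rescaling. */
static int most_violated_label(const double w[DIM], const double x[DIM], int y) {
    double best = -1e300;
    int ybar = y;
    for (int c = -1; c <= 1; c += 2) {
        double s = (c != y ? 1.0 : 0.0) + 0.5 * c * dot(w, x);
        if (s > best) { best = s; ybar = c; }
    }
    return ybar;
}

/* Build the single joint constraint: fills g with the averaged
 * Psi-difference and returns the averaged loss (the right-hand side). */
static double joint_constraint(const double w[DIM], int n,
                               const double x[][DIM], const int y[],
                               double g[DIM]) {
    double avg_loss = 0.0;
    for (int j = 0; j < DIM; j++) g[j] = 0.0;
    for (int i = 0; i < n; i++) {
        int ybar = most_violated_label(w, x[i], y[i]);
        avg_loss += (ybar != y[i] ? 1.0 : 0.0) / n;
        for (int j = 0; j < DIM; j++)   /* Psi(x,y) - Psi(x,ybar) */
            g[j] += 0.5 * (y[i] - ybar) * x[i][j] / n;
    }
    return avg_loss;
}
```

Because only one constraint is added per iteration regardless of `n`, the number of cutting planes (and hence QP size) stays small, which is where the speedup over the n-slack algorithm comes from.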

*SVM ^{struct}* can be thought of as an API for implementing different kinds of complex prediction algorithms. Currently, we have implemented the following learning tasks:

- *SVM*^{python}: A Python interface to *SVM*^{struct} for implementing your own instance of *SVM*^{struct}. The Python interface makes prototyping much easier and faster than working in C. More information and source code.
- *SVM*^{multiclass}: Multi-class classification. Learns to predict one of `k` mutually exclusive classes. This is probably the simplest possible instance of *SVM*^{struct} and serves as a tutorial example of how to use the programming interface. More information and source code.
- *SVM*^{cfg}: Learns a weighted context-free grammar from examples. Training examples (e.g. for natural language parsing) specify the sentence along with the correct parse tree. The goal is to predict the parse tree of new sentences. More information and source code.
- *SVM*^{align}: Learning to align sequences. Given examples of how sequence pairs align, the goal is to learn a complex substitution and insertion/deletion model so that one can predict alignments of new sequences. More information and source code.
- *SVM*^{hmm}: Learns a hidden Markov model from examples. Training examples (e.g. for part-of-speech tagging) specify the sequence of words along with the correct assignment of tags (i.e. states). The goal is to predict the tag sequences for new sentences. More information and source code.
- *SVM*^{map}: Learns rankings that optimize Mean Average Precision (MAP) as the performance metric. More information and source code.
- *SVM*^{perf}: Learns a binary classification rule that directly optimizes ROC-Area, F1-Score, or the Precision/Recall Break-Even Point. It is also a training algorithm for conventional linear binary classification SVMs that can be orders of magnitude faster than SVM-light for large datasets. More information and source code.

To adapt *SVM ^{struct}* to a new learning task, the programming interface requires implementing four functions:

- A function for computing the feature vector Psi(x,y).
- A function for computing the argmax over the (kernelized) linear discriminant function.
- A function for computing the argmax over the loss-augmented (kernelized) linear discriminant function.
- A loss function.
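The four functions above can be sketched for the simplest instance, multi-class classification (cf. *SVM*^{multiclass}). Note this is a simplified stand-in: the real API in `svm_struct_api.c` works with sparse `SVECTOR` feature vectors and `PATTERN`/`LABEL` structs, so the names and signatures below are assumptions for illustration only.

```c
/* Toy multi-class sketch of the four API functions.  Dense arrays and
 * plain ints stand in for the package's SVECTOR/PATTERN/LABEL types. */

#define NUM_FEATS   3
#define NUM_CLASSES 4
#define PSI_DIM     (NUM_FEATS * NUM_CLASSES)

/* (1) Psi(x,y): copy x's features into the block belonging to class y. */
static void psi(const double x[NUM_FEATS], int y, double out[PSI_DIM]) {
    for (int i = 0; i < PSI_DIM; i++) out[i] = 0.0;
    for (int i = 0; i < NUM_FEATS; i++) out[y * NUM_FEATS + i] = x[i];
}

/* (4) Loss function: 0/1 loss between correct y and candidate ybar. */
static double loss(int y, int ybar) {
    return (y == ybar) ? 0.0 : 1.0;
}

/* Shared helper: the linear discriminant w . Psi(x,y). */
static double score(const double w[PSI_DIM], const double x[NUM_FEATS], int y) {
    double f[PSI_DIM], s = 0.0;
    psi(x, y, f);
    for (int i = 0; i < PSI_DIM; i++) s += w[i] * f[i];
    return s;
}

/* (2) argmax_y w . Psi(x,y): the prediction rule h(x). */
static int classify(const double w[PSI_DIM], const double x[NUM_FEATS]) {
    int best = 0;
    for (int y = 1; y < NUM_CLASSES; y++)
        if (score(w, x, y) > score(w, x, best)) best = y;
    return best;
}

/* (3) argmax_y { loss(ytrue,y) + w . Psi(x,y) }: the most violated
 * output under margin rescaling (option -o 2). */
static int most_violated(const double w[PSI_DIM], const double x[NUM_FEATS],
                         int ytrue) {
    int best = 0;
    for (int y = 1; y < NUM_CLASSES; y++)
        if (loss(ytrue, y) + score(w, x, y) >
            loss(ytrue, best) + score(w, x, best)) best = y;
    return best;
}
```

For multi-class classification the argmax is a loop over `k` classes; for structured outputs like parse trees or tag sequences, the same two argmax functions are implemented with dynamic programming (e.g. CKY or Viterbi).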

You can download the source code of *SVM ^{struct}* V3.00 from

http://download.joachims.org/svm_struct/v3.00/svm_struct.tar.gz

If you are not keen on C programming, you might want to look at the
Python interface to *SVM ^{struct}*
that Thomas Finley wrote. It is substantially easier to prototype in than the C
version and offers the same functionality. The Python API is also identical
in structure to the one described below, so it is easy to switch between
them.

If you decide to use the C version, the file you downloaded above contains the source code of the most recent version of *SVM ^{struct}* as well as the source code of the underlying *SVM ^{light}* quadratic optimizer. Unpack the archive with

```
gunzip -c svm_struct.tar.gz | tar xvf -
```

This expands the archive into the current directory, which now contains all relevant files. You can compile *SVM ^{struct}* by typing

```
make
```

in the root directory of the archive. It will output some warnings, since the functions of the API are only templates and do not return values as required. However, it should produce the executables `svm_empty_learn` and `svm_empty_classify`.

This "empty" instance of *SVM ^{struct}* is meant as a template for implementing your own; the necessary changes are restricted to the files `svm_struct_api_types.h` and `svm_struct_api.c`.

For example, the call

```
svm_empty_learn -c 1.0 train.dat model.dat
```

trains an SVM on the training set `train.dat` (with the regularization parameter `C` set to 1.0) and writes the learned model to `model.dat`.

The available options are:

```
General options:
        -?          -> this help
        -v [0..3]   -> verbosity level (default 1)
        -y [0..3]   -> verbosity level for svm_light (default 0)
Learning options:
        -c float    -> C: trade-off between training error
                       and margin (default 0.01)
        -p [1,2]    -> L-norm to use for slack variables. Use 1 for L1-norm,
                       use 2 for squared slacks. (default 1)
        -o [1,2]    -> Rescaling method to use for loss.
                       1: slack rescaling
                       2: margin rescaling
                       (default 2)
        -l [0..]    -> Loss function to use.
                       0: zero/one loss
                       (default 0)
Kernel options:
        -t int      -> type of kernel function:
                       0: linear (default)
                       1: polynomial (s a*b+c)^d
                       2: radial basis function exp(-gamma ||a-b||^2)
                       3: sigmoid tanh(s a*b + c)
                       4: user defined kernel from kernel.h
        -d int      -> parameter d in polynomial kernel
        -g float    -> parameter gamma in rbf kernel
        -s float    -> parameter s in sigmoid/poly kernel
        -r float    -> parameter c in sigmoid/poly kernel
        -u string   -> parameter of user defined kernel
Optimization options (see [2][3]):
        -w [1,2,3,4]-> choice of structural learning algorithm (default 3):
                       1: algorithm described in [2]
                       2: joint constraint algorithm (primal) [to be published]
                       3: joint constraint algorithm (dual) [to be published]
                       4: joint constraint algorithm (dual) with constr. cache
        -q [2..]    -> maximum size of QP-subproblems (default 10)
        -n [2..q]   -> number of new variables entering the working set
                       in each iteration (default n = q). Set n < q to
                       prevent zig-zagging.
        -m [5..]    -> size of cache for kernel evaluations in MB (default 40)
                       (used only for -w 1 with kernels)
        -f [5..]    -> number of constraints to cache for each example
                       (default 30) (used with -w 4)
        -e float    -> eps: Allow that error for termination criterion
                       (default 0.100000)
        -h [5..]    -> number of iterations a variable needs to be
                       optimal before considered for shrinking (default 100)
        -k [1..]    -> number of new constraints to accumulate before
                       recomputing the QP solution (default 100) (-w 1 only)
        -# int      -> terminate QP subproblem optimization, if no progress
                       after this number of iterations. (default 100000)
Output options:
        -a string   -> write all alphas to this file after learning
                       (in the same order as in the training set)
Structure learning options:
        --* string  -> custom parameters that can be adapted for struct
                       learning. The * can be replaced by any character
                       and there can be multiple options starting with --.
```

For more details on the meaning of these options consult references [1][3] and the description of *SVM ^{light}*.

This software is free only for non-commercial use. It must not be distributed without prior permission of the author. The author is not responsible for implications from the use of this software.

- On 64-bit AMD and Intel machines, there can be a precision problem when using the `-w 3` or `-w 4` algorithms on problems with many features. To fix this, change `#define CFLOAT float` to `#define CFLOAT double` in `svm_common.h`. Thanks to Chih-Jen Lin for pointing this out.
- The RBF kernel is broken.

- This version implements a new algorithm for structural SVM training (options -w 2, -w 3, -w 4). The algorithm is based on an alternative formulation of the structural SVM training problem that has the same solution as the conventional formulation. This new one-slack formulation allows a cutting-plane algorithm that has time complexity linear in the number of training examples. For large datasets, it is several orders of magnitude faster than the old cutting-plane algorithm.
- New IO routines that are faster for reading large data and model files.
- Source code for *SVM*^{struct} V2.00.

**[1]** I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support Vector Learning for Interdependent and Structured Output Spaces, ICML, 2004.

**[2]** T. Joachims. Learning to Align Sequences: A Maximum Margin Approach, Technical Report, August, 2003.

**[3]** T. Joachims, Making Large-Scale SVM Learning Practical. Advances in Kernel Methods - Support Vector Learning, B. Schölkopf and C. Burges and A. Smola (ed.), MIT Press, 1999.

**[4]** T. Joachims, Training Linear SVMs in Linear Time, Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), 2006.