SVM^struct

Support Vector Machine for Complex Outputs

Author: Thorsten Joachims <thorsten@joachims.org>
Cornell University
Department of Computer Science

Version: 2.00
Date: 03.07.2004

Overview

SVM^struct is a Support Vector Machine (SVM) algorithm for predicting multivariate outputs. Unlike regular SVMs, which consider only univariate predictions as in classification and regression, SVM^struct can predict complex objects such as trees, sequences, or sets. Examples of problems with complex outputs are natural language parsing, sequence alignment in protein homology detection, Markov models for part-of-speech tagging, and many more.

The sparse approximation algorithm implemented in SVM^struct is described in [1][2]. The implementation is based on the SVM^light quadratic optimizer [3].

Existing Instantiations

SVM^struct can be thought of as an API for implementing different kinds of complex prediction algorithms. Currently, we have implemented the following learning tasks:

SVM^align: Learns to align sequences. Given examples of how sequence pairs align, the goal is to learn the substitution matrix as well as the costs of insertion and deletion operations, so that one can predict alignments of new sequences.
More information and source code.
SVM^pcfg: Learns a weighted context free grammar from examples. Training examples (e.g. for natural language parsing) specify the sentence along with the correct parse tree. The goal is to predict the parse tree of new sentences.
More information and source code coming soon.
SVM^hmm: Learns a Markov model from examples. Training examples (e.g. for part-of-speech tagging) specify the sequence of words along with the correct assignment of tags (i.e. states). The goal is to predict the tag sequences for new sentences.
More information and source code coming soon.

Please let me know if you want me to add your implementations to this list.

Source Code for Implementing your Own Instantiation

Instead of using one of the existing instantiations of SVM^struct listed above, you can implement your own. SVM^struct contains an API that lets you specialize the general sparse approximation training algorithm for your particular application. Referring to the algorithm as presented in [1], you merely need to provide the code for the following:

A function for computing the feature vector Psi.
A function for computing the argmax of the linear discriminant function.
A loss function.
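
To make the three pieces concrete, here is a minimal sketch for a toy multiclass problem. It is not the SVM^struct API: the function names (toy_psi, toy_argmax, toy_loss), the dense-vector input, and the integer class label are all assumptions made for illustration. Psi(x,y) copies x into the block of the feature vector that belongs to class y, the argmax enumerates all classes, and the loss is zero/one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy setting (not the SVM^struct API):
   x is a dense vector of length dim, y is a class index
   in [0, num_classes). */

/* Psi(x,y): write x into the y-th block of a zeroed vector of
   length num_classes * dim. The caller provides 'out'. */
void toy_psi(const double *x, size_t dim, int y, int num_classes,
             double *out)
{
    assert(y >= 0 && y < num_classes);
    for (size_t i = 0; i < (size_t)num_classes * dim; i++)
        out[i] = 0.0;
    for (size_t i = 0; i < dim; i++)
        out[(size_t)y * dim + i] = x[i];
}

/* Zero/one loss between the correct label y and a prediction ybar. */
double toy_loss(int y, int ybar)
{
    return (y == ybar) ? 0.0 : 1.0;
}

/* Argmax over y of the linear discriminant w * Psi(x,y).
   With the block structure above, w * Psi(x,y) is the dot product
   of the y-th block of w with x, so Psi need not be materialized. */
int toy_argmax(const double *w, const double *x, size_t dim,
               int num_classes)
{
    assert(num_classes > 0);
    int best = 0;
    double best_score = 0.0;
    for (int y = 0; y < num_classes; y++) {
        double score = 0.0;
        for (size_t i = 0; i < dim; i++)
            score += w[(size_t)y * dim + i] * x[i];
        if (y == 0 || score > best_score) {
            best = y;
            best_score = score;
        }
    }
    return best;
}
```

For structured outputs such as parse trees or alignments, the enumeration in toy_argmax would typically be replaced by a task-specific search, for example dynamic programming.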

You can download the source code of the algorithm and the API from the following location:

      http://kodiak.cs.cornell.edu/svm_struct/current/svm_struct.tar.gz

The archive contains the source code of the most recent version of SVM^struct as well as the source code of the SVM^light quadratic optimizer. Unpack the archive using the shell command:

      gunzip -c svm_struct.tar.gz | tar xvf -

This expands the archive into the current directory, which now contains all relevant files. You can compile SVM^struct with the empty API using the command:

      make

It will output some warnings, since the functions of the API are only templates and do not return values as required. However, it should produce the executable svm_struct_learn. To implement your own instantiation, you will need to edit the following files:

svm_struct_api.c
svm_struct_api_types.h

Both files already contain empty templates. The file svm_struct_api_types.h contains the type definitions that need to be changed. PATTERN is the structure for storing the x-part of an example (x,y), and LABEL stores the y-part. The learned model is stored in STRUCTMODEL. Finally, STRUCT_LEARN_PARM can be used to store any parameters that you might want to pass to the functions. The file svm_struct_api.c contains the functions you need to implement. See the documentation in the files for details. You might also want to look at the other instantiations of SVM^struct for examples of how to use the API. A more detailed tutorial and an alpha version of SVM^struct will be available soon.
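
As an illustration of how the four types might be filled in, the sketch below defines them for a toy multiclass task. Only the type names come from the text above. Every field, and the helper functions toy_init_model and toy_free_model, are assumptions chosen for this example and are not taken from the distributed templates.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical field choices for a toy multiclass task. The fields
   below are illustrative assumptions, not the distributed templates. */
typedef struct pattern {
    double *features;   /* dense feature vector of the input x */
    long    dim;        /* number of entries in features */
} PATTERN;

typedef struct label {
    int class_id;       /* the output y: a class index */
} LABEL;

typedef struct structmodel {
    double *w;          /* learned weight vector */
    long    size_psi;   /* length of w, i.e. dimension of Psi */
} STRUCTMODEL;

typedef struct struct_learn_parm {
    int num_classes;    /* example of a task-specific parameter */
} STRUCT_LEARN_PARM;

/* Allocate a zeroed weight vector with one block of x->dim weights
   per class. Returns 0 on success and -1 if allocation fails. */
int toy_init_model(STRUCTMODEL *sm, const PATTERN *x,
                   const STRUCT_LEARN_PARM *sparm)
{
    assert(sm != NULL && x != NULL && sparm != NULL);
    assert(x->dim > 0 && sparm->num_classes > 0);
    sm->size_psi = x->dim * sparm->num_classes;
    sm->w = calloc((size_t)sm->size_psi, sizeof *sm->w);
    return sm->w ? 0 : -1;
}

/* Release the weight vector and reset the model to an empty state. */
void toy_free_model(STRUCTMODEL *sm)
{
    free(sm->w);
    sm->w = NULL;
    sm->size_psi = 0;
}
```

Keeping the task-specific data inside these four structures means the general training code can pass examples, models, and parameters around without knowing what they contain.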

How to Use

Compiling creates the executable svm_struct_learn, which performs the learning. Usage is much like SVM^light. You call it like

      svm_struct_learn -c 1.0 train.dat model.dat

which trains an SVM on the training set train.dat and outputs the learned rule to model.dat, using the regularization parameter C set to 1.0. The formats of the training file and the model file depend on the particular instantiation of SVM^struct. Other options are:

General options:
         -?          -> this help
         -v [0..3]   -> verbosity level (default 1)
         -y [0..3]   -> verbosity level for svm_light (default 0)
Learning options:
         -c float    -> C: trade-off between training error
                        and margin (default 0.01)
         -d [1,2]    -> L-norm to use for slack variables. Use 1 for L1-norm,
                        use 2 for squared slacks. (default 1)
         -r [0..]    -> Slack rescaling method to use for loss.
                        1: slack rescaling
                        2: margin rescaling
                        (default 1)
         -l [0..]    -> Loss function to use.
                        0: zero/one loss
                        (default 0)
Optimization options (see [1]):
         -q [2..]    -> maximum size of QP-subproblems (default 10)
         -n [2..q]   -> number of new variables entering the working set
                        in each iteration (default n = q). Set n<q to prevent
                        zig-zagging.
         -e float    -> eps: Allow that error for termination criterion
                        (default 0.01)
         -h [5..]    -> number of iterations a variable needs to be
                        optimal before considered for shrinking (default 100)
Output options:
         -a string   -> write all alphas to this file after learning
                        (in the same order as in the training set)
Structure learning options:
         -u* string  -> custom parameters that can be adapted for struct
                        learning. The * can be replaced by any character
                        and there can be multiple options starting with -u.

The options starting with -u are those specific to the instantiation. For more details on the meaning of these options consult reference [1].
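
For orientation, the -r option selects between the two constraint formulations discussed in [1]. The block below is a sketch of the L1-slack case (-d 1) in the notation of [1], where \Delta is the loss, \Psi the feature map, and n the number of training examples. The exact scaling of C used by the implementation is an assumption here and may differ from the paper, and the squared-slack variants (-d 2) are given in [1].

```latex
% Objective shared by both formulations (L1 slack, -d 1):
\min_{w,\,\xi}\ \frac{1}{2}\|w\|^2 + \frac{C}{n}\sum_{i=1}^{n}\xi_i
\qquad \text{s.t.}\quad \xi_i \ge 0 \ \ \forall i

% Slack rescaling (-r 1), for all i and all y \ne y_i:
w^{\top}\bigl[\Psi(x_i,y_i) - \Psi(x_i,y)\bigr]
  \ \ge\ 1 - \frac{\xi_i}{\Delta(y_i,y)}

% Margin rescaling (-r 2), for all i and all y \ne y_i:
w^{\top}\bigl[\Psi(x_i,y_i) - \Psi(x_i,y)\bigr]
  \ \ge\ \Delta(y_i,y) - \xi_i
```

With zero/one loss (-l 0), \Delta(y_i,y) = 1 for every y \ne y_i, so the two formulations coincide.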

References

[1] I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support Vector Learning for Interdependent and Structured Output Spaces, ICML, 2004.

[2] T. Joachims. Learning to Align Sequences: A Maximum Margin Approach, Technical Report, August, 2003.

[3] T. Joachims, Making Large-Scale SVM Learning Practical. Advances in Kernel Methods - Support Vector Learning, B. Schölkopf and C. Burges and A. Smola (ed.), MIT Press, 1999.
