API FOR ENERGY MINIMIZATION

Version 1.1
API written by Olga Veksler, Ramin Zabih, and Vladimir Kolmogorov 
olga@csd.uwo.ca, rdz@cs.cornell.edu, vnk@cs.cornell.edu

Documentation, such as it is, written 10/16/05 by RDZ based on comments in
Olga's code.

==============================================================================
OVERVIEW

This interface allows a vision researcher to specify an energy minimization
problem in a standardized format, and to easily call a range of different
optimization methods.  In the current distribution, the optimization methods
that are implemented include graph cuts and ICM.  We expect the
implementations of TRW-S and loopy belief propagation will shortly be
available.

If you are interested in solving a vision problem using energy minimization,
i.e. you want to use this API, look at the file "example.cpp".  The rest of
this document provides some additional description of how to use this API.  If
you have designed an energy minimization algorithm and wish to provide an
interface for it, you should also look at ICM.h and ICM.cpp, since you will
need to provide similar functionality.

==============================================================================
BASIC DATA STRUCTURES

The basic data structure used in this API are the classes EnergyFunction and
MRF, described in the file "mrf.h" and implemented in "mrf.cpp".  The class
EnergyFunction is used to provide an abstract version of your energy function,
while the class MRF is used to apply a particular optimization method to an
EnergyFunction.

As a user of the API, your main task is to create an EnergyFunction object.
There are several ways to do this, depending on how general your energy
function is.  The file "example.cpp", illustrates these different ways in the
four functions:
      generate_DataARRAY_SmoothFIXED_FUNCTION();
      generate_DataARRAY_SmoothTRUNCATED_LINEAR();
      generate_DataARRAY_SmoothTRUNCATED_QUADRATIC();
      generate_DataFUNCTION_SmoothGENERAL_FUNCTION();
As a rule, the more general your energy function, the slower the optimization
methods, so it's in your interest to use the least general description that
you can.

An energy function is defined on labelings of pixels (i.e., each pixel must be
given a label).  In the current interface, labels are of type int, and
everything is done with integer arithmetic.  Energy functions defined on other
types of labels (such as floating point numbers) must perform their own
conversions to and from integers.  We consider energy functions that are the
sum of two parts: a data cost and a smoothness cost.  The data cost is the sum
over all pixels of the cost for each pixel to have its label.  The smoothness
cost is the sum over all pairs of adjacent pixels of their pair of labels.
The current implementation in "mrf.h" has a special constructor for
4-connected 2D grids, and a general constructor for arbitrary neighborhoods.
(At the moment, not all optimization methods implement the general
constructor.)

Thus, to specify an energy function all you need to do is to specify the data
cost and the smoothness costs.  These are represented (unsurprisingly) with
the classes DataCost and SmoothnessCost, which are also defined in "mrf.h".

==============================================================================
SPECIFYING THE DATA COSTS

As documented in "mrf.h", data costs can be specified in two different ways,
either as an array or as a function.  The first three energy function examples
      generate_DataARRAY_SmoothFIXED_FUNCTION()
      generate_DataARRAY_SmoothTRUNCATED_LINEAR()
      generate_DataARRAY_SmoothTRUNCATED_QUADRATIC()
represent the data costs with an array, while the last one
      generate_DataFUNCTION_SmoothGENERAL_FUNCTION()
represents the data costs with a function.


==============================================================================
SPECIFYING THE SMOOTHNESS COSTS

As documented in "mrf.h", smoothness costs can be specified in several different
ways.  For many smoothness costs you can use the constructor 
       SmoothnessCost(smoothExp,smoothMax,lambda)
to represent V(l1,l2) = lambda * min (|l1-l2|^m_smoothExp, m_smoothMax).  Note
that since we use integers, m_smoothmax = 1 corresponds to the Potts model.
For a 4-connected 2D grid, you can also use spatially varying weights by
calling
        SmoothnessCost(smoothExp,smoothMax,lambda,hWeights,vWeights)
Here, hWeights and vWeights must be arrays of size width*height in row major
order and are multipliers (i.e., the value of V(l1,l2) above is multiplied by
the appropriate element of hWeights or vWeights, depending on whether the
pixels are horizontal or vertical neighbors).  The variable hWeights[x,y]
holds the multiplier for the edge between pixels (x,y) and (x+1,y), while
vWeights[x,y] holds the multiplier for  the edge between pixels (x,y) and
(x,y+1). 

For more general smoothness costs, you can represent the smoothness costs by
an arbitrary input array V, and call
       SmoothnessCost(V)
or, with a 4-connected 2D grid and spatially varying multipliers, you can use
       SmoothnessCost(V,hWeights,vWeights).
Finally, if the smoothness term is specified by a general function, use the
constructor 
       SmoothnessCost(costFn)
See "mrf.h" for more details about the types of V and costFN, or see
generate_DataFUNCTION_SmoothGENERAL_FUNCTION() in "example.cpp" for an example
of how to use a general smoothness cost function.

==============================================================================
MINIMIZING THE ENERGY

To actually minimize the energy you need to instantiate an algorithm by
creating a structure that inherits from MRF.  Each algorithm take a width,
height, number of labels and energy function (for a 4-connected 2D grid).  A
general neighborhood system is also possible, by using a different constructor
for the MRF class and by calling setNeighbors (see "mrf.h" for details).

The function main() in "example.cpp" creates algorithms to solve the example
energy function using ICM and two different graph cut algorithms (expansion
moves and swap moves).  You must first call initialize() and
clearAnswer(). After that, each call to optimize() will run for some number of
iterations, defined by the first parameter.  Each optimization algorithm has
its own idea of what an interation is, but the intent is that a single
iteration corresponds to a reasonable attempt to obtain a good solution.

Note that the algorithm maintains its own internal state, so if you want to
run 5 interations you can just ask it to do 1 iteration 5 times, or you can
ask it to do 5 iterations.  The only real difference is that if you ask the
algorithm to do multiple iterations you cannot read the intermediate answers
or their energy.  The solution can be read using either getAnswerPtr() (to
read it as an array) or getLabel (to read individual pixels).  You can compute
the energy of the current answer using totalEnergy(), or its two components
using dataEnergy() and smoothnessEnergy().

To set the initial labeling you should use setLabel(), to change one pixel at
a time, or you can modify the array returned by getAnswerPtr().

==============================================================================
OPTIONS

Some minimization algorithms actually compute a lower bound on the global
minimum; this can be obtained by calling lowerBound(), which by default
returns 0.  In addition, not all optimization algorithms handle all energy
functions (though many do, including all the ones we've implemented).  It is
too expensive to check this by default, but the function checkEnergy() will
determine if the optimization method is suitable.

==============================================================================



