SampleCount version 1.0
Cornell University, 2007

April 9, 2007

See publication 'From Sampling to Model Counting' in IJCAI-07 for a
detailed description of SampleCount. Briefly, SampleCount is a
randomized algorithm for obtaining lower bounds on the solution counts
of propositional SAT formulas. It uses sampling (through SampleSat) to
identify balanced variables and variable pairs, and uses them to
simplify the formula iteratively. The code is built upon the code of
Approxcount version 1.2 (Wei and Selman, 2005), and uses Cachet
version 1.2 (Sang et. al, 2004-2005) to compute the residual count
after sufficiently many variables have been set.

COMPILATION:
  samplecount.sh : main executable shell script; uses approxcount and cachet
  approxcount    : type "make" in the main directory; creates version 1.2L
                   with lowerbound computation, equivalence testing, etc.
  cachet         : version 1.2 binary included

The main executable is the script samplecount.sh, although it often
helps to first experiment directly with "approxcount -lowerbound ..."
using variations of the default options suggested below. The exact
command line for approxcount used by samplecount.sh is printed out
when executing samplecount.sh. Without any parameters, samplecount.sh
reports its usage.

USAGE: samplecount.sh <cnffile> <iterations> <samples> <slack>
                      [options for approxcount]

To obtain a correctness confidence of, say, 99%, the product of
<iterations> and <slack> should be at least 7. It is often convenient
to have 7, 4, or 2 iterations with slack being 1, 1.75, or 3.5,
respectively. The exact tradeoff is discussed in the publication
mentioned above. The number of samples can be around 20 on the low
side to around 100 on the high. More samples generally gives more
stable bounds over different iterations, but consume more time.

The default options for approxcount are: 
  -lowerbound -equiv -knuth -exact_count -100

Note that the options for Approxcount include options to
SampleSat/Walksat for sampling, and will often need to be set based on
the domain in order to obtain good performance. See approxcount -h for
various parameter tuning options.

To simply print the samples generated by SampleSat and exit, use:
  approxcount -sampleonly -samples N [other-options]

One of the key parameter that needs adjustment is -cutoff, set by
default to 10,000. It determines the number of "flips" used by
SampleSat to sample each solution using Walksat. The more the flips,
the longer it takes to sample but the better the sample quality (i.e.,
uniformity). For hard instances, the cutoff should be increased so
that approxcount is able to find the requested number of samples. This
can be judged by running "approxcount -lowerbound ..." directly rather
than samplecount.sh on the formula and looking for the number of
successful tries reported by approxcount.

Another useful parameter is -noise, set by default to 50 (out of 100).
Finally, -exact_time_limit determines how much time is given to Cachet
to compute the exact count on the residual formula before starting the
simplification process again; the default is 100 seconds.

USAGE EXAMPLES:

  samplecount.sh example.cnf 2 20 3.5 -cutoff 5000

  approxcount -lowerbound -equiv -knuth -exact_count -150 \
     -samples 40 -lb_slack 3.5 -cutoff 8000

---

Please report bugs and send comments/questions to
  Ashish Sabharwal, sabhar at cs dot cornell dot edu.

