SFOUR toy example HOWTO


1) BINARIES

Copy the appropriate binaries (svm_sfour_learn and svm_sfour_classify) into the exec/ directory.

You can use the precompiled ones (from the ../binaries directory) or compile them yourself
(by running make inside the ../code directory).
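A minimal shell sketch of this step, assuming the directory layout described above (../binaries, ../code, and exec/); when a precompiled binary is missing it only prints a hint about running make:

```shell
# Stage the two binaries into exec/, preferring the precompiled copies.
# Paths (../binaries, ../code, exec/) follow the layout in this HOWTO.
stage_binaries() {
  for bin in svm_sfour_learn svm_sfour_classify; do
    if [ -x "../binaries/$bin" ]; then
      cp "../binaries/$bin" exec/ 2>/dev/null \
        && echo "copied $bin" \
        || echo "could not copy $bin -- does exec/ exist?"
    else
      echo "no precompiled $bin -- run 'make' inside ../code, then copy"
    fi
  done
}
stage_binaries
```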


2) TRAINING

Run "./svm_sfour_learn -c 1 -e 0.01 -w 0 trainidx mdl" inside the exec/ directory.

This will train the model (saved as the file "mdl") using the two examples listed in the
provided file "trainidx", with the parameter c set to 1 (plus two additional required
settings, explained in detail on the webpage).


3) SUMMARIZING

Run "./svm_sfour_classify testidx mdl out" inside the exec/ directory.

Using the previously learned model saved in the file "mdl", we now summarize
the single document listed in the provided file "testidx" and save
the selected sentence (line) numbers into the "out" file.
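The "out" file is just a list of selected line numbers. Mapping such numbers back to sentences (roughly what the unpk4c2.py script does later) can be sketched in plain shell; the tiny document and the selection below are made up purely for illustration:

```shell
# Illustrative only: a 3-sentence "document" and a made-up selection of
# line numbers, joined back into a textual summary with awk.
tmp=$(mktemp -d)
printf '%s\n' 'First sentence.' 'Second sentence.' 'Third sentence.' > "$tmp/doc.txt"
printf '%s\n' 1 3 > "$tmp/out.example"   # pretend the classifier picked lines 1 and 3
summary=$(awk 'NR==FNR {pick[$1]; next} FNR in pick' "$tmp/out.example" "$tmp/doc.txt")
echo "$summary"   # prints the first and third sentences
```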


4) EVALUATING

The average loss of the predicted summaries is reported by svm_sfour_classify
in the second-to-last line of its output, which looks like "Average loss on test set: 0.xxxx".
A smaller loss denotes a better summary. If the reported average loss is zero,
the predicted summary is the best possible one using only the extracted sentences
and greedy selection. Otherwise, the loss denotes the gap between the best achievable
summary and the predicted one.
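If you want to pull that number out of the output programmatically, you can match the line quoted above with awk. The log line and its value below are placeholders, not real results, and capturing the output via `2>&1 | tee` is an assumption about where the tool prints:

```shell
# In practice the log would come from something like:
#   ./svm_sfour_classify testidx mdl out 2>&1 | tee classify.log
# Here we use a placeholder line with a made-up loss value.
line='Average loss on test set: 0.1234'
loss=$(printf '%s\n' "$line" | awk -F': ' '/^Average loss on test set/ {print $2}')
echo "$loss"   # prints the numeric loss, here 0.1234
```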

To obtain the summary (in textual form) from the predicted sentence numbers,
run "python ../../scripts/unpk4c2.py testidx out" inside the exec/ directory.
This will generate a set of files (named "out<ID>") containing summaries
for the documents listed in "testidx", using the predicted sentence numbers
in the previously generated "out" file.

If you want to evaluate performance using the ROUGE score, you must obtain
the ROUGE package and place it inside the ../RELEASE-1.5.5 directory.
Then you can run "python ../../scripts/roug.py" inside the exec/ directory
to obtain the score.


