CS 478 Machine Learning
Project Report Format
Your report for the final project should not exceed 8 single-spaced
pages (including references) using 11pt font with at least 1 inch
margins. Line spacing should be reasonable (~1.25-1.5).
In addition, you will be asked to submit your code.
Below are guidelines on how to write up your report for the final
project (it looks long because it is detailed; you can use it as
a scaffold for your report).
Try to adhere to these general guidelines as much as you can
(the actual format is flexible, as some of these sections may not
be relevant to your project; you can ignore the sections that are
irrelevant in your case, or you can add other sections).
-
Introduction
Specify, in abstract terms, the problem you are working on, and give a
brief background of the problem. If you are aware of traditional ways of
solving it or of related work, discuss them briefly.
Explain the motivation for working on this problem (why is it important?)
and how you are going to address it.
-
Problem definition
Precisely define the problem you are addressing (i.e. formally specify
the inputs and outputs, but not in the context of the specific
learning system you are using).
-
The learning system
Specify the learning system you are going to use for this task. Why
did you choose this learning system?
-
The target function
Specify formally the problem you are addressing (i.e. specify the
inputs and outputs) in the context of the specific learning system you
are using. Especially:
-
Representation
The representation of the input, such as the mapping from instances to input
neurons in a neural network, or chromosomes in genetic algorithms, or vectors
of attributes in decision trees.
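As an illustration of what a representation description can look like,
here is a minimal sketch of one common choice: one-hot encoding nominal
attributes into neural network inputs. The function name, the attribute
names, and the value lists are all hypothetical; your own mapping may
differ.

```python
def one_hot_encode(instance, attribute_values):
    """Map a dict of nominal attributes to a flat list of 0/1 inputs.

    attribute_values maps each attribute name to its list of possible
    values; each value gets one input unit (an assumed encoding).
    """
    vector = []
    for name, values in attribute_values.items():
        if instance[name] not in values:
            raise ValueError(f"unknown value for {name!r}: {instance[name]!r}")
        # One input unit per possible value; exactly one of them is 1.0.
        vector.extend(1.0 if instance[name] == v else 0.0 for v in values)
    return vector


# Hypothetical attributes, for illustration only.
ATTRIBUTES = {
    "outlook": ["sunny", "overcast", "rain"],
    "windy": ["false", "true"],
}
encoded = one_hot_encode({"outlook": "rain", "windy": "true"}, ATTRIBUTES)
```

Stating the encoding this explicitly lets the reader work out the number
of input units directly from the attribute list.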
-
System structure
For neural networks - how many layers, how many hidden units (or what
range), activation functions, etc.
For genetic algorithms - fitness functions, how to select the most fit
hypotheses, etc.
For HMMs - types of hidden states, general outline of model structure, etc.
-
The learning algorithm
Specify the learning algorithm, and give an outline of the algorithm.
A pseudocode description of the algorithm can be useful.
If the algorithm was discussed in detail in class (such as the
back-propagation algorithm), you don't have to write out all the
equations, but do give the general outline and the main equations.
-
Improvements/modifications
If you introduce modifications, variations or improvements, this is
the place to describe them. For example, the special genetic operators
for the genetic algorithm.
-
Experiments
Describe the experiments you did with the learning system in detail:
-
The data sets
Which data sets are you using to train your learning system? Give their
type, size, and source. If you are using train/validation/test sets, how
did you split the data between these sets?
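If you split the data yourself, it helps to state the procedure exactly.
The following is a minimal sketch of a shuffled train/validation/test
split using only the standard library; the function name, the 60/20/20
fractions, and the fixed seed are assumptions chosen for illustration.

```python
import random


def split_dataset(examples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle examples and split them into train, validation, and test lists."""
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test set")
    shuffled = list(examples)
    # A fixed seed makes the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, val, test


train, val, test = split_dataset(range(100))
```

Reporting the fractions and the seed along with the data source makes the
experiments easier to reproduce.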
-
Learning
How many runs? Did you vary the parameters (list the parameter sets)?
What are the stopping criteria (convergence? number of iterations?
threshold accuracy?)?
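The three stopping criteria mentioned above can be combined in a single
training loop. This is only a sketch under stated assumptions: `step` is a
hypothetical callable that performs one training iteration and returns a
(loss, accuracy) pair, and the default limits are arbitrary.

```python
def train_until_stopped(step, max_iters=1000, tol=1e-4, target_accuracy=None):
    """Call step() until a stopping criterion fires.

    Returns (iterations_run, reason), where reason names the criterion.
    """
    prev_loss = None
    for i in range(1, max_iters + 1):
        loss, accuracy = step()
        # Criterion 1: threshold accuracy reached.
        if target_accuracy is not None and accuracy >= target_accuracy:
            return i, "threshold accuracy"
        # Criterion 2: convergence, i.e. the loss has stopped changing.
        if prev_loss is not None and abs(prev_loss - loss) < tol:
            return i, "convergence"
        prev_loss = loss
    # Criterion 3: iteration budget exhausted.
    return max_iters, "iteration limit"


# Usage with a canned sequence of (loss, accuracy) values.
history = iter([(1.0, 0.50), (0.5, 0.60), (0.49999, 0.60)])
result = train_until_stopped(lambda: next(history))
```

Whichever criteria you use, report the values (tolerance, iteration limit,
target accuracy) and which criterion ended each run.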
-
Evaluation/tests
Describe how you evaluated the performance of your system (accuracy of
classification, number of games won, etc.). Did you use a validation
set? A test set?
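For classification tasks, one simple way to make the evaluation concrete
is to compute accuracy and per-class confusion counts on held-out data.
The sketch below assumes labels are stored in two equal-length lists; the
function names and the example labels are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count each (true label, predicted label) pair."""
    return Counter(zip(y_true, y_pred))


# Made-up labels, for illustration only.
y_true = ["a", "b", "a", "b"]
y_pred = ["a", "b", "b", "b"]
acc = accuracy(y_true, y_pred)
counts = confusion_counts(y_true, y_pred)
```

The confusion counts show which classes are mistaken for which, something
a single accuracy number hides.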
-
Results
Present the quantitative results of your experiments. Graphical
presentations of the data, such as graphs and histograms, are frequently
better than tables.
-
Data analysis
Discuss the results you presented in the last section. How can the
results be explained in terms of the underlying properties of the
algorithm and/or the data?
What are the basic properties revealed in the data? If you are using a
neural network, do you have an interpretation of the weights (feature mapping)?
-
Other learning systems
Many of you are using more than one learning system. If you are, repeat
briefly steps 2-5 for the other learning systems you are using.
-
Comparing learning systems
If you are using more than one learning system: What conclusions do
the results support about the strengths and weaknesses of one method
compared to other methods?
-
Summary
Briefly summarize the important results and conclusions presented in
the report. What are the most important points illustrated by your
work?
-
Future work
What are the major shortcomings of your current method? For each
shortcoming, can you propose additions or enhancements that would help
overcome it?