CS 631 Assignment #3
Due Date: May 17th 1999
Description
The objective of this assignment is to decode an MPEG stream and output an estimate of
the number of people in the sequence. You are required to write a program that takes an
MPEG sequence as an argument.
There are two parts to this assignment. In Part 1, we specify what you need to compute
(much as in assignments #1 and #2). In Part 2, we ask you to do a short individual
research project, hopefully making use of what you did in Part 1.
Part 1.
A) For every frame in the sequence do the following:
- output an image of skin blobs
Procedure:
- Implement the Fleck, Forsyth and Bregler skin finder algorithm
(http://www.cs.hmc.edu/~fleck/naked-skin.html)
- Run a connected components algorithm, marking each skin patch with a unique grayscale
number.
- Output a pgm format image representing all the skin patches in the frame. For example,
every pixel in skin patch 1 must be set to 1 on the grayscale, while every pixel in
skin patch 2 must be set to 2 on the grayscale.
- output an estimate of the number of individuals in the frame
Procedure:
- You are free to implement whatever means to estimate the number of individuals. This can
even be based on the information obtained from part 2.
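The connected components step can be sketched as follows. This is only an illustration under stated assumptions: the function name label_components, the row-major binary mask layout, and the choice of 4-connectivity are not specified by the assignment, and the code does not use any DALI routine.

```c
#include <stdlib.h>

/* Label 4-connected regions of nonzero pixels in mask (w*h, row-major).
   labels receives 0 for background and 1, 2, 3, ... for each patch.
   Returns the number of patches, or -1 if memory runs out.
   (Hypothetical helper, not part of DALI.) */
int label_components(const unsigned char *mask, int w, int h, int *labels)
{
    static const int dx[4] = { 1, -1, 0, 0 };
    static const int dy[4] = { 0, 0, 1, -1 };
    int n = w * h, count = 0, i, k;
    int *stack;

    if (n <= 0) return 0;
    stack = malloc((size_t)n * sizeof *stack);   /* pixel indices still to visit */
    if (stack == NULL) return -1;

    for (i = 0; i < n; i++) labels[i] = 0;

    for (i = 0; i < n; i++) {
        int top = 0;
        if (!mask[i] || labels[i]) continue;     /* background or already labeled */
        labels[i] = ++count;                     /* start a new patch */
        stack[top++] = i;
        while (top > 0) {
            int p = stack[--top];
            int x = p % w, y = p / w;
            for (k = 0; k < 4; k++) {
                int nx = x + dx[k], ny = y + dy[k], q;
                if (nx < 0 || nx >= w || ny < 0 || ny >= h) continue;
                q = ny * w + nx;
                if (mask[q] && !labels[q]) {
                    labels[q] = count;           /* label on push, so each pixel is pushed once */
                    stack[top++] = q;
                }
            }
        }
    }
    free(stack);
    return count;
}
```

The labels can be written directly as gray values as long as a frame has at most 255 patches in an 8-bit image; frames with more patches would need a larger maxval or a different mapping.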
B) For every pair of successive frames do the following:
Procedure:
- Run image differencing between the frame pair.
- Threshold the difference, then dilate and erode to eliminate noise.
- Run connected components algorithm to mark moving patches.
- Output a pgm format image representing all moving patches in the frame pair (as in part
1). Again, the pixels in the first moving patch must be labeled 1, the second 2, etc.
(Note: All the parameters for part 1 are unspecified and you are free to select them)
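A minimal sketch of the differencing, thresholding, and morphology steps in part B is given below. The function names, the 3x3 structuring element, and the treatment of pixels outside the image as background are assumptions chosen for illustration; the threshold value is a free parameter, as noted above.

```c
#include <stdlib.h>

/* out[i] = 1 where |a[i] - b[i]| > thresh, else 0.
   a and b are grayscale frames of n pixels each. */
void diff_threshold(const unsigned char *a, const unsigned char *b,
                    int n, int thresh, unsigned char *out)
{
    int i;
    for (i = 0; i < n; i++)
        out[i] = (unsigned char)(abs((int)a[i] - (int)b[i]) > thresh);
}

/* 3x3 morphology on a binary image (row-major, w*h).
   dilate != 0: a pixel is set if ANY pixel in its 3x3 neighborhood is set.
   dilate == 0: erosion, a pixel is set only if ALL nine are set.
   Pixels outside the image count as 0. src and dst must not overlap. */
void morph3x3(const unsigned char *src, int w, int h, int dilate,
              unsigned char *dst)
{
    int x, y, dx, dy;
    for (y = 0; y < h; y++) {
        for (x = 0; x < w; x++) {
            int any = 0, all = 1;
            for (dy = -1; dy <= 1; dy++) {
                for (dx = -1; dx <= 1; dx++) {
                    int nx = x + dx, ny = y + dy;
                    int v = (nx >= 0 && nx < w && ny >= 0 && ny < h)
                                ? (src[ny * w + nx] != 0) : 0;
                    any |= v;
                    all &= v;
                }
            }
            dst[y * w + x] = (unsigned char)(dilate ? any : all);
        }
    }
}
```

Calling morph3x3 with dilate set to 1 and then to 0 follows the order given in the procedure, which fills small holes in moving patches. Applying the two passes in the opposite order removes isolated specks instead, so it may be worth trying both.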
Part 2.
For the research portion of this assignment, you must estimate the number of distinct
people present in the entire sequence. The way in which you compute this is up to you; we
expect you to be creative. We suggest that you make use of the results from Part 1, but
how you do this is up to you. Note that you may assume that the video you will process
came from Philips 101 on the same day as the test images we are giving you. You may also
assume that the camera is stationary, and that no people enter or exit during a sequence.
For part 2, you must hand in a coherent 1-page description of how you count the number
of people. This description should be technical enough for us to understand what you are
computing.
Here are a few suggestions you might wish to pursue:
- Look for roughly circular moving skin blobs that may be faces, using the output of part
1, and possibly modifying it as appropriate.
- Check to see if the area on the top of the "face" is darker (like most
people's hair)
- Build a detector for the color of the Philips 101 chairs. Then use the computed chair
area to figure out where faces should be, and see if there is something face-like there.
- Implement a temporal median filter to estimate the background when the people move.
- Implement an eye-finder, perhaps using template matching.
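As one illustration of the suggestions above, a temporal median filter could look roughly like this. The function name temporal_median and the assumption that the decoded frames are available as separate grayscale buffers of equal size are hypothetical; how frames are obtained from DALI is not shown.

```c
#include <stdlib.h>

/* Comparison callback for qsort over unsigned char samples. */
static int cmp_uchar(const void *p, const void *q)
{
    return (int)*(const unsigned char *)p - (int)*(const unsigned char *)q;
}

/* frames: num_frames pointers to grayscale images of n pixels each.
   bg receives, for every pixel, the median of its values over time.
   Returns 0 on success, -1 on bad input or allocation failure.
   (Hypothetical helper, not part of DALI.) */
int temporal_median(const unsigned char *const *frames, int num_frames,
                    int n, unsigned char *bg)
{
    unsigned char *samples;
    int i, t;

    if (num_frames <= 0) return -1;
    samples = malloc((size_t)num_frames);
    if (samples == NULL) return -1;

    for (i = 0; i < n; i++) {
        for (t = 0; t < num_frames; t++)
            samples[t] = frames[t][i];           /* this pixel's history */
        qsort(samples, (size_t)num_frames, 1, cmp_uchar);
        bg[i] = samples[num_frames / 2];         /* upper median if count is even */
    }
    free(samples);
    return 0;
}
```

A pixel that shows the background in more than half of the frames keeps its background value in bg, so people who move around tend to drop out of the estimate while people who sit still do not.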
Implementation
For this assignment, you are required to use DALI, which is a high-performance library
of routines for manipulating video, audio, and image data. You will need to download the
source code for DALI from the following web-site and compile it under WIN/NT.
http://www.cs.cornell.edu/dali/
The installation/compilation procedures are provided on the web-site. You may also have
to install TCL/TK in order to properly compile DALI. The TCL/TK can be downloaded from
http://www.scriptics.com
Note, however, you are not allowed to use TCL/TK for interfacing with DALI. The
entire program must be standalone (with the exception of DALI) and must be written in
C/C++.
A Few Helpful Hints for Compiling DALI
- In makefile.vc, the following changes may be required:
- TCLSH = $(TCLDIR)\bin\tclsh80
- Change TCL80VC.LIB -> TCL80.LIB
- Change TK80VC.LIB -> TK80.LIB
- Run mkdir obj
- Run nmake /f makefile.vc
- Run nmake /f makefile.vc install
A Few Helpful Hints for Compiling Examples
- In makefile.vc (located in, e.g., \dali\examples\lib\canny), the following changes may be
required:
- Search and replace ALL "/" with "\"
- ROOT = ..\.. (assuming examples directory is located at \dali\examples)
- Add msvcrt.lib to LIBS -> LIBS = $(GUILIBS) $(DALILIBS) msvcrt.lib
- You need to make similar changes as above to makefile.vc in \dali\examples\lib
- Run nmake /f makefile.vc in, e.g., \dali\examples\lib\canny after making the
above modifications.
Required Inputs/Outputs
The command-line must look like the following:
estimator <mpeg-sequence>
The output should consist of the following:
- The skin patches should be created in <mpeg-sequence>_fr#.pgm. The #
corresponds to the frame # in the MPEG sequence.
- The moving blobs should be created in <mpeg-sequence>_frp#.pgm for each
frame pair. Consequently, the # corresponds to the frame pair number.
- All estimated numbers of people should be output to both stdout and <mpeg-sequence>_est.txt.
The last number in the text file must be the final estimate.
- For example, if the input MPEG file, named students.mpg, consists of 5 frames, you
should output students_fr1.pgm thru students_fr5.pgm, students_frp1.pgm
thru students_frp4.pgm and finally students_est.txt containing 6
numbers. Also, feel free to show some progress indication while your program executes.
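The naming scheme and the pgm output can be sketched as below. The helper names make_output_name and write_pgm are hypothetical, and dropping the input file's extension is inferred from the students.mpg example rather than stated as a rule. The writer emits the binary (P5) variant of pgm with a maxval of 255, which is one reasonable choice.

```c
#include <stdio.h>
#include <string.h>

/* Builds "<base>_<tag><index>.pgm", where base is the input name with its
   last extension removed (e.g. "students.mpg", "fr", 1 gives
   "students_fr1.pgm"). Returns 0 on success, -1 if out is too small. */
int make_output_name(const char *input, const char *tag, int index,
                     char *out, size_t out_size)
{
    const char *dot = strrchr(input, '.');
    const char *bs = strrchr(input, '\\');
    const char *fs = strrchr(input, '/');
    const char *sep = (bs != NULL && (fs == NULL || bs > fs)) ? bs : fs;
    size_t base_len = strlen(input);
    int written;

    /* Only treat the dot as an extension if it follows the last path separator. */
    if (dot != NULL && (sep == NULL || dot > sep))
        base_len = (size_t)(dot - input);

    written = snprintf(out, out_size, "%.*s_%s%d.pgm",
                       (int)base_len, input, tag, index);
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}

/* Writes an 8-bit binary pgm (magic number P5). Returns 0 on success. */
int write_pgm(const char *path, const unsigned char *pixels, int w, int h)
{
    FILE *fp = fopen(path, "wb");     /* binary mode matters on Windows */
    size_t n = (size_t)w * (size_t)h;
    int ok;

    if (fp == NULL) return -1;
    ok = fprintf(fp, "P5\n%d %d\n255\n", w, h) > 0
         && fwrite(pixels, 1, n, fp) == n;
    if (fclose(fp) != 0) ok = 0;
    return ok ? 0 : -1;
}
```

With these helpers, the tag "fr" would be used for the skin patch images and "frp" for the moving blob images, with the index starting at 1 as in the example above.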
Instructions
This assignment must be done in groups of 2 students.
You should use C or C++ and please make sure that your code compiles on the Windows
platform.
We will provide you with a few sample MPEG sequences to test your programs. The minimum
requirement is for your implementation to work on these samples.
You are required to send a "hw3.zip" file as an attachment containing the
following:
- Source code and executables (estimator.exe)
- Readme.txt containing instructions on how to compile and run your program.
- Also in the Readme.txt or a Word Document, include a brief section on how you went about
estimating the number of people in the mpeg sequence.
- Makefile (that can be run using nmake) or MS VisualStudio project files. Do not include
object files if not necessary for compilation.
- How to submit?
- Send hw3.zip as an attachment to warkhedi@cs.cornell.edu
- The subject header of your e-mail must contain the string "CS631: HW3"
(ignoring the whitespaces & case).
- The assignment must be submitted on or before 11:59pm May 17th
1999.
- Grading
- Part 1 of this assignment will be graded on a scale of 10.
- The grade for part 1 will primarily be based on the accuracy of your results.
- Part 2 (the people counter) will be given a letter grade, based on the quality of
your research. We are more concerned with the quality of your ideas than with how well
they perform at the people-counting task, since the task is so difficult.