Yuandong Zhuang, Zhe Yang
We are trying to recognize shapes and texture of some geometric objects in a simple sketch picture and convert the sketch into a physically-based animation, and even extend it into an interactive physically-based game.
We implemented most of the algorithms ourselves. We used OpenCV library mainly for filtering the image and vector/matrix operations. Our solution consists of several parts as described in detail below.
We first converted the images to gray-scale, smoothed them using cvSmooth to reduce noise, and then used cvAdaptiveThreshold with a window size of 9 to threshold the image and produce a pure black and white binary image. We chose to use this adaptive threshold method because the illumination often changes across the image.
Sample input photos and threshold results
We assume that the objects have closed boundaries and are disjoint. So we implemented an algorithm similar to the “Paint Bucket Tool” in Photoshop to segment the objects. Starting from the top left corner (assuming it is background), we “paint” each white pixel reachable from that pixel as backgound. After that, each unpainted “hole” in the image forms a object. We then paint each object with its assigned ID. For each object, we also create a new rectangle layer and set the alpha channel of the outer area to be 0, which is used later as textures for that object.
Example of objects
We constructed a set of bar filters in 8 different orientations as described in this page. We also added a Laplace of Gaussian filter (using cvLaplace) for the dot pattern. Then for each object, we compute the histogram of its response rates to the filters and match the histogram (using SSD) with a set of predefined pattern histograms.
Example of bar filters
Example of response rates
Shape Fitting
We computed the gradient orientation and magnitude of the objects by using cvSobel on the image with background colored black and each object white. Then we computed a histogram of gradient orientation for each object, counting only the points with strong magnitude and sorting those points into the bins. If the orientation distribution is uniform, we fit it to a circle. If not, we find its dominant orientations (local maximums, each should correspond to an edge), and fit a line for each edge using RANSAC. We then compute the intersection between adjacent lines and use these points and the object’s vertices. This method works well for convex shapes but cannot handle complex non-convex shapes.
We used Chipmunk Physics Engine. The following properties are determined for each object:
We have achieved satisfactory results. In particular, our program can correctly segment the objects and fit them into simple geometric shapes. Pattern matching works well for larger objects, although sometimes it fails to recognize patterns of small objects. This can be improved by using filters at different scales and using training patterns rather than just hard coding the expected responses.
The following are Youtube links for three demos.
http://www.youtube.com/watch?v=hAsol3c-fT4
http://www.youtube.com/watch?v=_-ELouuL-pc
http://www.youtube.com/watch?v=i5Kd8wN_yEc
We discussed and figured out details of the algorithms together. Yuandong worked on setting up libraries, code framework, segmentation and physics engine integration. Elena focused on pattern recognition and shape fitting.