Sketch2physics

CS 4670 Final Project Report

Yuandong Zhuang, Zhe Yang

The Problem

We are trying to recognize shapes and texture of some geometric objects in a simple sketch picture and convert the sketch into a physically-based animation, and even extend it into an interactive physically-based game.

Solution

We implemented most of the algorithms ourselves. We used OpenCV library mainly for filtering the image and vector/matrix operations. Our solution consists of several parts as described in detail below.

Pre-Processing

We first converted the images to gray-scale, smoothed them using cvSmooth to reduce noise, and then used cvAdaptiveThreshold with a window size of 9 to threshold the image and produce a pure black and white binary image. We chose to use this adaptive threshold method because the illumination often changes across the image.

Sample input photos and threshold results

Segmentation

We assume that the objects have closed boundaries and are disjoint. So we implemented an algorithm similar to the “Paint Bucket Tool” in Photoshop to segment the objects. Starting from the top left corner (assuming it is background), we “paint” each white pixel reachable from that pixel as backgound. After that, each unpainted “hole” in the image forms a object. We then paint each object with its assigned ID. For each object, we also create a new rectangle layer and set the alpha channel of the outer area to be 0, which is used later as textures for that object.

Example of objects

Pattern Matching

We constructed a set of bar filters in 8 different orientations as described in this page. We also added a Laplace of Gaussian filter (using cvLaplace) for the dot pattern. Then for each object, we compute the histogram of its response rates to the filters and match the histogram (using SSD) with a set of predefined pattern histograms.

Example of bar filters

Example of response rates

Shape Fitting

We computed the gradient orientation and magnitude of the objects by using cvSobel on the image with background colored black and each object white. Then we computed a histogram of gradient orientation for each object, counting only the points with strong magnitude and sorting those points into the bins. If the orientation distribution is uniform, we fit it to a circle. If not, we find its dominant orientations (local maximums, each should correspond to an edge), and fit a line for each edge using RANSAC. We then compute the intersection between adjacent lines and use these points and the object’s vertices. This method works well for convex shapes but cannot handle complex non-convex shapes.

Physics Engine Integration

We used Chipmunk Physics Engine. The following properties are determined for each object:

Movable or fixed: hollow object is movable while textured object remain still.
Geometry: determined from the vertex positions computed by shape fitting.
Texture: used the image layer created after segmentation.
Mass and moment of inertia: estimated as proportional to the “area” of the object.
Fraction and elasticity: determined according to texture pattern.

Result

We have achieved satisfactory results. In particular, our program can correctly segment the objects and fit them into simple geometric shapes. Pattern matching works well for larger objects, although sometimes it fails to recognize patterns of small objects. This can be improved by using filters at different scales and using training patterns rather than just hard coding the expected responses.

The following are Youtube links for three demos.

http://www.youtube.com/watch?v=hAsol3c-fT4

http://www.youtube.com/watch?v=_-ELouuL-pc

http://www.youtube.com/watch?v=i5Kd8wN_yEc

Work Division

We discussed and figured out details of the algorithms together. Yuandong worked on setting up libraries, code framework, segmentation and physics engine integration. Elena focused on pattern recognition and shape fitting.