The CU30 Project

In the fall semester of 1999, Professor Toby Berger and one of his EE undergraduate students, Aron Rosenberg '02, decided that porting CU30 to GNU/Linux would be an interesting exercise. Shortly afterwards, Andrew Dodd '02, another student of Prof. Berger's, joined the project. Some work was done in the fall semester, but most of the effort took place in Spring 2000. Midway through that semester, Ben Luk, an undergraduate student with detailed knowledge of the Windows implementation, joined the porting effort. Work progressed quickly, with video transmission and reception accomplished for the first time in March.

CU30-L couldn't have been written without the work done by the GNU Project. In addition to being built with many tools developed by GNU, CU30-L relies heavily on the Linux kernel for several of its interfaces.

The CU30 Algorithm

CU30 is the name of a new video conferencing algorithm being developed in the Cornell University Electrical Engineering department. The goal is to deliver a software-only video conferencing solution that runs at 30 frames per second, rather than the 15 frames per second or less that most other video conferencing algorithms achieve when not using hardware compression.

Below is the abstract of the patent on the video method:
A method and system for compressing and decompressing video image data in real time employs thresholding and facsimile-based encoding to eliminate the need for computationally intensive two-dimensional transform-based compression techniques. The method operates first by forming a difference frame which contains only information pertaining to the difference between a current video image frame and a computed approximation of the video image frame. The difference frame is fed to a thresholder which categorises each pixel in the frame as being either in a first set having intensities above or at a preset threshold, or a second set having intensities below a preset threshold. A facsimile-based compression algorithm is then employed to encode the first set of above or at threshold pixel locations. To compress the intensity data for each above or at threshold pixel, a quantizer and lossless encoder are preferably employed, with the quantizer serving to categorize the intensities by groups, and the lossless encoder using conventional coding, such as Huffman coding, to compress the intensity data further. Various techniques may be employed with the embodiments of the invention to adjust the actual amount of compressed data generated by the method and system to accommodate communication lines with different data rate capabilities.

At a very basic level, the CU30 algorithm works like this:

Grab -> Compute Difference Frame -> Rate Control -> Quantize -> Huffman Code -> Network

Image Capture: First, an image is captured by a camera or other source at 30 frames per second (FPS) in YUV 411P format. The app then "consumes the frame" as either an intraframe or an interframe. An intraframe is encoded stand-alone and is the first frame sent out in the stream, whereas an interframe is one that is calculated based on other frames.
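To make the capture format concrete, here is a sketch of the buffer layout for one YUV 411P (planar 4:1:1) frame. The 320x240 resolution is an assumption for illustration; the actual CU30 capture size is not stated here.

```python
# Plane sizes for one YUV 411P frame (planar, chroma subsampled
# 4:1 horizontally). 320x240 is an assumed example resolution.
WIDTH, HEIGHT = 320, 240

y_size = WIDTH * HEIGHT          # full-resolution luma plane
u_size = (WIDTH // 4) * HEIGHT   # one chroma sample per 4 luma pixels
v_size = (WIDTH // 4) * HEIGHT

frame_size = y_size + u_size + v_size
print(frame_size)  # bytes per frame at 8 bits per sample
```

At 30 FPS, multiplying this per-frame size by 30 gives the uncompressed data rate the encoder has to reduce.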

Intraframe Encoding: A delta (difference) frame is computed from the current frame only. For the first line in the frame, the difference in intensities between each pixel and the previous pixel is computed using a Bush-Hog curve method. For every other line, the difference between pixels on that line and the corresponding pixels on the previous line is computed, again using the Bush-Hog method.
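The spatial difference pass described above can be sketched as follows. Plain subtraction is used here as a stand-in for the Bush-Hog curve method, whose details are not given in this document.

```python
def spatial_delta(frame):
    """Sketch of a stand-alone (spatial) difference frame.

    frame: list of rows of pixel intensities. Uses plain
    subtraction as a placeholder for the Bush-Hog method.
    """
    delta = []
    # First line: difference against the previous pixel on the line.
    first = [frame[0][0]] + [
        frame[0][x] - frame[0][x - 1] for x in range(1, len(frame[0]))
    ]
    delta.append(first)
    # Remaining lines: difference against the line above.
    for y in range(1, len(frame)):
        delta.append([frame[y][x] - frame[y - 1][x]
                      for x in range(len(frame[y]))])
    return delta
```

Because each value depends only on pixels earlier in the same frame, this pass needs no reference frame, which is what makes the result decodable stand-alone.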

Interframe Encoding: A difference frame is computed by taking the difference of intensities at each pixel between the current frame and the previous frame. This frame is represented as a bi-level frame and a corresponding set of intensity vectors.
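The temporal difference pass can be sketched as a pixel-by-pixel subtraction. Note that per the patent abstract, the real encoder differences against a computed approximation of the previous frame rather than the raw previous frame; plain subtraction is used here for illustration.

```python
def temporal_delta(current, previous):
    """Sketch: difference of intensities between the current frame
    and the previous one, pixel by pixel. The actual encoder uses a
    computed approximation of the previous frame as the reference."""
    return [[c - p for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(current, previous)]
```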

Encoding: All pixels below either a closed-loop rate-control threshold or a perceptual threshold curve are set to 0, and all pixels at or above the threshold are quantized. In this stage, the delta frame is represented by a sequence of symbols, each symbol representing either a runblock (a run of consecutive 0s) or the quantized intensity of a pixel.
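The thresholding and symbol-generation step above can be sketched like this. The threshold and quantizer step values are illustrative only; the real encoder derives the threshold from closed-loop rate control and a perceptual curve.

```python
def symbolize(delta_pixels, threshold=8, step=16):
    """Sketch: suppress sub-threshold pixels to 0, collapse runs of
    zeros into RUN symbols, and quantize surviving intensities into
    coarse groups. Threshold/step values are made up for illustration."""
    symbols, run = [], 0
    for p in delta_pixels:
        if abs(p) < threshold:
            run += 1                        # pixel suppressed to 0
        else:
            if run:
                symbols.append(("RUN", run))
                run = 0
            symbols.append(("VAL", p // step))  # quantized intensity
    if run:
        symbols.append(("RUN", run))
    return symbols
```

The resulting mixed run/intensity symbol stream is what gets handed to the Huffman encoder described next.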

Huffman: The resulting sequence is compressed by a Huffman encoder, so that the average length of the bit stream representing the sequence is minimized. Here is a more detailed look at what the Huffman coder is doing and why it is so special in CU30. The following is the abstract from the patent.
Two software-only prefix encoding techniques employ encoding look-up tables to produce contributions to the encoded bit stream that are incremented in integral numbers of bytes to facilitate accelerated encoding rates at the expense of an acceptable trade-off in increased memory size requirements. The first technique, referred to as offset-based encoding, employs encoding tables which eliminate most of the bit-based operations that need to be performed by a prefix encoder without inordinately expanding memory requirements. In offset-based encoding, a Huffman table is employed which contains information for each number of bits by which the length of a Huffman word is offset from an integral number of bytes. The encoding method generates bytes of encoded data, even though the Huffman code has variable length code words for each symbol to be encoded. The second technique, referred to as byte-based encoding, employs a byte-based Huffman encoding table which operates even faster than the offset-based encoding scheme because it does not employ any bit-based operations at all; however, this is achieved at the expense of a considerable expansion in memory requirements.
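For contrast with the patented schemes, here is a sketch of the bit manipulation they are designed to avoid: a conventional Huffman encoder shifts each variable-length code word into a bit accumulator and emits completed bytes. Offset-based encoding replaces these shifts with table lookups indexed by the current bit offset, and byte-based encoding precomputes whole output bytes. The tiny code table below is made up for illustration.

```python
# Hypothetical prefix code table: symbol -> (bit string, bit length).
CODES = {"A": ("0", 1), "B": ("10", 2), "C": ("11", 2)}

def huffman_encode(symbols):
    """Sketch of a conventional bit-accumulator Huffman encoder;
    the patented offset/byte-based schemes avoid these per-bit
    shifts by using larger precomputed lookup tables."""
    acc, nbits, out = 0, 0, bytearray()
    for s in symbols:
        bits, length = CODES[s]
        acc = (acc << length) | int(bits, 2)
        nbits += length
        while nbits >= 8:                  # emit each completed byte
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
    if nbits:                              # zero-pad the final partial byte
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)
```

The trade-off the abstract describes is exactly this inner loop: eliminating the shift-and-mask work per code word in exchange for larger encoding tables.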

The CU30 project, based out of the DISCOVER Lab, has been funded by several sources, including the National Science Foundation under grants 9632266 and 9980616. The DISCOVER Lab would also like to thank Symbol Technology for its donation of wireless 802.11b equipment.

Update! - In 2002, the members of the CU30 project formed a company called SightSpeed to commercialize the technology.

Keep in mind that this method is patented. The two US patents that cover it are 5,973,626 and 5,740,278.