Introduction to Computer Vision (CS4670)
Related Work
Surprisingly, the amount of literature dealing with image enhancement using a combination of flash and no-flash image pairs is not substantial, and we referred to only two papers during the implementation of this project.
To get started and outline a basic algorithm for the entire process, we referred to a paper by Eisemann and Durand [2] titled "Flash Photography Enhancement via Intrinsic Relighting". This paper provided all the basic information required for our project and also dealt with more advanced decoupling and recombination of images, which we did not implement. We also found the paper "Digital Photography with Flash and No-Flash Image Pairs" by Agrawala et al. [1] to be a very good resource on what we are attempting. That paper also included several techniques for enhancing the quality of the final result, such as white balancing and red-eye correction, which we did not attempt due to time constraints.
Our Approach and Algorithm
Input Image Capture
The input for our program is a flash and no-flash image pair capturing the same scene. To achieve good results, both images have to be aligned perfectly, and we had two options for aligning the pair. We could either align the two images by performing some sort of feature matching on the captured images, or we could ensure that the images are accurately aligned by getting the N900 phone to take the two required pictures in quick succession on a single press of the camera button. We chose the latter approach, which was achieved with the help of code available at http://fcam.garage.maemo.org/. The phone settings were tweaked so that a single click of the camera button quickly captures two images: the first with the flash on and the second with the flash off. The exposures of both images were set to be the same, and the near-instantaneous capture of the two images ensured that realignment was unnecessary.
Main Flow of the Algorithm
Decoupling of Images
The first step is to decouple both images into intensity and color. The color layer is the original pixel values divided by the intensity. We then decouple the intensity images of both images into layers corresponding to detail and illumination. To achieve this we use a bilateral filter whose weights depend on a Gaussian f on the spatial location and a Gaussian g on the pixel intensity difference. For an input image I, the output J of the bilateral filter for a pixel s is given by:

J(s) = (1 / k(s)) * Σ_p f(p − s) g(I(p) − I(s)) I(p)

where k(s) = Σ_p f(p − s) g(I(p) − I(s)) is a normalization term. We use a spatial variance of 1.5% of the image's diagonal and an intensity variance of 0.4.
Like a Gaussian filter, a bilateral filter smooths an image; however, it differs in that it retains sharp edges, thus avoiding halos around strong edges.
We implemented our own bilateral filter, but when running on the phone we found OpenCV's bilateral filter to be much faster, so we switched to the built-in OpenCV function. The bilateral filter is applied to the intensity images of both the flash and no-flash images. The output of the bilateral filter is a large-scale image, and from this image together with the intensity image we obtain detail images for both the flash and no-flash images.
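The bilateral filter described above can be sketched in a few lines. This is a minimal pure-Python illustration on a 2-D list of intensities, not our phone implementation (which uses OpenCV's built-in filter); the sigma values and image are chosen only for the example.

```python
import math

def bilateral_filter(img, sigma_s, sigma_r):
    """Minimal bilateral filter sketch on a 2D list of intensities in [0, 1].

    Each output pixel is a weighted average of its neighbors, where the
    weight combines a spatial Gaussian f (pixel distance) and an intensity
    Gaussian g (pixel value difference), so flat regions are smoothed
    while strong edges are preserved.
    """
    h, w = len(img), len(img[0])
    radius = int(2 * sigma_s)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num, k = 0.0, 0.0  # weighted sum and normalization term k(s)
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    py, px = y + dy, x + dx
                    if 0 <= py < h and 0 <= px < w:
                        f = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                        g = math.exp(-((img[py][px] - img[y][x]) ** 2)
                                     / (2 * sigma_r ** 2))
                        num += f * g * img[py][px]
                        k += f * g
            out[y][x] = num / k
    return out

# A step edge survives filtering: the two sides barely mix because the
# intensity Gaussian g assigns near-zero weight across the edge.
edge = [[0.1] * 4 + [0.9] * 4 for _ in range(8)]
smoothed = bilateral_filter(edge, sigma_s=1.0, sigma_r=0.1)
```

With a plain Gaussian the edge would blur toward 0.5; here both sides stay essentially at their original values.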
The only images of relevance to us are the large-scale image of the no-flash image and the color and detail images of the flash image (along with the two input images, of course). The color and detail images of the no-flash image and the large-scale image of the flash image are discarded. The detail and color layers of the flash image are used because they are sharper and their white balance is more reliable. The large-scale image of the no-flash image is used to retain the tone of the original ambient lighting. If we recombine the layers now, we get a very naive recombination: the resulting image will still contain all the extra shadows that the flash produces. So the next essential step is shadow detection.
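Per pixel, the decoupling and naive recombination above look as follows. This is an illustrative sketch; the average-of-channels intensity is an assumption (the exact intensity formula is not specified in this report), and real code would operate on whole image arrays.

```python
def decouple(rgb):
    """Split one RGB pixel (values > 0) into intensity and color layers.
    Intensity here is the channel average (one common choice); the color
    layer is the pixel divided by its intensity."""
    r, g, b = rgb
    intensity = (r + g + b) / 3.0
    color = (r / intensity, g / intensity, b / intensity)
    return intensity, color

def recombine(largescale_noflash, detail_flash, color_flash):
    """Naive recombination: the no-flash large-scale layer carries the
    ambient tone, the flash detail and color layers carry sharpness and
    reliable white balance."""
    i = largescale_noflash * detail_flash
    return tuple(i * c for c in color_flash)

# Round trip on a single pixel: intensity * color recovers the original.
intensity, color = decouple((0.6, 0.3, 0.3))
restored = tuple(intensity * c for c in color)
```

Because the decoupling is multiplicative, intensity times color exactly reconstructs the input pixel, which is what makes swapping layers between the two images possible.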
Shadow Detection
As mentioned earlier, the flash produces extra shadows which are not part of the original scene. So, these shadows have to be isolated to achieve good results.
There are two types of shadows: umbra and penumbra. The umbra corresponds to shadow regions that are completely black, while the penumbra corresponds to regions that are only partially dark. Different techniques are required to detect each type.
1. Umbra Detection
The difference image between the flash and no-flash images is calculated; ideally this should tell us exactly how much light each pixel received from the flash. However, due to indirect lighting, shadows do not correspond exactly to pixels where the difference is zero. So histogram analysis is performed to determine shadow pixels. We compute the histogram of the difference image using 256 bins and apply a Gaussian blur of variance 2 bins to the histogram. A threshold is set dynamically according to the number of pixels in the input images, and pixels whose difference falls below the threshold are classified as umbra pixels.
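The histogram analysis and thresholding above can be sketched as follows. This is a simplified illustration on a flat list of difference values in [0, 1]; the fixed threshold in the example stands in for the dynamic threshold described above, whose exact selection rule is not reproduced here.

```python
import math

def smoothed_histogram(diff, bins=256, sigma=2.0):
    """Histogram of a flash/no-flash difference image (values in [0, 1]),
    blurred along the bin axis with a 1-D Gaussian of the given sigma
    (in bins) to suppress noise before threshold selection."""
    hist = [0.0] * bins
    for v in diff:
        hist[min(int(v * bins), bins - 1)] += 1
    radius = int(3 * sigma)
    kernel = [math.exp(-i * i / (2 * sigma * sigma))
              for i in range(-radius, radius + 1)]
    out = []
    for i in range(bins):
        num = den = 0.0
        for j, k in enumerate(kernel):
            idx = i + j - radius
            if 0 <= idx < bins:
                num += k * hist[idx]
                den += k
        out.append(num / den)
    return out

def umbra_mask(diff, threshold):
    """Umbra pixels receive almost no flash light, so their flash-minus-
    no-flash difference falls below the threshold."""
    return [v < threshold for v in diff]

diff = [0.02, 0.03, 0.5, 0.6, 0.01, 0.55]   # toy difference values
mask = umbra_mask(diff, threshold=0.1)       # fixed threshold for illustration
```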
2. Penumbra Detection
To detect penumbra pixels, we first calculate the magnitude of the gradient of the flash and no-flash images. To remove noise, we blur using a Gaussian of variance 2 pixels. Candidate penumbra pixels are those where the gradient is stronger in the flash image. From among these candidates we keep only the pixels that are spatially close to the umbra pixels. To do this, we convolve the umbra map with a box filter over a square neighborhood whose size equals 1% of the image's diagonal.
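A minimal sketch of the candidate-plus-proximity test above, on small 2-D lists. The box-filter convolution of the binary umbra map is written here as an equivalent dilation, and the neighborhood radius is a toy value standing in for 1% of the image diagonal.

```python
def box_dilate(mask, radius):
    """Mark every pixel within a square neighborhood of any umbra pixel.
    Equivalent to convolving the binary umbra map with a box filter and
    keeping pixels where the response is nonzero."""
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    py, px = y + dy, x + dx
                    if 0 <= py < h and 0 <= px < w and mask[py][px]:
                        out[y][x] = True
    return out

def penumbra(grad_flash, grad_noflash, umbra, radius=1):
    """Candidates have a stronger gradient magnitude in the flash image;
    keep only candidates near (but not inside) the umbra."""
    near = box_dilate(umbra, radius)
    h, w = len(umbra), len(umbra[0])
    return [[grad_flash[y][x] > grad_noflash[y][x]
             and near[y][x] and not umbra[y][x]
             for x in range(w)] for y in range(h)]

umbra = [[False] * 3 for _ in range(3)]
umbra[1][1] = True                           # single umbra pixel in the center
gf = [[0.9] * 3 for _ in range(3)]           # flash gradient magnitude
gn = [[0.1] * 3 for _ in range(3)]           # no-flash gradient magnitude
pen = penumbra(gf, gn, umbra)
```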
Color Correction
We now have a shadow mask that corresponds to the extra shadows produced in the flash image. The mask is then feathered to soften its edges. To restore color to the shadow region, we experimented with several techniques. A very naive method was to simply fill in the shadow region with color from the original flash image; this produced inaccurate results. The next technique we tried was the inpainting method available in OpenCV, which proved even worse. Finally, we decided to perform a local color correction that simply copies colors from nearby illuminated regions in the flash image.
Using a technique somewhat similar to the bilateral filter, the color of a shadow pixel is computed as a weighted average of its neighbors in the flash image. The weight depends on three terms: in addition to a spatial Gaussian and an intensity-difference Gaussian, there is a binary term that excludes pixels inside the shadow. Our color correction doesn't work as well as it should, but it does give a colored appearance to otherwise colorless shadow pixels. Recombination is now complete.
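The three-term weighted average above can be sketched per shadow pixel as follows. This is an illustrative sketch only; the sigma values are placeholders, not the ones used in our implementation.

```python
import math

def correct_shadow_pixel(y, x, flash, intensity, shadow,
                         sigma_s=2.0, sigma_r=0.2):
    """Color for one shadow pixel: a bilateral-style weighted average of
    nearby flash-image pixels. The weight is a spatial Gaussian times an
    intensity-difference Gaussian times a binary term that excludes
    pixels inside the shadow mask."""
    h, w = len(flash), len(flash[0])
    radius = int(2 * sigma_s)
    num = [0.0, 0.0, 0.0]
    k = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            py, px = y + dy, x + dx
            if 0 <= py < h and 0 <= px < w and not shadow[py][px]:
                f = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                g = math.exp(-((intensity[py][px] - intensity[y][x]) ** 2)
                             / (2 * sigma_r ** 2))
                for c in range(3):
                    num[c] += f * g * flash[py][px][c]
                k += f * g
    return tuple(n / k for n in num) if k > 0 else flash[y][x]

# A shadow pixel surrounded by uniform illuminated neighbors inherits
# their color exactly.
flash = [[(1.0, 0.2, 0.2)] * 3 for _ in range(3)]
intensity = [[0.5] * 3 for _ in range(3)]
shadow = [[False] * 3 for _ in range(3)]
shadow[1][1] = True
corrected = correct_shadow_pixel(1, 1, flash, intensity, shadow)
```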
Results
Fig 1. Test on image pair from the original paper [1]. (Top: Flash image, Middle: No-flash image, Bottom: Result)
Fig 2. Comparison between the color of the flash (left) and no-flash (right) images. While the no-flash image lost most of the puppet's color, the flash image retained relatively rich color.
Fig 3. Close-up comparison between flash (left) and no-flash intensity. The flash intensity is concentrated on the dolls, while the no-flash image retained the overall intensity of the whole scene.
We ran our algorithm on pairs of images downloaded from the original paper's website. The flash image of the antique dolls (Fig 1) was rich in detail and color (Fig 2), but the distant background was barely visible (Fig 1). The no-flash image retained the intensity of the overall scene, including the background. Our combined result was successful in that it retained the background intensity from the no-flash image and the good color from the flash image. However, on this image pair, the shadow was not detected properly.
Fig 4. Another test on an image pair from the original paper [1]. (Top: Flash image, Middle: No-flash image, Bottom: Result)
Fig 5. Flash shadow color correction result. (Left: result without shadow detection, Middle: with shadow detection without color correction, Right: after color correction).
On the second pair of images (Fig 4) from the original paper, the result was successful in capturing color information from the flash image and fairly successful in shadow detection. The flash image had a fairly distinct shadow that is absent from the no-flash image, and our algorithm successfully detected the shadow and replaced those pixels with a weighted sum (using the color correction method described in the previous section) of non-shadow neighbors in the flash image. However, a slight seam is visible at the boundary of the shadow, because the non-shadow neighbors in the original flash image had very high intensity.
Fig 6. Test on a dark scene, run on the phone. (Top: Flash image, Middle: No-flash image, Bottom: Result)
We then ran our algorithm on the phone. The program captures the flash image, captures the no-flash image immediately afterward, and runs the algorithm described in the previous sections. Our results on ordinary, brightly lit scenes were not very good, because such scenes already convey detail and color in the absence of flash. However, our result on a dark scene (Fig 6) was fairly successful. The flash image captured a lot of scene detail, but the headset appeared too bright compared to the keyboard; in contrast, the keyboard is barely visible in the no-flash image. After combination, the details of the keyboard are preserved and the headset no longer appears too bright.
Implementation
· Ashwin Ajit: Image decoupling, Bilateral Filtering, Shadow (Umbra / Penumbra) Detection, Recombination, FCam integration
· Jin Hyuk Cho: Shadow Treatment, Color Correction, Recombination, Histogram Analysis / Thresholding / Blurring histogram
· Note on Bilateral Filtering – the current implementation uses OpenCV's bilateral filter algorithm; however, we also fully implemented the bilateral filter ourselves.
· FlashNoFlash.cpp contains most of the relevant code.
Strengths
· Realignment not necessary, so computation time is greatly reduced
· Ambient color tone is retained in the final image
· High-frequency detail is not lost
· Noise in the no-flash image doesn't affect results
· Extra shadows produced by the flash are detected
Weaknesses
· Shadow detection doesn't work accurately when the no-flash image has higher intensity than the flash image
· Color correction on shadows doesn't produce visually accurate results
Possible Extensions
· The available literature describes further image enhancement by performing white balancing, red-eye removal, etc. These could be explored.
· Another extension we considered was to capture two flash/no-flash image pairs, each at a different exposure. Recombining these four images to produce a sort of HDR flash/no-flash combination might produce very interesting results.
References