In this project I implemented various filters, hybrid images, Gaussian and Laplacian stacks, and multi-resolution blending.
I convolved the original greyscale cameraman image with the finite difference operators to get the partial derivatives, calculated their magnitude (L2 norm), and used the NCC score to find the best cutoff value such that the binarized edge image is most similar to the one generated by the Canny edge detector. It turns out that the optimal cutoff norm is 0.231 for this image. By default, I use the "same" mode for 2D convolution and the "symmetric" boundary condition (i.e., padding). Note that convolving with Dx finds vertical edges and convolving with Dy finds horizontal edges. Below are the visualizations of the pipeline.
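The derivative-and-threshold step above can be sketched as follows (a minimal sketch using scipy; the function name `edge_image` and the toy input are my own, not from my actual code):

```python
import numpy as np
from scipy.signal import convolve2d

def edge_image(im, cutoff):
    """Binarize the gradient magnitude of a greyscale image.

    `im` is a 2D float array; `cutoff` is the threshold on the
    L2 norm of the partial derivatives (0.231 in the writeup).
    """
    Dx = np.array([[1, -1]])   # finite difference operators
    Dy = np.array([[1], [-1]])
    # "same" output size with symmetric padding, as described above
    dx = convolve2d(im, Dx, mode="same", boundary="symm")
    dy = convolve2d(im, Dy, mode="same", boundary="symm")
    mag = np.sqrt(dx ** 2 + dy ** 2)   # gradient magnitude (L2 norm)
    return (mag >= cutoff).astype(float)

# a vertical step edge is picked up by the Dx response
im = np.zeros((8, 8))
im[:, 4:] = 1.0
edges = edge_image(im, 0.5)
```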
A two-step approach to reducing the noise in the edge image is to first blur the image with a Gaussian filter (I use kernel size = 5, sigma = 1) and then go through the same procedure as in 1.1. As before, I use NCC to find the optimal cutoff, which is 0.093 for the blurred image. Below are the visualizations of the pipeline. We can see that with the blurring there is much less noise in the final edge image (probably because the noise comes from high-frequency components of the image, which are filtered out by the Gaussian kernel).
Alternatively, we can obtain the DoG kernels directly by convolving the Gaussian kernel with Dx and Dy, respectively. Below are the results from this single-step approach. Note that the results are very similar to those of the two-step approach (with very minor differences near the boundaries). I again use 0.093 as the cutoff norm value.
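The reason the two approaches agree is that convolution is associative: blurring and then differentiating equals convolving once with a derivative-of-Gaussian kernel. A sketch of building the DoG kernels (the helper `gaussian_kernel` is my own; "full" mode is used so associativity holds exactly):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize, sigma):
    """Separable 2D Gaussian kernel, normalized to sum to 1."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-ax ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    return np.outer(g, g)

Dx = np.array([[1, -1]])
Dy = np.array([[1], [-1]])
G = gaussian_kernel(5, 1)

# DoG filters: convolve the Gaussian with the difference operators
DoG_x = convolve2d(G, Dx)   # default "full" mode keeps the whole kernel
DoG_y = convolve2d(G, Dy)
```

By associativity, `convolve2d(convolve2d(im, G), Dx)` and `convolve2d(im, DoG_x)` produce the same response (up to floating-point error), which is why the one-step results match the two-step ones.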
Let I be the original image and let G be I convolved with a 2D Gaussian filter (again, I choose kernel_size = 5, sigma = 1). Then the sharpened image is I + a * (I - G). This is equivalent to convolving with (1 + a) * E - a * KG, where KG is the Gaussian kernel and E is the identity kernel (only the center is 1, everywhere else is 0). Here I choose a = 1. Below are the visualizations of the pipeline and the comparison between the original and the sharpened image.
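The single-kernel formulation can be sketched as follows (function names are my own). Note that the unsharp kernel sums to (1 + a) - a = 1, so constant regions are left unchanged:

```python
import numpy as np
from scipy.signal import convolve2d

def unsharp_kernel(ksize, sigma, a):
    """(1 + a) * identity - a * Gaussian, per the formula above."""
    ax = np.arange(ksize) - (ksize - 1) / 2.0
    g = np.exp(-ax ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    KG = np.outer(g, g)                      # Gaussian kernel
    E = np.zeros_like(KG)
    E[ksize // 2, ksize // 2] = 1.0          # identity (unit impulse) kernel
    return (1 + a) * E - a * KG

def sharpen(im, ksize=5, sigma=1, a=1):
    K = unsharp_kernel(ksize, sigma, a)
    return convolve2d(im, K, mode="same", boundary="symm")
```

Because convolution is linear in the kernel, this one-pass filter gives exactly I + a * (I - G).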
For the next image, I first blur it with a Gaussian filter (kernel_size = 5, sigma = 1) and then resharpen it with a = 1. Note that the resharpened image is close to, but still a bit different from, the original image (since blurring loses high-frequency information). The absolute-difference image shows the difference between the original and the resharpened image.
After aligning a pair of images by matching two points on each, I use a Gaussian filter to extract the low-frequency components of the first image and the high-frequency components of the second image (by subtracting the low-pass-filtered, i.e., Gaussian-filtered, version from the original image). Afterwards I normalize the low- and high-frequency images and add them together (weighing them differently using a tunable ratio). Below are the visualizations of the pipeline for Derek & Nutmeg (kernel_size = 10, sigma_low_pass = 1, sigma_high_pass = 10, ratio = 1; see code for detailed operation).
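The hybrid step can be sketched as follows. This is a minimal sketch: it uses `scipy.ndimage.gaussian_filter` (which takes only a sigma) in place of my explicit kernel-size parameter, and the min-max normalization here is one reasonable choice rather than necessarily the exact one in my code:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(im1, im2, sigma_low=1.0, sigma_high=10.0, ratio=1.0):
    """Low frequencies of im1 + high frequencies of im2."""
    low = gaussian_filter(im1, sigma_low)            # low-pass of image 1
    high = im2 - gaussian_filter(im2, sigma_high)    # high-pass of image 2

    def norm(x):                                     # rescale to [0, 1]
        return (x - x.min()) / (x.max() - x.min() + 1e-8)

    # weigh the high-frequency component by the tunable ratio
    return norm(norm(low) + ratio * norm(high))
```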
For Derek & Nutmeg, I also show below the original, filtered, and hybrid images in the Fourier domain.
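The Fourier-domain visualizations are the log magnitude of the centered 2D FFT, a standard way to display an image's spectrum (the small epsilon avoids log of zero):

```python
import numpy as np

def log_magnitude_spectrum(gray):
    """Log-magnitude of the centered 2D FFT of a greyscale image."""
    return np.log(np.abs(np.fft.fftshift(np.fft.fft2(gray))) + 1e-8)
```

For a low-pass-filtered image the energy concentrates near the center (DC) of this plot, while the high-pass-filtered image keeps only the outer regions.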
Next is the hybrid image between Cooper and Brand from Interstellar (kernel_size = 20, sigma_low_pass = 1, sigma_high_pass = 1, ratio = 0.7). Hang in there, Dr. Brand! Cooper is coming for you!
And next is the hybrid image between "this is fine" and "this is not fine" from my favourite meme (kernel_size = 20, sigma_low_pass = 2.5, sigma_high_pass = 1, ratio = 1.5). I guess it's fine, isn't it? WARNING: CONTENT MIGHT BE DISTURBING TO SOME VIEWERS.
Failed example: this pair fails because the letters do not contain much high-frequency information.
I have shown both the greyscale and the colorized versions of the hybrid images. We can see that the colorized image is a bit off. This is because in the high-frequency regime the white balancing is very off. For example, the low-pass-filtered image of Derek is accurate in terms of white balancing, yet the high-pass-filtered image of Nutmeg is very off (most of it is just black, and where there is color it is badly misaligned with the overall color of the original image). Thus, it works better to use color for the low-frequency component but not the high-frequency component.
A Gaussian stack is obtained by iteratively convolving an image with a Gaussian kernel. A Laplacian stack is just the difference between adjacent blurred images in the Gaussian stack (the final image in the Laplacian stack is simply copied from the final image in the Gaussian stack). From first to last, the Laplacian stack's images contain high-frequency to low-frequency information about the image. By adding up everything in the Laplacian stack, we can reconstruct the original image. Below are the visualizations of the 6 levels of the Laplacian stacks applied to the images Apple and Orange, where kernel_size = 30, sigma = 30.
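The two stacks can be sketched as follows (no downsampling, unlike a pyramid; `gaussian_filter` stands in for my explicit kernel, and the level count and sigma are placeholders). The reconstruction property holds because the sum of the Laplacian stack telescopes back to the first Gaussian level, i.e., the original image:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(im, levels=6, sigma=2.0):
    """Repeatedly blur the image; resolution stays the same."""
    stack = [im]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(im, levels=6, sigma=2.0):
    g = gaussian_stack(im, levels, sigma)
    # band-pass differences, plus the final low-pass residual
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]
```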
Applying the Gaussian stack of a binary mask to the Laplacian stacks of two images (and afterwards adding the Laplacians up), we can use multiresolution blending to blend the two images, resulting in a smooth transition. The pipeline is visualized below. For the Oraple, a more detailed illustration figure is also included. We observe that using color works fine and makes the result look better (bells & whistles).
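The blending step can be sketched as follows (a self-contained sketch; the level count and sigma are placeholders for my actual parameters). At each level, the blurred mask weights one image's Laplacian band against the other's, and summing the blended bands gives the final image:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend(im1, im2, mask, levels=6, sigma=2.0):
    """Multiresolution blending: combine the Laplacian stacks of two
    images using the Gaussian stack of a (0/1) mask."""
    def gstack(x):
        s = [x]
        for _ in range(levels - 1):
            s.append(gaussian_filter(s[-1], sigma))
        return s

    def lstack(x):
        g = gstack(x)
        return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

    gm = gstack(mask)                  # progressively softer mask
    l1, l2 = lstack(im1), lstack(im2)  # band-pass stacks of both images
    # blend each band, then sum the bands back into one image
    return sum(m * a + (1 - m) * b for m, a, b in zip(gm, l1, l2))
```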
Next is my ice cream and Remy from Ratatouille.
Next is the Pizza from Lucia's Berkeley and Pizzeria da Laura (I also recommend Rose Pizzeria!). Who says there can only be one topping style?
Finally it's me and my robot friend! Plenty of slaves for my robot colony (quote from Interstellar)!