Fun with Filters and Frequencies!

UC Berkeley
CS 180 Fall 2024 Project 2

Project Overview

In this project I implemented various filters, hybrid images, Gaussian and Laplacian stacks, and multi-resolution blending.

1.1: Finite Difference Operator

I convolved the original cameraman greyscale image with the finite difference operators to get the partial derivatives, computed their magnitude (L2 norm), and used the NCC score to find the cutoff value at which the binarized edge image is most similar to the one produced by the Canny edge detector. The optimal cutoff norm turns out to be 0.231 for this image. By default, I use the "same" mode for 2D convolution and the "symmetric" boundary condition (i.e., padding). Note that convolving with Dx finds vertical edges and convolving with Dy finds horizontal edges. Below are visualizations of the pipeline.

Cameraman (Original)
Dx Convolved
Dy Convolved
Gradient Magnitude
Binarized Edge Image
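The steps above can be sketched as follows. This is a simplified version assuming scipy's convolve2d with the settings described above; the helper names are mine, and the 0.231 cutoff comes from the NCC search, not from this toy example.

```python
import numpy as np
from scipy.signal import convolve2d

# Finite difference operators
Dx = np.array([[1.0, -1.0]])    # responds to vertical edges
Dy = np.array([[1.0], [-1.0]])  # responds to horizontal edges

def gradient_magnitude(img):
    """Convolve with Dx and Dy, then take the per-pixel L2 norm."""
    gx = convolve2d(img, Dx, mode="same", boundary="symm")
    gy = convolve2d(img, Dy, mode="same", boundary="symm")
    return np.sqrt(gx**2 + gy**2)

def binarize(mag, cutoff=0.231):  # cutoff found via NCC against Canny
    return (mag >= cutoff).astype(np.uint8)

# Toy example: a vertical step edge is picked up by Dx
img = np.zeros((5, 5))
img[:, 2:] = 1.0
edges = binarize(gradient_magnitude(img))
```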

1.2: Derivative of Gaussian (DoG) Filter

A two-step approach to reducing the noise in the edge image is to first blur the image with a Gaussian filter (I use kernel size = 5, sigma = 1) and then run the same procedure as in 1.1. As before, I use NCC to find the optimal cutoff, which is 0.093 for the blurred image. Below are visualizations of the pipeline. With the blurring, there is much less noise in the final edge image, likely because the noise lives in the high-frequency components of the image, which the Gaussian kernel filters out.

Cameraman (Blurred)
Dx Convolved (After Blurring)
Dy Convolved (After Blurring)
Gradient Magnitude (DoG two-step)
Binarized Edge Image (DoG two-step)
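A sketch of the two-step pipeline. The Gaussian kernel is built as the outer product of two 1D Gaussians; the exact kernel construction in my code may differ slightly, but the structure is the same.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize=5, sigma=1.0):
    """2D Gaussian as the outer product of two normalized 1D Gaussians."""
    ax = np.arange(ksize) - (ksize - 1) / 2
    g1d = np.exp(-(ax**2) / (2 * sigma**2))
    g1d /= g1d.sum()
    return np.outer(g1d, g1d)

def dog_two_step(img, cutoff=0.093):
    """Blur first, then run the same finite-difference pipeline as in 1.1."""
    blurred = convolve2d(img, gaussian_kernel(), mode="same", boundary="symm")
    gx = convolve2d(blurred, np.array([[1.0, -1.0]]), mode="same", boundary="symm")
    gy = convolve2d(blurred, np.array([[1.0], [-1.0]]), mode="same", boundary="symm")
    mag = np.sqrt(gx**2 + gy**2)
    return (mag >= cutoff).astype(np.uint8)

# Toy example: a step edge still survives the blur
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = dog_two_step(img)
```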

Alternatively, we can build the DoG kernels directly by convolving the Gaussian kernel with Dx and Dy, respectively. Below are the results of this single-step approach. Note that they are very similar to the two-step results (with only very minor differences near the boundaries). I again use 0.093 as the cutoff norm.

DoG Dx Convolved
DoG Dy Convolved
Gradient Magnitude (DoG one-step)
Binarized Edge Image (DoG one-step)
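The equivalence of the two approaches follows from the associativity of convolution: (img * G) * Dx == img * (G * Dx). A sketch of that check, using full-size convolutions to sidestep the boundary effects mentioned above (the figures use "same" mode, which is where the minor differences come from):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize=5, sigma=1.0):
    ax = np.arange(ksize) - (ksize - 1) / 2
    g1d = np.exp(-(ax**2) / (2 * sigma**2))
    g1d /= g1d.sum()
    return np.outer(g1d, g1d)

G = gaussian_kernel()
Dx = np.array([[1.0, -1.0]])

# One-step DoG filter: convolve the Gaussian kernel itself with Dx
DoG_x = convolve2d(G, Dx)  # default "full" mode -> a 5x6 kernel

# Associativity check on a random image
rng = np.random.default_rng(0)
img = rng.random((32, 32))
two_step = convolve2d(convolve2d(img, G), Dx)
one_step = convolve2d(img, DoG_x)
```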

2.1: Image "Sharpening"

Let I be the original image and let G be I convolved with a 2D Gaussian filter (again, kernel_size = 5, sigma = 1). The sharpened image is then I + a * (I - G). This is equivalent to convolving I with the single kernel (1 + a) * e - a * KG, where KG is the Gaussian kernel and e is the identity (unit impulse) kernel, which is 1 at the center and 0 everywhere else. Here I choose a = 1. Below are visualizations of the pipeline and comparisons between the original and sharpened images.

Taj (Original)
Taj (Sharpened)
Absolute Difference
Bench (Original)
Bench (Sharpened)
Absolute Difference
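The unsharp-mask identity above can be sanity-checked in code. A sketch (assuming scipy; the identity-kernel trick is exactly the single-convolution form described in the text):

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(ksize=5, sigma=1.0):
    ax = np.arange(ksize) - (ksize - 1) / 2
    g1d = np.exp(-(ax**2) / (2 * sigma**2))
    g1d /= g1d.sum()
    return np.outer(g1d, g1d)

def sharpen(img, a=1.0, ksize=5, sigma=1.0):
    """Sharpen via the single 'unsharp mask' kernel (1 + a) * e - a * KG."""
    KG = gaussian_kernel(ksize, sigma)
    e = np.zeros_like(KG)
    e[ksize // 2, ksize // 2] = 1.0  # identity (unit impulse) kernel
    return convolve2d(img, (1 + a) * e - a * KG, mode="same", boundary="symm")

# Compare against the two-step formulation I + a * (I - G)
rng = np.random.default_rng(1)
img = rng.random((16, 16))
blurred = convolve2d(img, gaussian_kernel(), mode="same", boundary="symm")
direct = img + 1.0 * (img - blurred)
```

By linearity of convolution, the two formulations agree pixel-for-pixel.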

For the next image, I first blur it with a Gaussian filter (kernel_size = 5, sigma = 1) and then resharpen it with a = 1. Note that the resharpened image is close to, but still a bit different from, the original (blurring discards high-frequency information that sharpening cannot recover). The Absolute Difference image shows the difference between the original and the resharpened image.

Shore (Original)
Shore (Blurred)
Shore (Resharpened)
Absolute Difference

2.2: Hybrid Images (with Bells and Whistles)

After aligning a pair of images by matching two points on each, I use a Gaussian filter to extract the low-frequency components of the first image and the high-frequency components of the second (by subtracting the low-pass filtered, i.e., Gaussian filtered, version from the original). I then normalize the low- and high-frequency images and add them together, weighing them with a tunable ratio. Below are visualizations of the pipeline for Derek & Nutmeg (kernel_size = 10, sigma_low_pass = 1, sigma_high_pass = 10, ratio = 1; see code for the detailed operation).

Derek
Nutmeg
Hybrid (Greyscale)
Hybrid (Colorized)
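A simplified sketch of the combination step (after alignment), assuming scipy's gaussian_filter; my actual code parameterizes the kernel size separately, and the normalization details may differ.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(im_low, im_high, sigma_low=1.0, sigma_high=10.0, ratio=1.0):
    """Low frequencies from im_low plus ratio-weighted high frequencies from im_high."""
    low = gaussian_filter(im_low, sigma_low)
    high = im_high - gaussian_filter(im_high, sigma_high)  # subtract the low-pass
    out = low + ratio * high
    # normalize to [0, 1] for display
    return (out - out.min()) / (out.max() - out.min())

rng = np.random.default_rng(2)
out = hybrid(rng.random((32, 32)), rng.random((32, 32)))
```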

For Derek & Nutmeg, I also show below the original, filtered, and hybrid images in the Fourier domain.

Original
Filtered (Derek is low-pass filtered and Nutmeg is high-pass filtered)
Hybrid
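The Fourier-domain images use the standard log-magnitude visualization. A sketch (the small epsilon is my addition, guarding against log(0)):

```python
import numpy as np

def fourier_vis(img):
    """Log magnitude of the centered 2D FFT."""
    return np.log(np.abs(np.fft.fftshift(np.fft.fft2(img))) + 1e-8)

# A constant image concentrates all its energy at the (shifted) DC bin
spec = fourier_vis(np.ones((8, 8)))
```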


Next is the hybrid image between Cooper and Brand from Interstellar (kernel_size = 20, sigma_low_pass = 1, sigma_high_pass = 1, ratio = 0.7). Hang in there, Dr. Brand! Cooper is coming for you!

Cooper
Brand
Hybrid (Greyscale)
Hybrid (Colorized)


And next is the hybrid image between "this is fine" and "this is not fine" from my favourite meme (kernel_size = 20, sigma_low_pass = 2.5, sigma_high_pass = 1, ratio = 1.5). I guess it's fine, isn't it? WARNING: CONTENT MIGHT BE DISTURBING TO SOME VIEWERS.

Fine
Not Fine
Hybrid (Greyscale)
Hybrid (Colorized)


Failed example: the letters do not contain much high-frequency information, so the high-pass component contributes little to the hybrid.

Love & Peace
Peace & Love
Failure


I have shown both the greyscale and the colorized versions of the hybrid images. We can see that the colorized image is a bit off. This is because the white balance of the high-frequency component is very off. For example, the low-pass filtered image of Derek has accurate white balance, yet the high-pass filtered image of Nutmeg does not (most of it is just black, and where there is color it is badly misaligned with the overall color of the original image). Thus, it works better to use color for the low-frequency component but not the high-frequency component.

Colorized Low-pass Filtered Derek
Colorized High-pass Filtered Nutmeg

2.3: Gaussian and Laplacian Stacks

A Gaussian stack is obtained by iteratively convolving an image with a Gaussian kernel. A Laplacian stack is just the difference between adjacent blurred images in the Gaussian stack (the final image in the Laplacian stack is simply copied from the final image in the Gaussian stack). From first to last, the Laplacian stack's images capture high-frequency to low-frequency information about the image. By summing everything in the Laplacian stack, we can reconstruct the original image. Below are visualizations of the 6 levels of the Laplacian stacks applied to the images Apple and Orange, with kernel_size = 30, sigma = 30.

Apple (Original)
Orange (Original)
Laplacian stacks; from left to right are levels 0 through 5, top row is Apple, bottom row is Orange
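A sketch of the two stacks (note: unlike pyramids, there is no downsampling, so every level keeps the full resolution). The sigma here is illustrative; the figures use kernel_size = 30, sigma = 30.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(img, levels=6, sigma=2.0):
    """Repeatedly blur the image; no downsampling between levels."""
    stack = [img]
    for _ in range(levels - 1):
        stack.append(gaussian_filter(stack[-1], sigma))
    return stack

def laplacian_stack(img, levels=6, sigma=2.0):
    """Differences of adjacent Gaussian levels; the last level is copied."""
    g = gaussian_stack(img, levels, sigma)
    return [g[i] - g[i + 1] for i in range(levels - 1)] + [g[-1]]

rng = np.random.default_rng(3)
img = rng.random((16, 16))
```

Summing the Laplacian stack telescopes back to the original image exactly, which is what makes the reconstruction claim above work.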

2.4: Multiresolution Blending (with Bells and Whistles)

By applying the Gaussian stack of a binary mask to the Laplacian stacks of two images (and then summing the blended Laplacian levels), we can use multiresolution blending to combine the two images with a smooth transition. The pipeline is visualized below. For the Oraple, a more detailed illustration is also included. We observe that using color works fine and makes the result look better (bells & whistles).

Oraple (Blended Image; stack_depth = 30; for images kernel_size = 10, sigma = 50; for the mask kernel_size = 30, sigma = 100)
Illustration of the Oraple; left to right: the first 6 images of the Laplacian (for Apple and Orange) or Gaussian (for the binary mask) stacks; top row is the mask, second is Apple, third is Orange, and the bottom row is the blended image at that level. The mask is a simple vertical 50-50 binary mask.


Next is my ice cream and Remy from Ratatouille.

Remy
Icecream
Mask
Remycream (Blended Image; stack_depth = 30; for images kernel_size = 10, sigma = 50; for the mask kernel_size = 30, sigma = 200)


Next is the Pizza from Lucia's Berkeley and Pizzeria da Laura (I also recommend Rose Pizzeria!). Who says there can only be one topping style?

Pizza from Pizzeria da Laura
Pizza from Lucia's Berkeley
Mask
Pizza Blended (Blended Image; stack_depth = 30; for images kernel_size = 10, sigma = 50; for the mask kernel_size = 50, sigma = 200)


Finally it's me and my robot friend! Plenty of slaves for my robot colony (quote from Interstellar)!

Me at Cory 105, Cut Out
Me with My Robot Friend
Mask
I'm the Machine Spirit, Praise the Omnissiah (Blended Image; stack_depth = 30; for images kernel_size = 10, sigma = 50; for the mask kernel_size = 30, sigma = 200)