CS180 Project 1: Colorizing the Prokudin-Gorskii Photo Collection

Author: Nicolas Rault-Wang (nraultwang at berkeley.edu)

Introduction

The Prokudin-Gorskii collection consists of triple negatives: blue, green, and red negatives of the same subject. For example, the triple negative of Emir is shown at right.

In this project, we have created a program to automatically produce colorized photos from the triple-negatives in the Prokudin-Gorskii collection.

Our procedure has three steps, explained further in the following sections.
1. Extract the individual color channels from the original triple negative.
2. Align the structures in these negatives to form a coherent colorized photo.
3. Automatically trim border artifacts from these aligned images.

Colorization Procedure

Step 1: Extracting Color Channel Negatives

We begin by extracting the blue, green, and red channels from the original triple-negative photo.

Each color channel has approximately the same dimensions, so we simply divide the original image into three equally-sized images, vertically padding if the height is not divisible by 3.

The rightmost image shows the output of this step.
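The split described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `split_channels` is hypothetical, the plate is assumed to be a 2-D grayscale array with the negatives stacked top to bottom in blue, green, red order, and the padding mode (`"edge"`) is a guess, since the write-up only says the image is padded vertically.

```python
import numpy as np

def split_channels(plate):
    """Split a vertically stacked triple negative into (blue, green, red).

    Assumes a 2-D array with the negatives in blue, green, red order
    from top to bottom (an assumption, not confirmed by the write-up).
    """
    height = plate.shape[0]
    pad_rows = (-height) % 3  # rows needed to make the height divisible by 3
    if pad_rows:
        # Pad at the bottom only; repeating the last row is an arbitrary choice.
        plate = np.pad(plate, ((0, pad_rows), (0, 0)), mode="edge")
    third = plate.shape[0] // 3
    blue = plate[:third]
    green = plate[third:2 * third]
    red = plate[2 * third:]
    return blue, green, red
```

Because the three pieces have identical shapes, they can be passed straight to the alignment step without further resizing.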

Step 2: Multi-resolution Color Alignment

$k=0:R_0 = 1, S_0 = 1$ . The algorithm finds a displacement vector $\vec r_0^*$ such that $I_1^0 + \vec r_0^*$ and $I_2^0$ are well-aligned, according to the NCCS objective.

$k=6:R_6 = 64, S_6 = 64$ . Lowest-resolution image in the image pyramid. This is the base case for the recursive algorithm.

Problem Description

The pixels in the negatives extracted in Step 1 are not necessarily aligned because there are generally variations in their positions and the border widths between them.

Aligning the negatives is equivalent to finding an $(x,y)$ displacement for each channel so that every pixel position $(i,j)$ in each negative corresponds to the same structure.

My alignment algorithm uses an image pyramid to efficiently compute these offsets:
- The algorithm takes two grayscale color negatives $I_1^0$ and $I_2^0$ of the same shape, then recursively computes the displacement vector $\vec r_0$ that maximizes the objective $NCCS(I_1^0 + \vec r_0, I_2^0)$ . (NCCS is defined below.)
- To visualize this procedure, the surrounding figures show the optimal alignments at each level of the image pyramid.

Alignment Algorithm

Set $R_0 = 1$ and $S_0 = 1$ , where $R_k$ is the downsampling factor and $S_k$ is the maximum displacement in $x$ or $y$ .

For $k\geq 0$ , downsample each of $I_1^0$ and $I_2^0$ by $R_k$ to obtain $I_1^{k}$ and $I_2^k$ , respectively.

If the largest dimension of $I_1^k$ exceeds 64, make a recursive call on $I_1^{k}$ and $I_2^{k}$ with downsampling factor $R_{k+1} \equiv 2R_k$ and search radius $S_{k+1} \equiv 2S_k$ .
1. This recursive call will return a displacement vector $\vec r_{k+1}^*$ such that $I_1^{k+1} + \vec r^*_{k+1}$ and $I_2^{k+1}$ are well aligned.
2. Since $I_1^k$ has twice the dimensions of $I_1^{k+1}$ , a displacement of $\vec r^*_{k+1}$ at level $k+1$ corresponds to $2\vec r^*_{k+1}$ at level $k$ , so $I_1^{k} + 2\vec r^*_{k+1}$ and $I_2^k$ are approximately well aligned.

Next, exhaustively examine all displacements $\vec d_k \in[-S_k, S_k]^2$ to find the vector $\vec d_k^*$ that best aligns $I_1^{k} + 2\vec r^*_{k+1}$ and $I_2^k$ (in the base case, where no recursive call is made, take $\vec r^*_{k+1} = \vec 0$ ):
$\vec d_k^* = \argmax_{\vec d_k \in[-S_k, S_k]^2} NCCS\left((I^k_1 + 2\vec r^*_{k+1}) + \vec d_k,\ I_2^k\right)$

Finally, return the displacement vector $\vec r_k^* \equiv \vec d_k^*+ 2\vec r_{k+1}^*$ .
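The recursion above might look roughly like the following sketch. Several things here are assumptions made for illustration rather than details from this project: the names `ncc`, `shift`, and `align` are hypothetical; the score is plain normalized cross-correlation standing in for NCCS; downsampling is done by taking every second pixel; shifts wrap around the image edges; and the base case starts from a zero coarse estimate.

```python
import numpy as np

def ncc(a, b):
    """Stand-in objective: plain normalized cross-correlation (not NCCS)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.sum(a * b) / denom) if denom else 0.0

def shift(image, r):
    """Translate `image` by r = (dx, dy), wrapping around the edges."""
    dx, dy = r
    return np.roll(image, (dy, dx), axis=(0, 1))

def align(i1, i2, radius=1, min_size=64, score=ncc):
    """Return r = (dx, dy) such that shift(i1, r) best matches i2."""
    coarse = np.zeros(2, dtype=int)  # base case: no estimate from a coarser level
    if max(i1.shape) > min_size:
        # Recurse on half-resolution copies with a doubled search radius.
        r_next = align(i1[::2, ::2], i2[::2, ::2], 2 * radius, min_size, score)
        coarse = 2 * r_next  # a shift of r at level k+1 is 2r at level k
    best_score, best_d = -np.inf, np.zeros(2, dtype=int)
    # Exhaustive search over the window [-radius, radius]^2 around `coarse`.
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            d = np.array([dx, dy])
            s = score(shift(i1, coarse + d), i2)
            if s > best_score:
                best_score, best_d = s, d
    return coarse + best_d
```

Called as `align(channel, reference)`, the outermost call uses search radius 1, matching $S_0 = 1$ , and each recursive level doubles it. Passing a different `score` callable swaps the objective without touching the recursion.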

Defining NCCS

NCCS: the Normalized Cross-Correlation between two Scharr-filtered, center-cropped grayscale images with the same shape.

$NCCS(I_1, I_2) = \left\langle\frac{\Phi(I_1^\prime)}{\|\Phi(I_1^\prime)\|_F},\ \frac{\Phi(I_2^\prime)}{ \|\Phi(I_2^\prime)\|_F}\right\rangle$

Motivation for each feature of this metric:
- NCC: Since different channels have different baselines, it is a good idea to normalize before comparing them. Further, we’d like to ensure the structures between color channels are at the same pixel positions, and maximizing cross-correlation is a good way to do this.
- Edge detection: Most borders of an object should be present in all colors, so we can align the structures in different color channels by aligning their edges. Edge detection filters like Scharr create sparse patterns in each color channel that have large NCC scores when they are properly aligned. This NCC score rapidly falls off when the images are not well aligned, so filtering our color channels with an edge detector is a good way to help our alignment procedure converge on a truly optimal displacement vector.
- Center-cropping: In my experiments, I found that it’s a lot easier to remove the image borders after alignment, so all the borders are still present during the alignment procedure. Since my metric relies on matching the edge-detections between color channels and the border-image transition is very sharp, my algorithm would often find strange alignments that tended to maximize the border-edge alignments between color channels, resulting in improperly aligned channels. A simple solution to this problem was excluding the edges from the metric calculations with a center crop.

Notation
- $I_1$ : grayscale image.
- $I_1^\prime$ : center-cropped $I_1$ .
- $\Phi(\cdot)$ : Scharr edge detection.
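To make the definition concrete, here is a NumPy-only sketch of the metric: center-crop, apply a Scharr gradient magnitude, normalize by the Frobenius norm, and take the inner product. The helper names, the crop fraction `keep=0.8`, and the choice to evaluate the filter only on the valid interior are assumptions for illustration; the crop size actually used in this project is not stated.

```python
import numpy as np

def scharr_magnitude(image):
    """Scharr gradient magnitude on the valid interior (2 px smaller per axis)."""
    img = np.asarray(image, dtype=float)
    # Horizontal derivative: difference across columns, weights (3, 10, 3) across rows.
    gx = (3 * (img[:-2, 2:] - img[:-2, :-2])
          + 10 * (img[1:-1, 2:] - img[1:-1, :-2])
          + 3 * (img[2:, 2:] - img[2:, :-2]))
    # Vertical derivative: difference across rows, weights (3, 10, 3) across columns.
    gy = (3 * (img[2:, :-2] - img[:-2, :-2])
          + 10 * (img[2:, 1:-1] - img[:-2, 1:-1])
          + 3 * (img[2:, 2:] - img[:-2, 2:]))
    return np.hypot(gx, gy)

def center_crop(image, keep=0.8):
    """Keep roughly the central `keep` fraction of each axis (assumed value)."""
    rows, cols = image.shape
    dr = int(rows * (1 - keep) / 2)
    dc = int(cols * (1 - keep) / 2)
    return image[dr:rows - dr, dc:cols - dc]

def nccs(image_1, image_2, keep=0.8):
    """Inner product of the Frobenius-normalized edge maps of the cropped images."""
    e1 = scharr_magnitude(center_crop(image_1, keep))
    e2 = scharr_magnitude(center_crop(image_2, keep))
    norm = np.linalg.norm(e1) * np.linalg.norm(e2)
    if norm == 0:
        return 0.0  # no edges in at least one image; treat as uncorrelated
    return float(np.sum(e1 * e2) / norm)
```

Since both edge maps are non-negative, the score lies in $[0, 1]$ and equals 1 only when the two edge maps are proportional, which is why it peaks sharply at the correct alignment.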

Step 3: Automatic Border Trimming

Problem Description

Left: Colorized photo obtained by stacking output of Step 2; Right: The output of Step 3.

Using our alignment algorithm, we can create properly-aligned, full-color images (see left, above). Next, we’d like to remove the remaining rectangular black and white borders around these colorized photos (see right, above). In this section we present our method for automatically cropping these borders.

Overview of Solution

Left: Colorized photo before border-trim; Right: Cropping solution returned by the first pass of the algorithm.

To crop the borders, we find the pixel-coordinates of a rectangle that, for each color channel, contains most, if not all, of the non-border pixels while excluding the majority of the border pixels. After finding these coordinates, we crop the photo to the interior of the rectangle (see right, above). We’ll call these rectangles “cropping rectangles” in the following discussion.
- Since the border pixels are mostly very dark or very bright, we can use an edge detector such as the Canny edge detector to find the transitions between border-region pixels and non-border-region pixels.
  - One complication is that the borders for each color channel are slightly different, resulting in slightly different cropping rectangles for every color channel. In my experimentation, I found that cropping the colorized photo with the average of these rectangles works for most images, though in some cases the average rectangle also removes a small portion of the image in addition to the border.

Cropping Rectangle Algorithm

1. Given an aligned negative $I$ from Step 2 with dimensions $M\times N$ , create the following slices of the borders of $I$ :
   1. Top border: I[:M/10, :]
   2. Bottom border: I[9M/10:, :]
   3. Left border: I[:, :N/10]
   4. Right border: I[:, 9N/10:]
2. Apply the Canny edge detector with $\sigma=5$ to each of these slices.
3. Compute the median $\mu$ and standard deviation $\sigma$ of the distribution of edge-detection activations (this $\sigma$ is distinct from the Canny smoothing parameter in the previous step).
4. Threshold the edge detections by setting any detection with Z-score less than 2 to 0.
5. Find the column and row indices with the largest above-threshold activations, thus obtaining the top left $(r_0, c_0)$ and bottom right $(r_1, c_1)$ coordinates of the cropping rectangle.

Left- and right-border slices after step 4. Red lines indicate the sides of the cropping rectangle found in step 5.
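The following sketch shows one way the procedure above could be put together using NumPy alone. It is not the implementation used for the figures: in place of the Canny detector it uses a much simpler stand-in (the summed absolute difference between adjacent rows or columns), and the function names, the fallback to "no crop" when nothing passes the threshold, and the exact index conventions are all assumptions.

```python
import numpy as np

def edge_profile(strip, axis):
    """Stand-in for Canny: total absolute change between adjacent
    rows (axis=0) or columns (axis=1) of a border strip."""
    diff = np.abs(np.diff(strip.astype(float), axis=axis))
    return diff.sum(axis=1 - axis)

def strongest_edge(profile, z_min=2.0):
    """Index of the largest activation with Z-score >= z_min, else None."""
    spread = profile.std()
    if spread == 0:
        return None
    z = (profile - np.median(profile)) / spread  # median as the center, as in the text
    kept = np.where(z >= z_min, profile, 0.0)    # zero out weak detections
    return int(np.argmax(kept)) if kept.any() else None

def cropping_rectangle(channel, frac=0.1):
    """Return (r0, c0, r1, c1) for one aligned channel; r1 and c1 are exclusive."""
    m, n = channel.shape
    bm, bn = max(int(m * frac), 2), max(int(n * frac), 2)
    top = strongest_edge(edge_profile(channel[:bm, :], axis=0))
    bottom = strongest_edge(edge_profile(channel[m - bm:, :], axis=0))
    left = strongest_edge(edge_profile(channel[:, :bn], axis=1))
    right = strongest_edge(edge_profile(channel[:, n - bn:], axis=1))
    # Profile index i marks the transition between positions i and i + 1.
    r0 = 0 if top is None else top + 1
    r1 = m if bottom is None else m - bm + bottom + 1
    c0 = 0 if left is None else left + 1
    c1 = n if right is None else n - bn + right + 1
    return r0, c0, r1, c1
```

With this convention `channel[r0:r1, c0:c1]` is the cropped interior. A real Canny detector with smoothing would be far more robust to noise and film damage than the raw differences used here.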

N-pass Border-Trimming Algorithm

1. Obtain the aligned color channels from Step 2: Multi-resolution Color Alignment.
2. Run the Cropping Rectangle Algorithm on the three color channels, thus obtaining three cropping rectangles.
3. Compute the average cropping rectangle by averaging the top left and bottom right pixel coordinates of these rectangles.
4. Center crop the colorized photo to the average cropping rectangle.
5. Repeat steps 2-4, feeding the output of step 4 into step 2, until a total of $N$ passes have been completed.
   1. The figures below show the cropping procedure with $N=2$ passes applied to the train.tif photo.
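The multi-pass loop can be sketched as below. The names `average_rectangle` and `trim_borders` are hypothetical, and `find_rectangle` is assumed to be any callable that maps a single 2-D channel to a `(r0, c0, r1, c1)` tuple with exclusive bottom-right coordinates; rounding the averaged coordinates to the nearest integer is also an assumption.

```python
import numpy as np

def average_rectangle(rectangles):
    """Average (r0, c0, r1, c1) tuples coordinate-wise, rounded to integers."""
    return tuple(int(v) for v in np.rint(np.mean(rectangles, axis=0)))

def trim_borders(photo, find_rectangle, passes=2):
    """Crop an H x W x 3 photo `passes` times using per-channel rectangles."""
    for _ in range(passes):
        # One cropping rectangle per color channel.
        rects = [find_rectangle(photo[:, :, c]) for c in range(photo.shape[2])]
        r0, c0, r1, c1 = average_rectangle(rects)
        # The cropped photo becomes the input to the next pass.
        photo = photo[r0:r1, c0:c1]
    return photo
```

Running more than one pass helps because the first crop removes the strongest border edge, which lets weaker inner border edges stand out in the next pass.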

Comparisons: Before and After Applying 2-pass Border-Trimming

Results of 3-step Colorizing Procedure

Colorized Photos

Optimal Alignment Vectors

offset_vectors for emir.tif:
	r: [ 40 106]
	g: [24 48]
	b: [0 0]

offset_vectors for monastery.jpg:
	r: [2 2]
	g: [ 2 -4]
	b: [0 0]

offset_vectors for church.tif:
	r: [-4 58]
	g: [ 4 24]
	b: [0 0]

offset_vectors for three_generations.tif:
	r: [ 10 110]
	g: [13 52]
	b: [0 0]

offset_vectors for melons.tif:
	r: [ 13 177]
	g: [10 81]
	b: [0 0]

offset_vectors for onion_church.tif:
	r: [ 36 108]
	g: [27 50]
	b: [0 0]

offset_vectors for train.tif:
	r: [32 86]
	g: [ 7 42]
	b: [0 0]

offset_vectors for tobolsk.jpg:
	r: [3 6]
	g: [2 2]
	b: [0 0]

offset_vectors for icon.tif:
	r: [22 90]
	g: [18 41]
	b: [0 0]

offset_vectors for cathedral.jpg:
	r: [ 3 12]
	g: [2 4]
	b: [0 0]

offset_vectors for self_portrait.tif:
	r: [ 37 175]
	g: [29 78]
	b: [0 0]

offset_vectors for harvesters.tif:
	r: [ 14 123]
	g: [18 60]
	b: [0 0]

offset_vectors for sculpture.tif:
	r: [-28 140]
	g: [-12  33]
	b: [0 0]

offset_vectors for lady.tif:
	r: [ 12 115]
	g: [ 8 54]
	b: [0 0]
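As an illustration of how offset vectors like these might be applied, the sketch below shifts the red and green channels onto the blue reference and stacks the result as RGB. It assumes each vector is an $(x, y)$ displacement in pixels relative to blue (consistent with the `b: [0 0]` entries), that positive values move content right and down, and that shifting wraps around the edges; none of these conventions is confirmed by the output above, and `stack_channels` is a hypothetical name.

```python
import numpy as np

def stack_channels(red, green, blue, r_offset, g_offset):
    """Shift red and green by (x, y) offsets onto blue and stack as RGB."""
    def shift(channel, offset):
        dx, dy = offset
        # Rows move by dy, columns by dx; pixels wrap around the edges.
        return np.roll(channel, (dy, dx), axis=(0, 1))
    return np.dstack([shift(red, r_offset), shift(green, g_offset), blue])
```

For example, under these assumptions emir.tif would be assembled with `stack_channels(red, green, blue, (40, 106), (24, 48))`, and the wrapped-around pixels near the edges would then be removed by the border-trimming step.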

Credit to Notion for this template.