🎨

CS180 Project 1: Colorizing the Prokudin-Gorskii Photo Collection

Author: Nicolas Rault-Wang (nraultwang at berkeley.edu)

Introduction

  • The Prokudin-Gorskii collection consists of blue, green, and red negatives of a subject. For example, we have a triple negative of Emir at right.
  • In this project, we have created a program to automatically produce colorized photos from the triple-negatives in the Prokudin-Gorskii collection.
  • Our procedure has 3-steps, explained further in the following sections.
    1. Extract the individual color channels from the original triple negative.
    1. Align the structures in these negatives to form a coherent colorized photo.
    1. Automatically trim border artifacts from these aligned images.
Triple-negative of Emir

Colorization Procedure

Step 1: Extracting Color Channel Negatives

  • We begin by extracting the blue, red, and green channels from the original triple-negative photo.
  • Each color channel has approximately the same dimensions, so we simply divide the original image into three equally-sized images, vertically padding if the height is not divisible by 3.
  • The rightmost image shows the output of this step.
Triple-negative of Emir

Step 2: Multi-resolution Color Alignment

k=0:R0=1,S0=1k=0:R_0 = 1, S_0 = 1. The algorithm finds a displacement vector r0\vec r_0^* such that I10+r0I_1^0 + \vec r_0^* and I20I_2^0 are well-aligned, according to the NCCS objective.
k=6:R6=64,S6=64k=6:R_6 = 64, S_6 = 64. Lowest-resolution image in the image pyramid. This is the base case for the recursive algorithm.

Problem Description

  • The pixels in the negatives extracted in Step 1 are not necessarily aligned because there are generally variations in their positions and the border widths between them.
  • Aligning the negatives is equivalent to finding an (x,y) (x,y) displacement for channel so that every pixel position (i,j)(i,j) in each negative corresponds to the same structure.
  • My alignment algorithm uses an image pyramid to efficiently compute these offsets:
    • The algorithm takes two grayscale color negatives I10I_1^0 and I20I_2^0 of the same shape, then recursively computes the displacement vector r0\vec r_0 that maximizes the objective NCCS(I10+r0,I20)NCCS(I_1^0 + \vec r_0, I_2^0). (NCCS is defined below.)
    • To visualize this procedure, the surrounding figures show the optimal alignments at each level of the image pyramid

Alignment Algorithm

  1. Set R0=1R_0 = 1 and S0=1S_0 = 1, where RkR_k is the downsampling factor and SkS_k is the maximum displacement in xx or yy.
  1. For k0k\geq 0, downsample each of I10I_1^0 and I20I_2^0 by RkR_k to obtain I1kI_1^{k} and I2kI_2^k, respectively.
  1. If the largest dimension of I1kI_1^k exceeds 64, make a recursive call on I1kI_1^{k} and I2kI_2^{k} with resolution Rk+12RkR_{k+1} \equiv 2R_k and search radius Sk+12SkS_{k+1} \equiv 2S_k.
    1. This recursive call will return a displacement vector rk+1\vec r_{k+1}^* such that I1k+1+rk+1I_1^{k+1} + \vec r^*_{k+1} and I2k+1I_2^{k+1} are well aligned.
    1. Note that I1kI_1^k has twice the dimensions of I1k+1I_1^{k+1}, we have that I1k+2rk+1I_1^{k} + 2\vec r^*_{k+1} and I2kI_2^k are approximately well aligned.
  1. Next, exhaustively examine all displacements dk[Sk,Sk]2\vec d_k \in[-S_k, S_k]^2 to find the vector dk\vec d_k^* that best aligns I1k+2rk+1I_1^{k} + 2\vec r^*_{k+1} and I2kI_2^k:
    dk=arg maxdk[Sk,Sk]2NCCS((I1k+2rk+1)+dk, I2k)\vec d_k^* = \argmax_{\vec d_k \in[-S_k, S_k]^2} NCCS\left((I^k_1 + 2\vec r^*_{k+1}) + \vec d_k,\ I_2^k\right)
  1. Finally, return the displacement vector rkdk+2rk+1\vec r_k^* \equiv \vec d_k^*+ 2\vec r_{k+1}^*.

Defining NCCS

  • NCCS: the Normalized Cross-Correlation between two Scharr-filtered, center-cropped grayscale images with the same shape.
NCCS(I1,I2)=Φ(I1)Φ(I1)F, Φ(I2)Φ(I2)FNCCS(I_1, I_2) = \left\langle\frac{\Phi(I_1^\prime)}{\|\Phi(I_1^\prime)\|_F},\ \frac{\Phi(I_2^\prime)}{ \|\Phi(I_2^\prime)\|_F}\right\rangle
  • Motivation for each feature of this metric:
    • NCC: Since different channels have different baselines, it is a good idea to normalize before comparing them. Further, we’d like to ensure the structures between color channels are at the same pixel positions, and maximizing cross-correlation is a good way to do this.
    • Edge detection: Most borders of an object should be present in all colors, so we can align the structures in different color channels by aligning their edges. Edge detection filters like Scharr create sparse patterns in each color channel that have large NCC scores when they are properly aligned. This NCC score rapidly falls off when the images are not well aligned, so filtering our color channels with an edge detector is a good way to help of our alignment procedure converge on a truly optimal displacement vector.
    • Center-cropping: In my experiments, I found that it’s a lot easier to remove the image borders after alignment, so all the borders are still present during the alignment procedure. Since my metric relies on matching the edge-detections between color channels and the border-image transition is very sharp, my algorithm would often find strange alignments that tended to maximize the border-edge alignments between color channels, resulting in improperly aligned channels. A simple solution to this problem was excluding the edges from the metric calculations with a center crop.
  • Notation
    • I1I_1: grayscale image.
    • I1I_1^\prime: center-cropped I1I_1.
    • Φ()\Phi(\cdot): Scharr edge detection.
k=5:R5=32,S5=32k=5:R_5 = 32, S_5 = 32.
k=4:R4=16,S4=16k=4:R_4 = 16, S_4 = 16
k=3:R3=8,S3=8k=3:R_3 = 8, S_3 = 8
k=2:R2=4,S2=4k=2:R_2 = 4, S_2 = 4
k=1:R1=2,S1=2k=1:R_1 = 2, S_1 = 2

Step 3: Automatic Border Trimming

Problem Description

Left: Colorized photo obtained by stacking output of Step 2; Right: The output of Step 3.

Overview of Solution

Left: Colorized photo before border-trim; Right: Cropping solution returned by the first pass of the algorithm.

Cropping Rectangle Algorithm

  1. Given an aligned negative II from Step 2 with dimensions M×NM\times N, create the following slices of the borders of II.
    1. Top border: I[:M/10, :]
    1. Bottom border: I[9M/10:, :]
    1. Left border: I[:, :N/10]
    1. Right border: I[:, 9N/10:]
  1. Apply the Canny edge detector with σ=5\sigma=5 to each of these slices.
  1. Compute the median μ\mu and standard deviation σ\sigma of the distribution of edge-detection activations.
  1. Threshold the edge detections by setting any detection with Z-score less than 2 to 0.
  1. Find the column and row indices with the largest above-threshold activations, thus obtaining the top left (r0,c0)(r_0, c_0) and bottom right (r1,c1)(r_1, c_1) coordinates of the cropping rectangle.
Left- and right-border slices after step 4. Red lines indicate the sides of the cropping rectangle found in step 5.

N-pass Border-Trimming Algorithm

  1. Obtain the aligned color channels from Step 2: Multi-resolution Color Alignment.
  1. Run the Cropping Rectangle Algorithm on the three color channels, thus obtaining three cropping rectangles.
  1. Compute the average cropping rectangle by averaging the top left and bottom right pixel coordinates of these rectangles.
  1. Center crop the colorized photo to the average cropping rectangle.
  1. Repeat 2-4, feeding the output of 4 into 2, until a total of NN passes have been completed.
    1. The figures below show the cropping procedure with N=2N=2 passes applied to the train.tif photo.
Pass 1: Top and bottom borders.
Pass 1: Left and right borders.
Pass 2: Top and bottom borders.
Pass 2: Left and right borders.

Comparisons: Before and After Applying 2-pass Border-Trimming

Results of 3-step Colorizing Procedure

Colorized Photos

train.tif
lady.tif
cathedral.jpg
emir.tif
melons.tif
onion_church.tif
monastery.jpg
church.tif
three_generations.tif
harvesters.tif
icon.tif
self_portrait.tif
sculpture.tif
tobolsk.jpg

Optimal Alignment Vectors
offset_vectors for emir.tif:
	r: [ 40 106]
	g: [24 48]
	b: [0 0]

offset_vectors for monastery.jpg:
	r: [2 2]
	g: [ 2 -4]
	b: [0 0]

offset_vectors for church.tif:
	r: [-4 58]
	g: [ 4 24]
	b: [0 0]

offset_vectors for three_generations.tif:
	r: [ 10 110]
	g: [13 52]
	b: [0 0]

offset_vectors for melons.tif:
	r: [ 13 177]
	g: [10 81]
	b: [0 0]

offset_vectors for onion_church.tif:
	r: [ 36 108]
	g: [27 50]
	b: [0 0]

offset_vectors for train.tif:
	r: [32 86]
	g: [ 7 42]
	b: [0 0]

offset_vectors for tobolsk.jpg:
	r: [3 6]
	g: [2 2]
	b: [0 0]

offset_vectors for icon.tif:
	r: [22 90]
	g: [18 41]
	b: [0 0]

offset_vectors for cathedral.jpg:
	r: [ 3 12]
	g: [2 4]
	b: [0 0]

offset_vectors for self_portrait.tif:
	r: [ 37 175]
	g: [29 78]
	b: [0 0]

offset_vectors for harvesters.tif:
	r: [ 14 123]
	g: [18 60]
	b: [0 0]

offset_vectors for sculpture.tif:
	r: [-28 140]
	g: [-12  33]
	b: [0 0]

offset_vectors for lady.tif:
	r: [ 12 115]
	g: [ 8 54]
	b: [0 0]

Credit to Notion for this template.