## ECE549 / CS543 Computer Vision, Spring 2023, Assignment 1
### Instructions
1. Assignment is due at **11:59:59 PM on Wednesday Feb 15, 2023**.
2. See [policies](http://saurabhg.web.illinois.edu/teaching/ece549/sp2023/policies.html)
on [class website](http://saurabhg.web.illinois.edu/teaching/ece549/sp2023/).
3. Submission instructions:
1. A single `.pdf` report that contains your work for Q1, Q2, and Q3. You
can type out your responses in LaTeX or any other word processing
software. You can also type your work in Markdown using the template file
in `README.md`; in that case, please enter your work under the `Answer:`
section of each problem. Alternatively, you can hand-write your answers on
a tablet, or scan in hand-written answers. If you hand-write, please make
sure your answers are neat and legible, and if you are scanning, make sure
the scans are legible. Lastly, convert your work into a `PDF`.
The PDF file should be submitted to [Gradescope](https://www.gradescope.com)
under `MP1`. The course code is **3J2KWY**. Please make sure you tag the
responses in your PDF with the Gradescope questions outline as described in
[Submitting an Assignment](https://youtu.be/u-pK4GzpId0).
2. For Q4 we are using the **Gradescope autograder** to test your code.
For this to work, you will need to submit the code according to the
following instructions:
- Code should be submitted to [Gradescope](https://www.gradescope.com)
under `MP1-code`; you will need to submit the *python* file `render_image.py`.
- Do not compress the files into a `.zip`, as this will not work.
- Do not change the provided file names or the names of the provided functions;
instead, change the code inside the provided functions and add new functions
as needed. Also, make sure that the inputs and outputs of the provided
functions are not changed.
- The autograder will give you feedback on how well your code did.
- The autograder is configured with the following Python libraries only:
- numpy
- matplotlib
3. You will also submit code for Q3. Include it as `.py` files in the code
you upload to `MP1-code`. This code won't be autograded but failure to
include this code will lead to zero credit on Q3.
4. We reserve the right to take off points for not following
submission instructions. In particular, please tag the responses in your
PDF with the Gradescope questions outline as described in
[Submitting an Assignment](https://youtu.be/u-pK4GzpId0).
4. Lastly, be careful not to work on a public fork of this repo. Make a
private clone to work on your assignment. You are responsible for
preventing other students from copying your work. Please also see point 2
above.
### Problems
1. **Comparing Heights [7 pts]**
Consider the image of person A and person B standing on the ground plane, as
captured by a perspective camera of focal length $f$. $H$ is the horizon
line ($y=0$). $a_1$ is the distance between A's toes and the horizon, and
$a_2$ is the distance between A's head and the horizon, both in units of
pixels; $b_1$ and $b_2$ are defined similarly for person B. Suppose A's
height is $h$ feet. (A reminder of the relevant projection relation is
included after the sub-questions.)
<div align="center">
<br/>
<br/>
<img src="./q1/height.png" width="30%">
<br/>
<br/>
</div>
1. **[1 pt]** From the picture, is it possible to determine who is taller in the real world? If so, who is taller, A or B? Show your work.
*Answer*:
2. **[6 pts]** Give expressions for the following quantities in terms of $f, a_1, a_2, b_1, b_2, h$. Show your work.
1. **[2 pts]** How many feet above the ground is the camera?
*Answer*:
2. **[2 pts]** What is the height of person B in feet?
*Answer*:
3. **[2 pts]** What is the distance (along the Z-axis) from the camera to person B, in feet?
*Answer*:
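As a reminder (this is just the perspective projection relation specialized to
this setup, not a full answer), with image coordinates centered so that the
horizon is at $y = 0$, a scene point at depth $Z$ whose height differs from
the camera's height by $\Delta Y$ projects at a distance
```math
\Delta y = f\,\frac{\Delta Y}{Z}
```
in pixels from the horizon; the quantities $a_1$, $a_2$, $b_1$, $b_2$ in the
figure are all of this form.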
2. **Rectangle and Cylinder under Perspective Projection [12 pts]**
1. **[4 pts]** Suppose a rectangle of width $L$ is moving along the X axis.
How does its width $l$ on the image plane change _w.r.t._ its distance $d$ to
the origin? Recall that, under perspective projection, a point $(X,Y,Z)$ in
3D space maps to $(f\frac{X}{Z}, f\frac{Y}{Z})$ in the image, where $f$ is
the distance of the image plane from the pinhole.
<div align="center">
<img src="./q2/rectangle.png" width="50%">
</div>
*Answer*:
2. **[8 pts]** What if we replace the rectangle with a cylinder of radius
$r$ on the X axis? How does its width $l$ on the image plane change _w.r.t._
its distance $d$ to the origin? Show your work, and try to simplify the
final result as much as possible. We won't take points off if your answer is
correct and complete but is only missing algebraic simplifications. (A small
sketch for numerically checking your expressions is included after this
problem.)
<div align="center">
<img src="./q2/cylinder.png" width="50%">
</div>
*Answer*:
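For both parts, one way to sanity-check a derived expression for $l(d)$ is to
project sample 3D points with the given pinhole model and measure the
image-plane width directly. Below is a minimal, hypothetical sketch (the
helper names and the placeholder points are ours, not part of the starter
code); it only implements the mapping $(X, Y, Z) \mapsto (f\frac{X}{Z}, f\frac{Y}{Z})$.
```python
import numpy as np

def project(P, f=1.0):
    """Pinhole projection: maps a 3D point (X, Y, Z) to (f*X/Z, f*Y/Z)."""
    X, Y, Z = np.asarray(P, dtype=float)
    return np.array([f * X / Z, f * Y / Z])

def image_width(P_a, P_b, f=1.0):
    """Distance on the image plane between the projections of two 3D points,
    e.g. the two visible edge points of the rectangle, or the two silhouette
    points of the cylinder, for a chosen distance d."""
    return float(np.linalg.norm(project(P_a, f) - project(P_b, f)))

# Example usage (placeholder endpoints; substitute the coordinates implied by
# the figure for your chosen d and L, or d and r):
# print(image_width([1.0, 0.0, 5.0], [2.0, 0.0, 5.0], f=1.0))
```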
3. **Dynamic Perspective [8 pts]**. In this question, we will simulate optical flow induced on the image of a static scene due to camera motion.
You can review these concepts from [lecture 5](http://saurabhg.web.illinois.edu/teaching/ece549/sp2023/slides/lec05_dynamic_perspective.pdf). For this problem (including both 3.1 and 3.2), you can assume that the X-axis points to the right, Y-axis points down and Z-axis points into the scene.
1. **[4 pts]** Assume a camera moving with translational velocity $t$ and angular velocity $\omega$. Derive the equation that governs the
optical flow at a pixel $(x, y)$ in terms of the focal length $f$ and the depth of the point $Z(x, y)$. Note that a point $(X, Y, Z)$
in the world projects to $(f\frac{X}{Z}, f\frac{Y}{Z})$ in the image.
*Answer*:
2. **[4 pts]** Next, we will try to visualize the optical flow induced on
the image of a static scene due to camera motion. We will build off
starter code in [dynamic_perspective_starter.py](./q3/dynamic_perspective_starter.py).
We have implemented two simple scenes, that of a vertical wall directly
in front of the camera (`get_wall_z_image`), and that of a camera overlooking
a horizontal plane (`get_road_z_image`). Your task is to use these scenes
to visualize the induced optical flow for the following situations:
1. Looking forward on a horizontal plane while driving on a flat road.
2. Sitting in a train and looking out over a flat field from a side window.
3. Flying into a wall head-on.
4. Flying into a wall but also translating horizontally, and vertically.
5. Rotating in front of a wall about the Y-axis.
You should pick the appropriate scene, $t$, and $\omega$, and visualize the
induced optical flow (you can use the `plot_optical_flow` function). Note
that the origin (for the perspective projection equations) is at the center
of the camera, and this is different from the origin for images in numpy.
As an example, we have provided the output for the first case; a minimal
wiring sketch is also included after the list of cases below.
1. Looking forward on a horizontal plane while driving on a flat road.
<div align="center">
<img src="./q3/q1_ans.png" width="50%">
</div>
Include the generated optical flow for the remaining cases below.
2. **[1 pt]** Sitting in a train and looking out over a flat field from a side window.
*Answer*:
3. **[1 pt]** Flying into a wall head-on.
*Answer*:
4. **[1 pt]** Flying into a wall but also translating horizontally, and vertically.
*Answer*:
5. **[1 pt]** Rotating in front of a wall about the Y-axis.
*Answer*:
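For each case the recipe is the same: pick one of the two provided depth
images, pick $t$ and $\omega$, compute the per-pixel flow $(u, v)$ from your
Q3.1 equations, and plot it. Below is a minimal wiring sketch, assuming you
package your Q3.1 result as a hypothetical helper
`compute_flow(Z, t, omega, f, cx, cy)` that returns per-pixel `(u, v)`; the
helper itself is left to you.
```python
import numpy as np
import matplotlib.pyplot as plt
from dynamic_perspective_starter import get_road_z_image, plot_optical_flow

def compute_flow(Z, t, omega, f, cx, cy):
    # Hypothetical helper: fill in with your optical flow equations from Q3.1.
    # It should return horizontal and vertical flow fields (u, v), each with
    # the same shape as Z.
    raise NotImplementedError

if __name__ == '__main__':
    fx = fy = f = 128.
    szy, szx = 256, 384
    cx, cy = 192, 128
    # Case 1: looking forward while driving on a flat road -> road scene,
    # camera translating along +Z (into the scene), no rotation.
    Z = get_road_z_image(3., fx, fy, cx, cy, szx, szy)
    t = np.array([0., 0., 1.])       # translational velocity
    omega = np.array([0., 0., 0.])   # angular velocity
    u, v = compute_flow(Z, t, omega, f, cx, cy)
    fig = plt.figure(figsize=(13.5, 9))
    plot_optical_flow(fig.gca(), Z, u, v, cx, cy, szx, szy, s=16)
    fig.savefig('case1_flow.png', bbox_inches='tight')
```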
4. **Phong Shading Model [16 pts]**. In this problem, we will take a closer
look at different types of surfaces and their appearance under varying
lighting and viewing conditions. We will work with the
ambient + lambertian + specular model for image formation (see Section 2.2,
Equation 2.94 in [Szeliski](https://szeliski.org/Book/)). In particular,
we will work with the following equation for the intensity at a given pixel $x$,
```math
I(x) = \text{Ambient Term} + \text{Diffuse Term} + \text{Specular Term} \\
I(x) = k_a L_a + k_d \sum_i L_i [\hat{v}_i \cdot \hat{n}]^{+} + k_s \sum_i L_i ([\hat{v}_r \cdot \hat{s}_i]^{+})^{k_e}
```
Here,
- The ambient term is simply the ambient reflection coefficient, $k_a$, times the ambient light, $L_a$.
- The diffuse term assumes that the surface is Lambertian, that is, it reflects incoming light $L_i$, multiplied by the diffuse reflection coefficient $k_d$, equally in all directions. However, we need to pay attention to the amount of light that is coming in, which depends on the angle at which light is incident onto the surface. It is given by the dot product $\hat{v}_i \cdot \hat{n}$ between the surface normal at the point, $\hat{n}$, and the direction from which light is incident, $\hat{v}_i$. $[\cdot]^{+}$ denotes the $\max$ with $0$.
- For the specular term, the light gets reflected preferentially in directions close to the actual direction of reflection. In particular, we will use a dependence of the form $([\hat{v}_r \cdot \hat{s}_i]^{+})^{k_e}$, where $\hat{s}_i$ is the direction of reflection, $\hat{v}_r$ is the viewing direction, and $k_e$ is the shininess coefficient.
- Vectors $\hat{n}$, $\hat{v}_i$, $\hat{v}_r$ and $\hat{s}_i$ are illustrated below for reference.
<div align="center">
<img src="./q4/vector.png" width="50%">
</div>
- We are going to ignore shadows and inter-reflections:
- As long as the surface is facing the light source, we will assume that the surface will receive light from the light source.
- A surface only receives light directly from the point / directional light sources, or the ambient light.
- Lastly, we are also going to ignore the $1/r^2$ attenuation for point light sources.
As part of this problem, we will simulate these three terms and use them to render a simple scene. We will provide the per-pixel scene depth, surface normal, and the different coefficients $k_a$, $k_d$ $(= k_a)$, and $k_s$, as well as the strengths and locations of the various lights in the scene. Your task is to compute the image based on this information using the Phong shading model described above.
We have provided a scene with a sphere in front of a wall. You can access this scene using the `get_ball` function from the file [generate_scene.py](./q4/generate_scene.py). It returns the per-pixel depth, surface normal and $k_a$, $k_d$ and $k_s$ for the scene, as visualized below (you can assume a reasonable value for $k_e$ (say 50)):
<div align="center">
<img src="./q4/input.png" width="90%">
</div>
We have also provided some starter code in [render_image.py](./q4/render_image.py) that sets up the different test cases (positions of lights). Your task is to fill in the `render` function that implements the Phong shading model as described above. An example rendering that your code will produce is shown below. A small sketch for recovering per-pixel 3D points from the depth image, which is useful inside `render`, follows the example output.
<div align="center">
<img src="./q4/output.png" width="25%">
</div>
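Inside `render` you will need per-pixel 3D points (in the camera frame) so
that you can form the unit direction vectors toward point light sources and
toward the camera. Below is a minimal sketch of just that back-projection
step, using the camera parameters assumed in the starter code (`f = 128`,
principal point at the image center); the shading terms themselves are left
to you.
```python
import numpy as np

def backproject(Z, f=128., cx=None, cy=None):
    """Recover per-pixel 3D points (X, Y, Z) in the camera frame from the
    depth image Z, by inverting x = f*X/Z + cx and y = f*Y/Z + cy."""
    h, w = Z.shape
    cx = w / 2 if cx is None else cx
    cy = h / 2 if cy is None else cy
    x, y = np.meshgrid(np.arange(w), np.arange(h))
    X = (x - cx) * Z / f
    Y = (y - cy) * Z / f
    return np.stack([X, Y, Z], axis=2)  # shape (h, w, 3)
```
From these points, $\hat{v}_i$ for a point light at `point_light_loc` and
$\hat{v}_r$ toward the camera (at the origin) can be obtained by normalizing
the corresponding difference vectors.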
MP1/q3/dynamic_perspective_starter.py

import numpy as np
import matplotlib.pyplot as plt

def get_wall_z_image(Z_val, fx, fy, cx, cy, szx, szy):
    Z = Z_val*np.ones((szy, szx), dtype=np.float32)
    return Z

def get_road_z_image(H_val, fx, fy, cx, cy, szx, szy):
    y = np.arange(szy).reshape(-1,1)*1.
    y = np.tile(y, (1, szx))
    Z = np.zeros((szy, szx), dtype=np.float32)
    Z[y > cy] = H_val*fy / (y[y>cy]-cy)
    Z[y <= cy] = np.NaN
    return Z

def plot_optical_flow(ax, Z, u, v, cx, cy, szx, szy, s=16):
    # Here is a function for plotting the optical flow. Feel free to modify this
    # function to work well with your inputs, for example if your predictions are
    # in a different coordinate frame, etc.
    x, y = np.meshgrid(np.arange(szx), np.arange(szy))
    ax.imshow(Z, alpha=0.5, origin='upper')
    # ax.quiver is used to plot a 2D field of arrows. See
    # https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.quiver.html
    # for the details of plotting with ax.quiver.
    q = ax.quiver(x[::s,::s], y[::s,::s], u[::s,::s], v[::s, ::s], angles='xy')
    ax.axvline(cx)
    ax.axhline(cy)
    ax.set_xlim([0, szx])
    ax.set_ylim([szy, 0])
    ax.axis('equal')

if __name__ == "__main__":
    # Focal length along X and Y axis. In class we assumed the same focal length
    # for the X and Y axis, but in general they could be different. We denote
    # these by fx and fy, and assume that they are the same for the purpose of
    # this MP.
    fx = fy = 128.
    # Size of the image.
    szy = 256
    szx = 384
    # Center of the image. We are going to assume that the principal point is at
    # the center of the image.
    cx = 192
    cy = 128
    # Gets the depth image of a wall 2m in front of the camera.
    Z1 = get_wall_z_image(2., fx, fy, cx, cy, szx, szy)
    # Gets the depth image of the ground plane that is 3m below the camera.
    Z2 = get_road_z_image(3., fx, fy, cx, cy, szx, szy)
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 7))
    ax1.imshow(Z1)
    ax2.imshow(Z2)
    # Plotting function (here with random flow as a placeholder).
    f = plt.figure(figsize=(13.5, 9))
    u = np.random.rand(*Z1.shape)
    v = np.random.rand(*Z1.shape)
    plot_optical_flow(f.gca(), Z1, u, v, cx, cy, szx, szy, s=16)
    f.savefig('optical_flow_output.png', bbox_inches='tight')
MP1/q4/generate_scene.py

import numpy as np
import os
import matplotlib.pyplot as plt
import warnings

def z_wall(z=4, f=128, h=256, w=384, cx=192, cy=128):
    x, y = np.meshgrid(np.arange(w), np.arange(h))
    x = x - cx
    y = y - cy
    Z = depth = x - x + z
    N = np.zeros((h, w, 3), dtype=np.float64)
    N[:,:,2] = -1.
    A = np.mod(x // 64 + y // 64, 2) * 0.5 + 0.1
    A[:] = 0.4
    S = A - A
    return Z, N, A, S

def sphere(r=1.2, z=2, specular=False, f=128, h=256, w=384, cx=192, cy=128):
    # Let's assume that the camera is at the origin looking in the +Z direction,
    # with the X axis to the right and the Y axis downwards.
    # Let's assume that the sphere is at (0, 0, z) and of radius r.
    x, y = np.meshgrid(np.arange(w), np.arange(h))
    x = x - cx
    y = y - cy
    phi = np.arccos(f / np.sqrt(x**2 + y**2 + f**2))
    # equation is rho^2 (1+tan^2(phi)) - 2 rho z + z*z - r*r = 0
    a = np.tan(phi)**2 + 1
    b = -2 * z
    c = z**2 - r**2
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        rho = -b - np.sqrt(b**2 - 4*a*c)
        rho = rho / 2 / a
    depth = rho
    # Calculate the point on the sphere surface.
    X = x * depth / f
    Y = y * depth / f
    Z = depth
    N = np.array([X, Y, Z])
    N = np.transpose(N, [1,2,0])
    N = N - np.array([[[0, 0, z]]])
    N = N / np.linalg.norm(N, axis=2, keepdims=True)
    Z[np.isnan(Z)] = np.inf
    # Compute k_d, k_a.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        A = (np.sign(X*Y) > 0) * 0.5 + 0.1
    # Set specularity.
    if specular:
        S = np.invert(np.isinf(depth)) + 0
    else:
        S = A - A
    return Z, N, A, S

def compose(Z1, N1, A1, S1, Z2, N2, A2, S2):
    ind = np.argmin(np.array([Z1, Z2]), axis=0)
    Z = Z1 + 0
    Z[ind == 1] = Z2[ind == 1]
    A = A1 + 0
    A[ind == 1] = A2[ind == 1]
    S = S1 + 0
    S[ind == 1] = S2[ind == 1]
    N = N1 + 0
    ind = np.repeat(ind[:, :, np.newaxis], 3, axis=2)
    N[ind == 1] = N2[ind == 1]
    return Z, N, A, S

def get_ball(specular):
    Z1, N1, A1, S1 = sphere(specular=specular)
    Z2, N2, A2, S2 = z_wall()
    Z, N, A, S = compose(Z1, N1, A1, S1, Z2, N2, A2, S2)
    return Z, N, A, S

if __name__ == '__main__':
    Z1, N1, A1, S1 = sphere(specular=True)
    Z2, N2, A2, S2 = z_wall()
    Z, N, A, S = compose(Z1, N1, A1, S1, Z2, N2, A2, S2)
    fig = plt.figure(constrained_layout=False, figsize=(20, 5))
    gs1 = fig.add_gridspec(nrows=1, ncols=4, left=0.05, right=0.95, wspace=0.05, top=0.95, bottom=0.05)
    axes = []
    for i in range(4):
        axes.append(fig.add_subplot(gs1[0,i]))
        axes[-1].set_axis_off()
    axes[0].imshow(Z, cmap='gray', vmin=0, vmax=5)
    axes[0].set_title('Depth, Z')
    # Visualizing surface normals using a commonly used color map.
    N[:,:,0] = -N[:,:,0]
    axes[1].imshow((-N[:,:,[0,1,2]]+1)/2.)
    axes[1].set_title('Surface normal, N')
    axes[2].imshow(A, cmap='gray', vmin=0, vmax=1)
    axes[2].set_title('Ambient reflection coefficient, $k_a =$ Diffuse reflection coefficient, $k_d$')
    axes[3].imshow(S, cmap='gray', vmin=0, vmax=1)
    axes[3].set_title('Specular reflection coefficient, $k_s$')
    plt.savefig('input.png', bbox_inches='tight')
MP1/q4/render_image.py

import numpy as np
from generate_scene import get_ball
import matplotlib.pyplot as plt

# specular exponent
k_e = 50

def render(Z, N, A, S,
           point_light_loc, point_light_strength,
           directional_light_dirn, directional_light_strength,
           ambient_light, k_e):
    # To render the images you will need the camera parameters; you can assume
    # the following parameters. (cx, cy) denotes the center of the image (the
    # point where the optical axis intersects the image plane), and f is the
    # focal length. These parameters, along with the depth image, will be
    # useful for estimating the 3D points on the surface of the sphere for
    # computing the angles between the different directions.
    h, w = A.shape
    cx, cy = w / 2, h / 2
    f = 128.

    # Ambient Term
    I = A * ambient_light

    # Diffuse Term

    # Specular Term

    I = np.minimum(I, 1)*255
    I = I.astype(np.uint8)
    I = np.repeat(I[:,:,np.newaxis], 3, axis=2)
    return I

def main():
    for specular in [True, False]:
        # get_ball function returns:
        # - Z (depth image: distance to scene point from camera center, along
        #   the Z-axis),
        # - N, the per-pixel surface normals (N[:,:,0] component along X-axis
        #   (pointing right), N[:,:,1] component along Y-axis (pointing down),
        #   N[:,:,2] component along Z-axis (pointing into the scene)),
        # - A, the per-pixel ambient and diffuse reflection coefficient,
        # - S, the per-pixel specular reflection coefficient.
        Z, N, A, S = get_ball(specular=specular)

        # Strength of the ambient light.
        ambient_light = 0.5

        # For the following code, you can assume that the point sources are
        # located at point_light_loc and have a strength of
        # point_light_strength. For the directional light sources, you can
        # assume that the light is coming _from_ the direction indicated by
        # directional_light_dirn (i.e., \hat{v}_i = directional_light_dirn),
        # and with strength directional_light_strength. The coordinate frame is
        # centered at the camera, X axis points to the right, Y-axis points
        # down, and Z-axis points into the scene.

        # Case I: No directional light, only a point light source that moves
        # around the object.
        point_light_strength = [1.5]
        directional_light_dirn = [[1, 0, 0]]
        directional_light_strength = [0.0]

        fig, axes = plt.subplots(4, 4, figsize=(15,10))
        axes = axes.ravel()[::-1].tolist()
        for theta in np.linspace(0, np.pi*2, 16):
            point_light_loc = [[10*np.cos(theta), 10*np.sin(theta), -3]]
            I = render(Z, N, A, S, point_light_loc, point_light_strength,
                       directional_light_dirn, directional_light_strength,
                       ambient_light, k_e)
            ax = axes.pop()
            ax.imshow(I)
            ax.set_axis_off()
        plt.savefig(f'specular{specular:d}_move_point.png', bbox_inches='tight')
        plt.close()

        # Case II: No point source, just a directional light source that moves
        # around the object.
        point_light_loc = [[0, -10, 2]]
        point_light_strength = [0.0]
        directional_light_strength = [2.5]

        fig, axes = plt.subplots(4, 4, figsize=(15,10))
        axes = axes.ravel()[::-1].tolist()
        for theta in np.linspace(0, np.pi*2, 16):
            directional_light_dirn = [np.array([np.cos(theta), np.sin(theta), .1])]
            directional_light_dirn[0] = \
                directional_light_dirn[0] / np.linalg.norm(directional_light_dirn[0])
            I = render(Z, N, A, S, point_light_loc, point_light_strength,
                       directional_light_dirn, directional_light_strength,
                       ambient_light, k_e)
            ax = axes.pop()
            ax.imshow(I)
            ax.set_axis_off()
        plt.savefig(f'specular{specular:d}_move_direction.png', bbox_inches='tight')
        plt.close()

if __name__ == '__main__':
    main()
README.md

## ECE549/CS543 (Computer Vision)
1. [Programming Assignment 0](./MP0). Due at **11:59:59 PM on Wednesday Feb 1, 2023**.
1. [Programming Assignment 1](./MP1). Due at **11:59:59 PM on Wednesday Feb 15, 2023**.