1. Image formation
1.1 Pinhole camera

1.2 Properties of projection

Many-to-one
Points project to points
Lines project to lines
Planes project to planes
Focal plane: the plane that passes through the first focal point (the front, or object-side, focal point) and is perpendicular to the system's principal optical axis is called the first focal plane.
1.3 Perspective projection

By the same similar-triangle argument, x/x' = z/f'.
1.4 Perspective projection matrix

Because the 3D scene is projected onto a 2D plane, the projected point needs no z coordinate.
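Written out (a reconstruction of the standard homogeneous form, consistent with the similar-triangle relation above):

$$
x' = f'\frac{x}{z}, \qquad y' = f'\frac{y}{z}, \qquad
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} \sim
\begin{pmatrix} f' & 0 & 0 & 0 \\ 0 & f' & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
$$

Dividing by the third homogeneous coordinate removes z, which is why no z coordinate appears in the result.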
1.5 Coordinate systems

The world coordinate system is fixed and objective: a standard right-handed coordinate system (thumb = x-axis, index finger = y-axis, middle finger = z-axis).
In contrast, the camera coordinate system can be positioned and oriented arbitrarily.
1.6 Camera rotation and translation


X^W denotes homogeneous world coordinates; X^W with a tilde ($\tilde{X}^W$) denotes non-homogeneous world coordinates.
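In the usual formulation (assuming the standard convention, with $\tilde{C}$ the camera centre in world coordinates):

$$
X^C = R\,\big(\tilde{X}^W - \tilde{C}\big), \qquad
X^C = \begin{pmatrix} R & -R\tilde{C} \\ 0 & 1 \end{pmatrix} X^W
$$

i.e. the rotation and translation move points from world coordinates into camera coordinates.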
1.7 Coordinate systems: Extrinsics

1.8 Camera coordinate system

Principal axis: the line from the camera centre (point C) perpendicular to the image plane.
Normalized coordinate system: the camera centre is the origin and the principal axis is the z-axis.
Principal point: the point (P) where the principal axis intersects the image plane.
1.9 Principal point offset


1.10 Review: the pinhole camera model

X^I denotes image coordinates, and X^C denotes camera coordinates.

K is the calibration matrix.
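Putting 1.8–1.10 together, the standard form of the model is (the exact entries of K depend on whether skew and separate pixel scales are included):

$$
x^I \sim K\,[\,R \mid t\,]\,X^W, \qquad
K = \begin{pmatrix} f & 0 & p_x \\ 0 & f & p_y \\ 0 & 0 & 1 \end{pmatrix}
$$

where $(p_x, p_y)$ is the principal point offset of 1.9.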
1.11 Pixel coordinates

1.12 Calibration matrix

1.13 Coordinate systems: Intrinsics

1.14 Coordinate systems summary

1.15 Orthographic projection

Because z = z' = f'.

1.16 Camera calibration

Given the coordinates X_i and x_i of n points, determine the camera parameters.

Dot product: (x1, y1, z1) · (x2, y2, z2) = x1x2 + y1y2 + z1z2
Cross product: (x1, y1, z1) × (x2, y2, z2) = (y1z2 − z1y2, z1x2 − x1z2, x1y2 − y1x2)
p_1^T is a 1×4 matrix (a row of P), the projection matrix P is 3×4, and X_i is a 4×1 homogeneous coordinate vector.

p_1 is a 4×1 matrix.
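Each correspondence $x_i \leftrightarrow X_i$ then gives two linear equations in the entries of P (a standard derivation, using the row notation above):

$$
u_i = \frac{p_1^T X_i}{p_3^T X_i}, \quad v_i = \frac{p_2^T X_i}{p_3^T X_i}
\;\Longrightarrow\;
p_1^T X_i - u_i\,p_3^T X_i = 0, \qquad p_2^T X_i - v_i\,p_3^T X_i = 0
$$

Stacking the 2n equations gives a homogeneous system $A\,p = 0$, which is solved by the homogeneous least squares of 1.17.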



1.17 Homogeneous least squares




2. Cameras & filtering
2.1 Shrinking the aperture

Less light reaches the sensor; diffraction effects appear.
2.2 Adding a lens



Rays from points at the particular in-focus distance converge sharply; points at other distances project onto a circle of confusion (blur circle).
Thin lens formula:

This can be derived by applying similar triangles twice.

Combining the two equations from the previous figure gives the result.
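The combined result is the thin lens equation (sign conventions differ between texts; here the object distance z and image distance z' are both taken positive on their own sides of the lens):

$$
\frac{1}{z'} + \frac{1}{z} = \frac{1}{f}
$$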
2.3 Depth of field

Changing the aperture size changes the depth of field.
2.4 Field of view


The field of view depends on the focal length and on the size of the camera retina (sensor).
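For a sensor of width d, the usual relation is

$$
\varphi = 2\arctan\!\left(\frac{d}{2f}\right)
$$

so a shorter focal length or a larger sensor gives a wider field of view.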

Chromatic aberration:

The lens has a different refractive index for different wavelengths -> causes colored fringes.
Spherical aberration:

Rays farther from the optical axis are focused closer to the lens.
This makes the image blurry away from the centre, towards the edges.
Vignetting / loss of light

Radial distortion:

Idealized spatial sampling:

2.5 Differences between CCD and CMOS:

A CCD moves the photo-generated charge from pixel to pixel and converts it to a voltage at a single output node;
a CMOS imager converts the charge to a voltage inside each pixel.
(Is the first row on slide 36 the CCD?)
Color perception:

2.6 Cause of color moiré patterns

If the spatial frequency of the sensor (CCD/CMOS) pixels is close to the spatial frequency of stripes in the scene, moiré patterns appear.
2.7 Digital image processing

2.8 Images as functions

[a, b] × [c, d] denotes the Cartesian product (the image domain).
Finally, the mapping is into [0, m]^3 (for a color image).
2.9 Filter
Linear operations:

Convolution:

Rules for computing a convolution:

Additive noise model:


Average Filter:

Gaussian Averaging (Gaussian smoothing):

Implementation:

Write the kernels as [f(0,0), f(0,1), f(0,2)] and [h(0,0), h(1,0), h(2,0)]^T and compute the convolution. Note that in this case the definition reduces to g(i,j) = f(0,j)h(i,0).
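A minimal sketch of this separable implementation, assuming a grayscale image stored as a NumPy array (here the same 1D kernel is used for the horizontal and vertical passes, as for a Gaussian; the kernel values are illustrative):

```python
import numpy as np
from scipy.ndimage import convolve1d

def separable_filter(image, kernel_1d):
    """Convolve with the 2D outer-product kernel via two 1D passes (rows, then columns)."""
    tmp = convolve1d(image, kernel_1d, axis=1, mode="reflect")   # horizontal pass
    return convolve1d(tmp, kernel_1d, axis=0, mode="reflect")    # vertical pass

# 3-tap binomial kernel, a cheap approximation of a small Gaussian.
kernel = np.array([1.0, 2.0, 1.0])
kernel /= kernel.sum()

image = np.random.rand(64, 64)
smoothed = separable_filter(image, kernel)
```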
3. Edge detection
3.1 Binary Images

3.2 Morphological filters — Side note

3.3 Gaussian pyramids


3.4 Subsampling

3.5 Aliasing



3.6 What are “edges” (1D)

3.7 Edges

Find the places where the derivative value is large.
3.8 Edges & Derivatives

Find where the first derivative reaches an extremum, i.e., the zero crossings of the second derivative. The extrema of the first derivative (zeros of the second derivative) are where the pixel values change most rapidly, and the places of most rapid change are the edges.
3.9 Compute Derivatives

The kernels [1, -1] and [1, 0, -1] are the coefficients of f(x+h), f(x), f(x-h) in the finite-difference approximations of the derivative.
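The correspondence between the difference quotients and the kernels, written out with pixel spacing h = 1:

$$
\frac{\partial f}{\partial x} \approx \frac{f(x+h) - f(x)}{h} \;\leftrightarrow\; [\,1 \;\; {-1}\,],
\qquad
\frac{\partial f}{\partial x} \approx \frac{f(x+h) - f(x-h)}{2h} \;\leftrightarrow\; [\,1 \;\; 0 \;\; {-1}\,]
$$

(up to the factor 1/2 of the central difference).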
3.10 Edge Detection
based on 1st derivative:

Simplification:

3.11 Image scanline


3.12 Implementing 1D edge detection

3.13 Extension to 2D Edge Detection: Partial Derivatives


Sobel filters (derivative in one direction, Gaussian smoothing in the orthogonal direction)
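Written out, the Sobel x-kernel is the outer product of a 1D smoothing kernel and a 1D derivative kernel (sign and orientation conventions vary):

$$
S_x = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & -1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{pmatrix},
\qquad S_y = S_x^T
$$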
3.14 Again: Derivatives and Smoothing

3.15 What is the gradient?



The gradient direction is perpendicular to the edge; the gradient magnitude describes the edge strength.
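In symbols:

$$
\nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right), \qquad
\|\nabla f\| = \sqrt{f_x^2 + f_y^2}, \qquad
\theta = \arctan\!\left(\frac{f_y}{f_x}\right)
$$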
3.16 2D Edge Detection


The scale of the filter affects the derivative estimates and the semantics of the edges that can be recovered.
3.17 ‘Optimal’ Edge Detection: Canny

3.18 The Canny edge detector

3.19 Non-maximum suppression

Find the local maxima (along the gradient direction); these are the "true" edges.
3.20 Benefits of thresholding

3.21 Hysteresis

Mark the following pixels as edges (see the sketch after this list):
*1 their gradient magnitude exceeds the second, lower threshold
*2 they are connected to pixels above the higher threshold
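A minimal sketch of this hysteresis step, assuming the gradient magnitude is already available as a NumPy array (the threshold values in the usage comment are illustrative):

```python
import numpy as np
from collections import deque

def hysteresis(magnitude, low, high):
    """Keep pixels above `low` that are 8-connected to some pixel above `high`."""
    strong = magnitude >= high
    weak = magnitude >= low
    edges = strong.copy()

    queue = deque(zip(*np.nonzero(strong)))   # start from all strong pixels
    h, w = magnitude.shape
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and weak[rr, cc] and not edges[rr, cc]:
                    edges[rr, cc] = True      # grow the edge through weak pixels
                    queue.append((rr, cc))
    return edges

# edge_map = hysteresis(grad_mag, low=0.05, high=0.15)
```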
3.22 Edges & Derivatives…

3.23 Compute 2nd order derivatives

3.24 The Laplacian

3.25 Second Derivative of Gaussian

3.26 1D edge detection

3.27 Approximating the Laplacian

3.28 Edge Detection with Laplacian

3.29 Laplacian pyramid


4. PCA
4.1 Images as Vectors

4.2 Images as Points

n×m can be understood as stacking the rows/columns of a 2D grayscale image into a single column vector.
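A small sketch of 4.1–4.5: flattening an n×m grayscale patch into a vector and comparing patches by dot product and SSD (shapes and data are illustrative):

```python
import numpy as np

def to_vector(patch):
    """Stack the rows/columns of a 2D patch into one long vector."""
    return patch.reshape(-1).astype(float)

def ssd(a, b):
    """Sum of squared differences between two equally sized patches."""
    d = to_vector(a) - to_vector(b)
    return float(d @ d)

def cosine_similarity(a, b):
    """Normalized dot product between two patches viewed as vectors/points."""
    va, vb = to_vector(a), to_vector(b)
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

patch1 = np.random.rand(16, 16)
patch2 = np.random.rand(16, 16)
print(ssd(patch1, patch2), cosine_similarity(patch1, patch2))
```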
4.3 Template Matching

4.4 Dot Product

4.5 SSD(Sum of Squared Differences) Matching

4.6 Subspace Methods

4.7 Linear Dimensionality Reduction


4.8 Goal


4.9 Principal Component Analysis

4.10 Decomposition

||·|| denotes the Frobenius norm. Reference: https://www.cnblogs.com/lpgit/p/9734701.html
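For reference:

$$
\|A\|_F = \sqrt{\sum_i \sum_j a_{ij}^2} = \sqrt{\operatorname{tr}\!\left(A^T A\right)}
$$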
4.11 Minimizing the Error

$\tilde{x}$ denotes the measured value.

Minimizing the reconstruction error is equivalent to maximizing the variance of the projected values.
Explanation:
Along the principal components, which are an orthonormal eigenbasis of the covariance matrix, the variance is equal to the corresponding eigenvalue of the covariance matrix.
So, the total variance of the PCA projections with dimension k is equal to the trace (sum of eigenvalues) of the covariance matrix of the PCA projections, which, if you pick the right principal components according to the highest eigenvalues, equals the sum of the top k eigenvalues of the original covariance matrix.
In short, the total variance of the PCA projections equals the trace of that covariance matrix (the sum of its diagonal entries).

$\bar{x}$ is 0 (the data are assumed to be zero-mean).
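In formulas, assuming zero-mean data and orthonormal principal directions $u_1,\dots,u_k$ (eigenvectors of the covariance matrix $\Sigma$):

$$
\operatorname{Var}\!\left(u_j^T x\right) = u_j^T \Sigma\, u_j = \lambda_j,
\qquad
\sum_{j=1}^{k} \operatorname{Var}\!\left(u_j^T x\right) = \operatorname{tr}\!\left(U_k^T \Sigma\, U_k\right) = \sum_{j=1}^{k} \lambda_j
$$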
4.12 Formulation of problem

4.13 Mean & Variance


Note the form of the covariance matrix here.

4.14 Maximizing variance

Use Lagrange multipliers to find the extremum.
Why is the largest eigenvalue equal to the maximal variance?
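The Lagrange-multiplier calculation, which also answers the question above:

$$
\max_u\; u^T \Sigma u \;\;\text{s.t.}\;\; u^T u = 1, \qquad
L(u,\lambda) = u^T \Sigma u - \lambda\,(u^T u - 1), \qquad
\frac{\partial L}{\partial u} = 2\Sigma u - 2\lambda u = 0 \;\Rightarrow\; \Sigma u = \lambda u
$$

So u must be an eigenvector, and the attained variance is $u^T \Sigma u = \lambda$; picking the largest eigenvalue therefore maximizes the variance.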
4.15 Reminder: Eigendecomposition


4.16 Principal Component Analysis


4.17 Choosing D

4.18 Rewrite PCA

4.19 Singular Value Decomposition

4.20 How to use SVD for PCA
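A minimal sketch of PCA via the SVD of the centred data matrix (the row-per-sample layout is an assumption, not necessarily the slides' convention):

```python
import numpy as np

def pca_svd(X, k):
    """Top-k principal directions of X (N samples x D dimensions) via SVD."""
    mean = X.mean(axis=0)
    Xc = X - mean                               # centre the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                         # principal directions (rows)
    projected = Xc @ components.T               # k-dimensional coordinates
    explained_variance = S[:k] ** 2 / (X.shape[0] - 1)
    return mean, components, projected, explained_variance

X = np.random.randn(100, 10)
mean, W, Z, var = pca_svd(X, k=3)
reconstruction = mean + Z @ W                   # low-rank approximation of X
```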


4.21 EigenFaces

4.22 Face Recognition

4.23 Intra- & Extra-Personal Subspaces

4.24 EigenFeatures


4.25 Naïve View-Based Approach

4.26 View-Based Approach


4.27 Mouth Space



4.28 Discriminative enough?


4.29 Mouth Space

4.30 Simple Search Strategy

5. multivariate
5.1 Statistics Review: Univariate


Variance and sample variance.
5.2 Statistics Review: Multivariate


Covariance is defined as:

The derivation is as follows:
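Presumably the standard expansion:

$$
\operatorname{Cov}(X, Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big] = E[XY] - \mu_X\,\mu_Y
$$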

5.3 Statistical Correlation

5.4 Reminder: Covariance Matrix

5.5 Recall: Outer Product

5.6 Correlated?



5.7 Recall: Basis Representations

In mathematics, the Kronecker delta $\delta_{ij}$ (also called the Kronecker delta function), named after the German mathematician Leopold Kronecker, is a function of two variables, usually two integers: its value is 1 if the two arguments are equal and 0 otherwise.

The term $a_j u_j$ is extracted from the sum $\sum_i a_i u_i$ (using the orthonormality of the basis).
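Written out:

$$
x = \sum_i a_i u_i, \qquad u_j^T u_i = \delta_{ij}
\;\Rightarrow\;
u_j^T x = \sum_i a_i\, u_j^T u_i = a_j
$$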
6. interest points
6.1 Finding Corresponding Points

6.2 Interest point detection

6.3 Interest point detection: Derivation

6.4 Can we approximate this?

6.5 Interest point detection: Derivation

u and v are the shifts to the right and upward, respectively.





6.6 The Harris operator

$\lambda_1$ and $\lambda_2$ are the diagonal entries of the diagonalized second-moment matrix, i.e., its eigenvalues.
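A minimal Harris-response sketch assuming a grayscale float image; the Gaussian window σ and the constant k = 0.04 are conventional choices, not values taken from the slides:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(image, sigma=1.5, k=0.04):
    """Harris corner response f = det(M) - k * trace(M)^2 per pixel."""
    Iy, Ix = np.gradient(image.astype(float))    # image derivatives
    # Entries of the second-moment matrix M, averaged over a Gaussian window.
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det_M = Sxx * Syy - Sxy ** 2                 # = lambda_1 * lambda_2
    trace_M = Sxx + Syy                          # = lambda_1 + lambda_2
    return det_M - k * trace_M ** 2

# Corners: threshold the response, then apply non-maximum suppression.
# corners = harris_response(img) > 1e-4
```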



6.7 Harris detector example

6.8 f value (red high, blue low)

6.9 Threshold (f > value)

6.10 Harris interest points

6.11 Other interest point detectors

6.12 Hessian interest points

6.13 Geometric Transformations

6.14 Rotation Invariance

6.15 Automatic scale selection

6.16 Scale Space

6.17 Automatic scale selection

6.18 Scale invariance – Normalization


6.19 Laplacian of Gaussian (LoG)


6.20 Harris-Laplace (HarLap)




6.21 Other popular interest points

7. Matching single view
7.1 Evaluation




7.2 Local Descriptors / Features

7.3 Local Descriptors
- Distinctiveness:

- Invariance:

- Robustness:

Summary:



7.4 Scale Invariant Feature Transform (SIFT)

7.5 SIFT Descriptor

7.6 Properties of SIFT

7.7 Shape Context

7.8 Evaluating the results

7.9 True/false positives

7.10 Evaluating the results


7.11 Projecting a planar object

7.12 Geometric intuition

7.13 Projecting a plane


7.14 Rotating camera

7.15 Estimating the Homography


Derivation of the 2N×9 matrix and the 9×1 vector:
https://blog.csdn.net/lyhbkz/article/details/82254893



Take the SVD of the stacked constraint matrix, $A = U \Sigma V^T$; the column of V corresponding to the smallest singular value gives $\bar{H}$ (after reshaping the 9-vector into a 3×3 matrix).
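A sketch of this DLT estimate, assuming the point arrays are N×2 with N ≥ 4 (point normalization, which a careful implementation would add, is omitted):

```python
import numpy as np

def estimate_homography(pts_src, pts_dst):
    """Estimate H with x_dst ~ H x_src from point correspondences via SVD."""
    rows = []
    for (x, y), (u, v) in zip(pts_src, pts_dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    A = np.asarray(rows)                # the 2N x 9 constraint matrix
    _, _, Vt = np.linalg.svd(A)
    h = Vt[-1]                          # right singular vector of the smallest singular value
    return h.reshape(3, 3) / h[-1]      # reshape the 9-vector into H (scale fixed by H[2,2] = 1)
```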
7.16 Robust Estimation

8. two view
8.1 Robust Estimation




The numbers in the white cells of the table are the required numbers of iterations.
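Assuming the table is the usual RANSAC sample-count table, its entries follow from the standard formula (s = sample size, ε = outlier ratio, p = probability of drawing at least one all-inlier sample, typically 0.99):

$$
N \ge \frac{\log(1 - p)}{\log\!\big(1 - (1 - \varepsilon)^{s}\big)}
$$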
8.2 Estimating the homography

8.3 Application: AutoStitch




8.4 Next goal: Recovery of 3D structure


8.5 What is Stereo (Vision)?

8.6 Depth from Stereo


8.7 Triangulation



8.8 Triangulation: Geometric midpoint

8.9 Triangulation: Linear approach

8.10 Side note

8.11 Triangulation: non-linear approach


8.12 Epipolar Geometry






8.13 Example: Converging Cameras

8.14 Example: Motion Parallel to Image Plane

8.15 Epipolar Geometry


t is the translation and R is the rotation matrix.
$O_1 O_2$ is t: translating $O_1$ gives $O_2$.
$O_2 p_2$ is $R p_2$: $O_1 p_2$ rotated becomes $O_2 p_2$.
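With this convention, the coplanarity of $p_1$, $t$ and $R p_2$ gives the epipolar constraint (standard derivation):

$$
p_1^T\,\big(t \times R\,p_2\big) = 0
\;\Longleftrightarrow\;
p_1^T E\, p_2 = 0, \qquad E = [t]_\times R
$$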


9. stereo
9.1 Binocular Stereo: Parallel cameras

Derivation of the result of $t \times R$:
Compute the cross-product (skew-symmetric) matrix:
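The cross product with t can be written as multiplication by a skew-symmetric matrix:

$$
t \times v = [t]_\times\, v, \qquad
[t]_\times =
\begin{pmatrix}
0 & -t_z & t_y \\
t_z & 0 & -t_x \\
-t_y & t_x & 0
\end{pmatrix}
$$

so that $E = [t]_\times R$; for parallel cameras ($R = I$, $t = (T, 0, 0)^T$) this is easy to write out.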

9.2 Uncalibrated Cameras



9.3 Fundamental matrix


9.4 Estimating the fundamental matrix

9.5 Eight-point algorithm



9.6 Estimating the fundamental matrix

9.7 Example





9.8 What can we do with 2 views?



9.9 Stereo Vision — Easier problem

9.10 Binocular Stereo

9.11 Binocular Stereo — Disparity








9.12 Triangulation

x is expressed in the coordinate frame of the left camera O;
$x'$ is expressed in the coordinate frame of the right camera $O'$.
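For the rectified, parallel-camera setup with baseline B and focal length f, similar triangles give the standard disparity–depth relation (this anticipates 9.11 and 9.19):

$$
d = x - x', \qquad Z = \frac{f\,B}{d}
$$

so nearby points have large disparity and distant points have small disparity.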
9.13 Stereo rectification



9.14 Stereo Correspondence

9.15 Correspondence problem


9.16 Normalized Correlation

9.17 Even simpler: Sum of Squared (Pixel) Differences

9.18 Window-based matching

9.19 Convert disparity to depth

9.20 Window-based matching

9.21 Influence of window size

9.22 The similarity constraint

9.23 Limitations of similarity constraint

10. Motion
10.1 Non-local constraints




10.2 Scanline stereo

10.3 "Shortest path" for scanline stereo

10.4 Coherent stereo on 2D grid

10.5 Motion Field

10.6 Optical Flow

10.7 Optical Flow Field

10.8 Optical Flow Estimation

10.9 How do we compute Optical Flow?

10.10 Brightness Constancy


10.11 Spatial Coherence

10.12 Minimize Brightness Difference

10.13 Sum of Squared Differences

10.14 Simple Flow Estimation Algorithm

10.15 Can we approximate this?


10.16 Optical Flow Constraint Equation


10.17 Notation

10.18 OFCE

10.19 Aperture Problem







10.20 Multiple Constraints


10.21 Area-Based Flow

10.22 Optimization




10.23 Structure tensor

10.24 Solving for u

Here $\mathbf{u}$ denotes the flow vector $(u, v)^T$.
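The area-based least-squares problem of 10.21–10.24 then leads to the standard Lucas–Kanade 2×2 system, with $I_x, I_y, I_t$ the image derivatives summed over the window:

$$
\begin{pmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{pmatrix}
\begin{pmatrix} u \\ v \end{pmatrix}
= -\begin{pmatrix} \sum I_x I_t \\ \sum I_y I_t \end{pmatrix}
$$

The 2×2 matrix is the structure tensor: it is well conditioned in textured areas (two large eigenvalues) and singular along a single edge (the aperture problem).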
10.25 SSD Surface - Textured area

10.26 SSD Surface - Single Edge

10.27 SSD Surface - Surface Boundary

11. Recognition
11.1 Image registration

11.2 Image Warping

11.3 Forward Warping


11.4 Inverse Warping


11.5 Interpolation

11.6 Warping

11.7 Image registration revisited

11.8 Dense LK Flow

11.9 Iterative Estimation

11.10 Iterative Estimation





11.11 Coarse-to-fine Estimation

11.12 Is that a good result?

11.13 Aside: How to get the ground truth?

11.14 What is the problem?

11.15 Shift Gears: Object Recognition

11.16 Example of Recognition & Localization

11.17 Recognition problems

11.18 Search and Recognition

11.19 Naive View-Based Approach

11.20 From faces and mouths to objects

11.21 Appearance-Based Instance Recognition

11.22 Challenges

11.23 Global Representation

11.24 View-Based Approaches

11.25 Feature representation

11.26 Recognition using Histograms


11.27 Histogram Comparison




11.28 Recognition using Histograms

11.29 Color Histograms

11.30 Discussion: Color Histograms

11.31 Generalization of the Idea


11.32 Multidimensional Histograms

11.33 Recognition Results


11.34 Summary

11.35 “Bag of words” Model

11.36 Analogy to Documents

11.37 Visual word distributions

11.38 Bag of Words

11.39 Bag-of-Words Model: Overview

12. bow
12.1 BoW-1. Feature detection and representation


12.2 BoW-2. Codeword dictionary formation


12.3 Vector quantization

12.4 K-Means




12.5 BoW-2. Codeword dictionary formation

12.6 Image patch examples of codewords

12.7 BoW-3. Image representation

12.8 Next: Actual Recognition

12.9 Excursion into Machine Learning

12.10 Classification vs. Regression


12.11 General Paradigm

12.12 Bayesian Decision Theory







Bayes optimal classifier:
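In standard form:

$$
\hat{y}(x) = \arg\max_k\; p(C_k \mid x) = \arg\max_k\; p(x \mid C_k)\, p(C_k)
$$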

12.13 Discriminative vs. generative

12.14 Discriminant Functions

12.15 Which Hyperplane is Best? and Why?

12.16 Support Vector Machines

12.17 Toward Neural Networks

12.18 Sigmoid

12.19 Multi-Class Network

12.20 Learning

12.21 Gradient Descent

When k = l the summation symbol drops out; the derivative is taken with respect to the weight multiplying $x_j$, which is why the result is $x_j$.
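The step written out (with a single weight index; the slides' double-indexed weights work the same way):

$$
\frac{\partial}{\partial w_j} \sum_k w_k x_k
= \sum_k \delta_{kj}\, x_k
= x_j
$$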

12.22 Stochastic Gradient Descent

12.23 Gradient Descent

For the derivation, see page 44 of the slides.
13. CNN
13.1 Bag-of-Words Model: Overview

13.2 Next: Actual Recognition

13.3 Machine Learning: General Paradigm

13.4 Discriminative vs. generative

13.5 Discriminant Functions

13.6 Toward Neural Networks

13.7 Sigmoid

13.8 Multi-Class Network

13.9 Multi-Class Network for Classification

13.10 Learning

13.11 Gradient Descent






13.12 Stochastic Gradient Descent

13.13 Multi-Layer Perceptron

$z_0 = 1$ is the bias term.
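With the bias absorbed as $z_0 = 1$, a unit of the MLP computes (standard notation, not necessarily the slides' exact indexing):

$$
a_j = \sum_{i=0}^{D} w_{ji}\, z_i = w_{j0} + \sum_{i=1}^{D} w_{ji}\, z_i, \qquad z_j = \sigma(a_j)
$$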


13.14 Learning with Gradient Descent


13.15 Backpropagation Algorithm

13.16 Robot Navigation (a.k.a. “deep driving”)

13.17 Bag-of-Words Model: Done!

13.18 Summary & Discussion

13.19 What about spatial information?

13.20 Problem with bag-of-words

13.21 Model: Parts and Structure

13.22 Part-based Representation

13.23 The correspondence problem

13.24 Pyramid match kernel

13.25 Convolutional Neural Networks

13.26 CNN Architecture

13.27 History: Multistage Hubel-Wiesel Architecture

13.28 Overview of ConvNets

13.29 Krizhevsky et al. [NIPS 2012]

13.30 Using Features on Other Datasets

13.31 Caltech 256


13.32 Components of Each Layer

13.33 Filtering

13.34 Non-Linearity


13.35 Pooling

13.36 Role Of Pooling

13.37 Receptive Field

13.38 Components of Each Layer

13.39 Batch Normalization

13.40 Loss function

13.41 Initialization

13.42 How to Choose Architecture

13.43 AlexNet Architecture





13.44 Tapping off Features at each Layer

13.45 Translation (Vertical)

13.46 Scale Invariance

13.47 Rotation Invariance

13.48 Very Deep Models: VGG

13.49 Very Deep Models: GoogLeNet

13.50 Residual Networks

13.51 Training Big ConvNets

13.52 Annealing of Learning Rate

13.53 Adversarial Examples

13.54 What if it does not work?

13.55 Deep Learning for Computer Vision
