  • Computer Vision I Study Notes

    1. Image formation

    1.1 Pinhole camera

    1.2 Projection properties


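    Many-to-one
    Points map to points
    Lines map to lines
    Planes map to planes
    Focal plane: the plane that passes through the first focal point (the front, or object-side, focal point) and is perpendicular to the system's principal optical axis is called the first focal plane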

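    1.3 Perspective projection


    Similarly, x/x' = z/f'

    1.4 Perspective projection matrix


    Because the 3D scene is projected onto a 2D plane, the z coordinate is not needed.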
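
    A minimal sketch of this projection in homogeneous coordinates, assuming the usual 3×4 matrix with the focal length f' on the diagonal. The function name `project` and the sample values are illustrative, not taken from the slides.

```python
def project(point, f):
    """Perspective projection of a camera-frame point (x, y, z) onto the image plane at distance f."""
    x, y, z = point
    # Homogeneous image point: [f 0 0 0; 0 f 0 0; 0 0 1 0] applied to (x, y, z, 1)
    xh, yh, w = f * x, f * y, z
    if w == 0:
        raise ValueError("point lies in the plane of the pinhole (z = 0)")
    # Dividing by the third component gives x' = f x / z and y' = f y / z
    return (xh / w, yh / w)


image_point = project((2.0, 4.0, 10.0), f=5.0)
print(image_point)
```

    Only two numbers come back: the depth z is used for the division and then discarded.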

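    1.5 Coordinate systems


    The world coordinate system is fixed and is a standard right-handed coordinate system: thumb = x-axis, index finger = y-axis, middle finger = z-axis.
    By contrast, the camera coordinate system can be placed arbitrarily (rotated and translated) relative to it.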

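    1.6 Camera rotation and translation



    X^W denotes the homogeneous world coordinates; X^W with a tilde denotes the inhomogeneous world coordinates.

    1.7 Coordinate systems: Extrinsics

    1.8 Camera coordinate system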


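    Principal axis: the line that starts at the camera center (point C) and is perpendicular to the image plane.
    Normalized coordinate system: the camera center is the origin and the principal axis is the z-axis.
    Principal point: the point where the principal axis intersects the image plane (point P).

    1.9 Principal point offset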


    1.10 Recap: pinhole camera model


    X^I denotes the image coordinates
    and X^C denotes the camera coordinates.

    K is the calibration matrix.
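
    The chain from world coordinates to pixel coordinates can be sketched as follows. This is an illustration under assumptions: the values in K (focal length 800 px, principal point at (320, 240)), the identity rotation, and the function name `project_point` are all made up for the example.

```python
import numpy as np


def project_point(K, R, t, X_world):
    """Map an inhomogeneous world point to pixel coordinates."""
    X_cam = R @ X_world + t      # extrinsics: world frame -> camera frame
    x_img = K @ X_cam            # intrinsics: camera frame -> homogeneous pixel coordinates
    return x_img[:2] / x_img[2]  # divide by the third component


K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)     # camera axes aligned with the world axes (assumption)
t = np.zeros(3)   # camera center at the world origin (assumption)

pixel = project_point(K, R, t, np.array([0.1, -0.05, 2.0]))
print(pixel)
```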

    1.10 Pixel coordinates

    1.11 Calibration matrix

    1.12 Coordinate systems: Intrinsics

    1.13 Coordinate systems: summary

    1.14 Orthographic projection


    because z = z' = f'

    1.15 Camera calibration


    Given the coordinates X_i and x_i of n points, estimate the camera parameters.

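    Dot product: (x1, y1, z1) · (x2, y2, z2) = x1x2 + y1y2 + z1z2
    Cross product: (x1, y1, z1) × (x2, y2, z2) = (y1z2 - z1y2, z1x2 - x1z2, x1y2 - y1x2)
    p1^T is a 1×4 matrix (row vector), P is a 3×4 matrix, and X_i is a 4×1 matrix of homogeneous coordinates.

    p1 is a 4×1 matrix (column vector).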
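
    One way to turn these dimensions into an estimator is the direct linear transform: each correspondence contributes two rows derived from x_i × (P X_i) = 0, and the stacked system is solved by homogeneous least squares. The sketch below assumes noise-free synthetic data; `calibrate_dlt`, `P_true`, and the sample points are hypothetical names and values chosen for the example.

```python
import numpy as np


def calibrate_dlt(X_world, x_image):
    """Estimate the 3x4 projection matrix P (up to scale) from n >= 6 correspondences."""
    rows = []
    for (X, Y, Z), (u, v) in zip(X_world, x_image):
        Xh = [X, Y, Z, 1.0]
        # Two independent rows of x_i x (P X_i) = 0, unknowns ordered as (p1, p2, p3)
        rows.append([0.0] * 4 + [-c for c in Xh] + [v * c for c in Xh])
        rows.append(Xh + [0.0] * 4 + [-u * c for c in Xh])
    A = np.array(rows)
    # Homogeneous least squares: minimise ||A p|| subject to ||p|| = 1
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)


P_true = np.array([[800.0, 0.0, 320.0, 10.0],
                   [0.0, 800.0, 240.0, 20.0],
                   [0.0, 0.0, 1.0, 5.0]])
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1),
          (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 2)]
pixels = []
for p in points:
    xh = P_true @ np.array([*p, 1.0])
    pixels.append((xh[0] / xh[2], xh[1] / xh[2]))

P_est = calibrate_dlt(points, pixels)
P_est = P_est * (P_true[2, 3] / P_est[2, 3])  # fix the free scale for comparison
```

    With real, noisy measurements the points would normally be normalised first; that step is left out here to keep the sketch short.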


    1.16 Homogeneous least squares




    2. Cameras filtering

    2.1 Shrinking the aperture


    Less light gets through; diffraction effects appear.

    2.2 Adding a lens




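    Points at one specific distance are in focus; all other points project to a circle of confusion.
    Thin lens formula:

    It can be derived by applying similar triangles twice.

    Combining the two equations from the previous figure gives the result.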
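
    For reference, the two similar-triangle relations and their combination can be written out as below. The symbols are assumptions, since the slide figure is not reproduced in these notes: y and y' are the object and image heights, z and z' the (unsigned) object and image distances from the lens, and f the focal length.

```latex
\begin{align}
\frac{y'}{y} &= \frac{z'}{z}
  && \text{(ray through the lens center)} \\
\frac{y'}{y} &= \frac{z' - f}{f}
  && \text{(ray parallel to the optical axis, bent through the focal point)} \\
\frac{z'}{z} &= \frac{z'}{f} - 1
  \;\Longrightarrow\;
  \frac{1}{f} = \frac{1}{z} + \frac{1}{z'}
\end{align}
```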

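    2.3 Depth of field


    Changing the aperture size changes the depth of field.

    2.4 Field of view



    The field of view (FOV) depends on the focal length and on the size of the camera retina (the sensor).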
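
    A small sketch of that dependence, assuming the standard pinhole relation FOV = 2·arctan(d / 2f), where d is the sensor size and f the focal length. The function name and the sample sensor and lens values are illustrative.

```python
import math


def field_of_view_deg(sensor_size_mm, focal_length_mm):
    """Angular field of view for a sensor of the given size behind a lens of the given focal length."""
    return math.degrees(2.0 * math.atan(sensor_size_mm / (2.0 * focal_length_mm)))


fov_wide = field_of_view_deg(36.0, 18.0)    # short focal length: wide view
fov_tele = field_of_view_deg(36.0, 200.0)   # long focal length: narrow view
print(fov_wide, fov_tele)
```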

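    Chromatic aberration:

    A lens has a different refractive index for different wavelengths, which causes color fringing.
    Spherical aberration:

    Rays farther from the optical axis are focused closer to the lens.
    This blurs the image, both at the center and toward the edges.
    Vignetting / light falloff

    Radial distortion:

    Idealized spatial sampling:

    2.5 Differences between CCD and CMOS: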


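    A CCD moves the photogenerated charge from pixel to pixel and converts it to a voltage at a single output node;
    a CMOS imager converts charge to voltage inside each pixel.
    Open question: is the first row on slide 36 the CCD?
    Color sensing:

    2.6 Cause of color moiré


    If the spatial frequency of the sensor's (CCD or CMOS) pixels is close to the spatial frequency of stripes in the image, moiré patterns appear.

    2.7 Digital image processing

    2.8 Images as functions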


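    [a,b]×[c,d] denotes the Cartesian product.
    The function finally maps into [0,m]^3.

    2.9 Filter

    Linear operations:

    Convolution:

    Rules for computing a convolution:

    Additive noise model:


    Average Filter:

    Gaussian averaging (Gaussian smoothing):

    Implementation: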

    Write the kernels as [f(0,0), f(0,1), f(0,2)] and [h(0,0), h(1,0), h(2,0)]T and compute the convolution. Note that in this case the definition breaks down to g(i,j) = f(0,j)h(i,0).
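
    The separability claim can be checked numerically. The sketch below uses a [1, 2, 1]/4 kernel as a stand-in for a sampled Gaussian and a hand-written "valid" convolution; the helper name `convolve2d_valid` and the test image are assumptions made for the example.

```python
import numpy as np


def convolve2d_valid(image, kernel):
    """Plain 2D convolution (kernel flipped), evaluated only where the kernel fits entirely."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out


row = np.array([[1.0, 2.0, 1.0]]) / 4.0        # f: 1x3 row kernel
col = np.array([[1.0], [2.0], [1.0]]) / 4.0    # h: 3x1 column kernel
g = col @ row                                   # g(i, j) = f(0, j) * h(i, 0)

image = np.arange(36, dtype=float).reshape(6, 6) ** 2
two_pass = convolve2d_valid(convolve2d_valid(image, row), col)  # two 1D passes
one_pass = convolve2d_valid(image, g)                           # one 2D pass
```

    Two 1D passes cost 2k multiplications per pixel for a k×k kernel instead of k², which is why separable kernels are usually applied this way.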

    3. Edge detection

    3.1 Binary Images

    3.2 Morphological filters — Side note

    3.3 Gaussian pyramids


    3.4 Subsampling

    3.5 Aliasing



    3.6 What are “edges” (1D)

    3.7 Edges


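    Find the locations where the derivative is large in magnitude.

    3.8 Edges & Derivatives


    Find where the first derivative attains an extremum, i.e., the zero crossings of the second derivative. The extrema of the first derivative (zeros of the second derivative) are where the pixel values change most sharply, and the places where they change most sharply are the edges.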

    3.9 Compute Derivatives


    [1 -1] holds the coefficients of f(x+h) and f(x); [1 0 -1] holds the coefficients of f(x+h), f(x), and f(x-h).
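
    The two coefficient masks correspond to the forward and central difference. A minimal sketch, with function names chosen for the example and h = 1 as for neighboring pixels:

```python
def forward_difference(f, x, h=1.0):
    # Mask [1, -1] applied to f(x + h), f(x)
    return (f(x + h) - f(x)) / h


def central_difference(f, x, h=1.0):
    # Mask [1, 0, -1] applied to f(x + h), f(x), f(x - h)
    return (f(x + h) - f(x - h)) / (2.0 * h)


def square(x):
    return x * x


fwd = forward_difference(square, 3.0)
ctr = central_difference(square, 3.0)
print(fwd, ctr)
```

    For f(x) = x² the true derivative at x = 3 is 6; the central difference recovers it exactly, while the forward difference is biased toward x + h.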

    3.10 Edge Detection

    based on 1st derivative:

    Simplification:

    3.11 Image scanline


    3.12 Implementing 1D edge detection

    3.13 Extension to 2D Edge Detection: Partial Derivatives



    Sobel filters (Gaussian smoothing in opposite direction)

    3.14 Again: Derivatives and Smoothing

    3.15 What is the gradient?




    The gradient direction is perpendicular to the edge, and the gradient magnitude describes the edge strength.
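
    A small illustration on a synthetic vertical step edge, using the Sobel masks as derivative estimates. The helper `gradient_at` and the toy image are assumptions made for the example; the masks are applied as a correlation, without flipping.

```python
import numpy as np

SOBEL_X = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])
SOBEL_Y = SOBEL_X.T


def gradient_at(image, r, c):
    """Gradient estimate (gx, gy) at an interior pixel (row r, column c)."""
    patch = image[r - 1:r + 2, c - 1:c + 2]
    return float(np.sum(patch * SOBEL_X)), float(np.sum(patch * SOBEL_Y))


# Vertical step edge: dark on the left, bright on the right
image = np.zeros((5, 6))
image[:, 3:] = 1.0

gx, gy = gradient_at(image, 2, 3)
magnitude = np.hypot(gx, gy)
direction_deg = np.degrees(np.arctan2(gy, gx))
print(gx, gy, magnitude, direction_deg)
```

    The gradient points along the x-axis, across the vertical edge, and its length reflects the height of the step.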

    3.16 2D Edge Detection



    The scale (size) of the smoothing filter affects the derivative estimates and the semantics of the recovered edges.

    3.17 ‘Optimal’ Edge Detection: Canny

    3.18 The Canny edge detector

    3.19 Non-maximum suppression


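    Find the local maxima; these are the "true" edges.

    3.20 Benefits of thresholding

    3.21 Hysteresis


    Also mark as edges the pixels that satisfy both conditions:
    *1 their gradient magnitude exceeds a second, lower threshold
    *2 they are connected to a pixel above the higher threshold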
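
    The two conditions can be sketched as a breadth-first search that starts from the strong pixels and grows into connected weak ones. This is an illustration, not the slides' implementation: the function name, the 8-neighborhood, and the use of >= for the threshold tests are assumptions.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Return a boolean edge map from a 2D list of gradient magnitudes."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seeds: pixels at or above the high threshold
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= high:
                edges[r][c] = True
                queue.append((r, c))
    # Grow into 8-connected neighbors at or above the low threshold
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not edges[nr][nc] and magnitude[nr][nc] >= low):
                    edges[nr][nc] = True
                    queue.append((nr, nc))
    return edges


mag = [[0, 0, 0, 0, 0],
       [9, 5, 5, 0, 5],
       [0, 0, 0, 0, 0]]
edges = hysteresis(mag, low=4, high=8)
```

    In the toy row, the two weak pixels next to the strong one are kept, while the isolated weak pixel on the right is dropped.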

    3.22 Edges & Derivatives…

    3.23 Compute 2nd order derivatives

    3.24 The Laplacian

    3.25 Second Derivative of Gaussian

    3.26 1D edge detection

    3.27 Approximating the Laplacian

    3.28 Edge Detection with Laplacian

    3.29 Laplacian pyramid


    4. PCA

    4.1 Images as Vectors

    4.2 Images as Points


    n*m can be understood as stacking the rows and columns of a 2D grayscale image into a single column vector.

    4.3 Template Matching

    4.4 Dot Product

    4.5 SSD(Sum of Squared Differences) Matching

    4.6 Subspace Methods

    4.7 Linear Dimensionality Reduction


    4.8 Goal


    4.9 Principal Component Analysis

    4.10 Decomposition


    || · || denotes the Frobenius norm. Reference: (https://www.cnblogs.com/lpgit/p/9734701.html#:~:text=Frobenius 范数,简称F,记为||·||F。&text=可用于利用低秩,数尽可能地小。)

    4.11 Minimizing the Error


    \(\tilde{x}\) is the approximation of x.

    Minimizing the error is equivalent to maximizing the variance of the projected data.
    The explanation is as follows:
    Along the principal components, which are an orthonormal eigenbasis of the covariance matrix, the variance is equal to the corresponding eigenvalue of the covariance matrix.
    So, the total variance of the PCA projections with dimension k is equal to the trace (sum of eigenvalues) of the covariance matrix of the PCA projections, which, if you pick the right principal components according to the highest eigenvalues, equals the sum of the top k eigenvalues of the original covariance matrix.
    Put simply, the total variance of the PCA projections equals the trace (the sum of the diagonal entries) of their covariance matrix.
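
    This relation between variance, error, and eigenvalues can be checked numerically. The sketch below uses synthetic Gaussian data; the data, the seed, and the choice k = 2 are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.2])
X = X - X.mean(axis=0)                      # center the data so the mean is zero

cov = X.T @ X / len(X)
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]           # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 2
U = eigvecs[:, :k]                          # top-k principal directions
Z = X @ U                                   # PCA projections
projected_variance = np.trace(Z.T @ Z / len(Z))
reconstruction_error = np.mean(np.sum((X - Z @ U.T) ** 2, axis=1))
```

    The projected variance equals the sum of the top k eigenvalues, the mean squared reconstruction error equals the sum of the discarded ones, and together they add up to the trace of the full covariance matrix.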

    \(\bar{x}\) is 0

    4.12 Formulation of problem

    4.13 Mean & Variance



    Note the covariance matrix here.

    4.14 Maximizing variance


    Use the method of Lagrange multipliers to find the extremum.
    Why is the largest eigenvalue equal to the maximal variance?
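
    A short derivation that answers this question, under assumed notation: C is the covariance matrix of the centered data and u is a unit-length projection direction, so the variance of the projection is \(u^{T} C u\).

```latex
\begin{align}
L(u, \lambda) &= u^{T} C u + \lambda \left(1 - u^{T} u\right) \\
\frac{\partial L}{\partial u} &= 2 C u - 2 \lambda u = 0
  \;\Longrightarrow\; C u = \lambda u \\
u^{T} C u &= \lambda \, u^{T} u = \lambda
\end{align}
```

    Every stationary point is an eigenvector of C, and the variance along it equals its eigenvalue, so the variance is maximal along the eigenvector with the largest eigenvalue.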

    4.15 Reminder: Eigendecomposition


    4.16 Principal Component Analysis


    4.17 Choosing D

    4.18 Rewrite PCA

    4.19 Singular Value Decomposition

    4.20 How to use SVD for PCA


    4.21 EigenFaces

    4.22 Face Recognition

    4.23 Intra- & Extra-Personal Subspaces

    4.24 EigenFeatures


    4.25 Naïve View-Based Approach

    4.26 View-Based Approach


    4.27 Mouth Space



    4.28 Discriminative enough?


    4.29 Mouth Space

    4.30 Simple Search Strategy

    5. multivariate

    5.1 Statistics Review: Univariate



    Variance and sample variance

    5.2 Statistics Review: Multivariate



    Covariance is defined as

    The derivation is as follows

    5.3 Statistical Correlation

    5.4 Reminder: Covariance Matrix

    5.5 Recall: Outer Product

    5.6 Correlated?



    5.7 Recall: Basis Representations


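    In mathematics, the Kronecker delta (also called the Kronecker delta function) \(\delta_{ij}\) is a function of two variables, named after the German mathematician Leopold Kronecker. Its arguments are usually two integers: if they are equal the output is 1, otherwise it is 0.

    The term a_j u_j is pulled out of the sum over a_i u_i.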

    6. interest points

    6.1 Finding Corresponding Points

    6.2 Interest point detection

    6.3 Interest point detection: Derivation

    6.4 Can we approximate this?

    6.5 Interest point detection: Derivation


    u and v are the shifts to the right and upward, respectively.




    6.6 The Harris operator


    \(\lambda_{1}\) and \(\lambda_{2}\) are the diagonal entries of the second-moment matrix after diagonalization, i.e., its eigenvalues.
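
    The eigenvalues do not have to be computed explicitly. The sketch below assumes the harmonic-mean form of the operator, f = λ1·λ2 / (λ1 + λ2) = det(M) / trace(M), which is one common variant; these notes do not record which form the slides use. The function name and the toy gradient lists are illustrative.

```python
def harris_response(gradients):
    """gradients: list of (Ix, Iy) pairs inside the window. Returns det(M) / trace(M)."""
    a = sum(ix * ix for ix, _ in gradients)     # sum of Ix^2
    b = sum(ix * iy for ix, iy in gradients)    # sum of Ix * Iy
    c = sum(iy * iy for _, iy in gradients)     # sum of Iy^2
    det = a * c - b * b                         # equals lambda1 * lambda2
    trace = a + c                               # equals lambda1 + lambda2
    return det / trace if trace > 0 else 0.0


flat = harris_response([(0.0, 0.0)] * 4)                       # no gradients at all
edge = harris_response([(1.0, 0.0)] * 4)                       # gradients in one direction only
corner = harris_response([(1.0, 0.0), (1.0, 0.0),
                          (0.0, 1.0), (0.0, 1.0)])             # gradients in two directions
print(flat, edge, corner)
```

    Only the corner-like window gets a nonzero response, because both eigenvalues must be large for the product to be large.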


    6.7 Harris detector example

    6.8 f value (red high, blue low)

    6.9 Threshold (f > value)

    6.10 Harris interest points

    6.11 Other interest point detectors

    6.12 Hessian interest points

    6.13 Geometric Transformations

    6.14 Rotation Invariance

    6.15 Automatic scale selection

    6.16 Scale Space

    6.17 Automatic scale selection

    6.18 Scale invariance – Normalization


    6.19 Laplacian of Gaussian (LoG)


    6.20 Harris-Laplace (HarLap)




    7. Matching single view

    7.1 Evaluation




    7.2 Local Descriptors / Features

    7.3 Local Descriptors

    1. Distinctiveness:
    2. Invariance:
    3. Robustness:

      Summary:


    7.4 Scale Invariant Feature Transform (SIFT)

    7.5 SIFT Descriptor

    7.6 Properties of SIFT

    7.7 Shape Context

    7.8 Evaluating the results

    7.9 True/false positives

    7.10 Evaluating the results


    7.11 Projecting a planar object

    7.12 Geometric intuition

    7.13 Projecting a plane


    7.14 Rotating camera

    7.15 Estimating the Homography



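    Derivation of the 2N×9 matrix and the 9×1 vector:
    https://blog.csdn.net/lyhbkz/article/details/82254893



    Apply SVD to the 2N×9 matrix (written u here), i.e., \(u = U \Sigma V^T\); the last row of \(V^T\) (the right singular vector for the smallest singular value), reshaped to 3×3, gives \(\bar{H}\).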
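
    A sketch of this estimation on noise-free synthetic correspondences. The function names, `H_true`, and the four sample points are assumptions made for the example; the final division assumes the bottom-right entry of the homography is nonzero.

```python
import numpy as np


def estimate_homography(src, dst):
    """Estimate H with dst ~ H src from N >= 4 point pairs by solving A h = 0 with SVD."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two rows per correspondence, from (u, v, 1) x (H (x, y, 1)) = 0
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
    A = np.array(rows, dtype=float)             # 2N x 9
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)                    # singular vector of the smallest singular value
    return H / H[2, 2]                          # fix the free scale (assumes H[2, 2] != 0)


def apply_homography(H, point):
    xh = H @ np.array([point[0], point[1], 1.0])
    return (xh[0] / xh[2], xh[1] / xh[2])


H_true = np.array([[1.0, 0.2, 3.0],
                   [0.1, 1.5, -1.0],
                   [0.001, 0.002, 1.0]])
src = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dst = [apply_homography(H_true, p) for p in src]

H_est = estimate_homography(src, dst)
```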

    7.16 Robust Estimation

    8. two view

    8.1 Robust Estimation





    The values in the white cells of the table are the numbers of iterations.
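
    Such iteration counts are usually computed from the standard RANSAC formula N = log(1 - p) / log(1 - (1 - e)^s), where s is the sample size, e the outlier ratio, and p the desired probability of drawing at least one outlier-free sample. The sketch assumes the slide table follows this formula with p = 0.99 and 0 < e < 1; the function name is illustrative.

```python
import math


def ransac_iterations(sample_size, outlier_ratio, confidence=0.99):
    """Number of random samples needed so that, with the given confidence, one is outlier-free."""
    clean_sample_prob = (1.0 - outlier_ratio) ** sample_size
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - clean_sample_prob))


n_homography = ransac_iterations(sample_size=4, outlier_ratio=0.5)
n_line = ransac_iterations(sample_size=2, outlier_ratio=0.2)
print(n_homography, n_line)
```

    The count grows quickly with the sample size, which is one reason to use the minimal number of points per model.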

    8.2 Estimating the homography

    8.3 Application: AutoStitch




    8.4 Next goal: Recovery of 3D structure


    8.5 What is Stereo (Vision)?

    8.6 Depth from Stereo


    8.7 Triangulation



    8.8 Triangulation: Geometric midpoint

    8.9 Triangulation: Linear approach

    8.10 Side note

    8.11 Triangulation: non-linear approach


    8.12 Epipolar Geometry






    8.13 Example: Converging Cameras

    8.14 Example: Motion Parallel to Image Plane

    8.13 Epipolar Geometry



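    t is the translation vector and R is the rotation matrix.
    \(O_{1}O_{2}\) is t: translating \(O_1\) by t gives \(O_2\).
    \(O_{2}p_{2}\) (that is, \(Rp_{2}\)) corresponds to \(O_{1}p_{2}\) after rotation.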

    9. stereo

    9.1 Binocular Stereo: Parallel cameras


    Derivation of the result of \(t \times R\):
    Computing the cross-product matrix:
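
    The cross-product matrix turns t × v into a matrix-vector product. A minimal sketch for parallel cameras, assuming R is the identity and the translation is along the x-axis; the baseline value 2.0 and the helper names are illustrative.

```python
def cross_matrix(t):
    """Skew-symmetric matrix [t]x such that [t]x v equals the cross product t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


t = [2.0, 0.0, 0.0]     # translation along the x-axis (parallel cameras)
E = cross_matrix(t)     # with R = I, [t]x R is just [t]x
v = [1.0, 2.0, 3.0]
print(E, matvec(E, v), cross(t, v))
```

    With this t the first row of the matrix is zero, which is what forces corresponding points onto the same image row.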

    9.2 Uncalibrated Cameras



    9.3 Fundamental matrix


    9.4 Estimating the fundamental matrix

    9.5 Eight-point algorithm



    9.6 Estimating the fundamental matrix

    9.7 Example





    9.8 What can we do with 2 views?



    9.9 Stereo Vision — Easier problem

    9.10 Binocular Stereo

    9.11 Binocular Stereo — Disparity








    9.12 Triangulation


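    x is expressed in the coordinate system of the left camera O
    \(x^\prime\) is expressed in the coordinate system of the right camera \(O^\prime\)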
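
    For rectified parallel cameras, the two image coordinates determine depth through the disparity. The sketch assumes the usual relation Z = f·B / (x - x'), with focal length f in pixels and baseline B in metres; the function name and the sample numbers are made up for the example.

```python
def depth_from_disparity(x_left, x_right, focal_length, baseline):
    """Depth of a point seen at column x_left in the left image and x_right in the right image."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_length * baseline / disparity


depth = depth_from_disparity(x_left=12.0, x_right=8.0, focal_length=800.0, baseline=0.5)
print(depth)
```

    Depth is inversely proportional to disparity, so distant points have small disparities and their depth estimates are the most sensitive to matching errors.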

    9.13 Stereo rectification



    9.14 Stereo Correspondence

    9.15 Correspondence problem


    9.16 Normalized Correlation

    9.17 Even simpler: Sum of Squared (Pixel) Differences

    9.18 Window-based matching

    9.19 Convert disparity to depth

    9.20 Window-based matching

    9.21 Influence of window size

    9.22 The similarity constraint

    9.23 Limitations of similarity constraint

    10. Motion

    10.1 Non-local constraints




    10.2 Scanline stereo

    10.3 "Shortest path" for scanline stereo

    10.4 Coherent stereo on 2D grid

    10.5 Motion Field

    10.6 Optical Flow

    10.7 Optical Flow Field

    10.8 Optical Flow Estimation

    10.9 How do we compute Optical Flow?

    10.10 Brightness Constancy


    10.11 Spatial Coherence

    10.12 Minimize Brightness Difference

    10.13 Sum of Squared Differences

    10.14 Simple Flow Estimation Algorithm

    10.15 Can we approximate this?


    10.16 Optical Flow Constraint Equation


    10.17 Notation

    10.18 OFCE

    10.19 Aperture Problem







    10.20 Multiple Constraints


    10.21 Area-Based Flow

    10.22 Optimization




    10.23 Structure tensor

    10.24 Solving for u


    u is the vector \((u,v)^{T}\)
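
    A sketch of solving for this vector from the structure tensor, assuming the constraint Ix·u + Iy·v + It = 0 at every pixel in the window and a least-squares solution of the resulting 2×2 system. The function name and the three sample gradient triples are illustrative.

```python
def solve_flow(gradients):
    """gradients: list of (Ix, Iy, It) per pixel in the window. Returns the flow (u, v)."""
    sxx = sum(ix * ix for ix, _, _ in gradients)
    sxy = sum(ix * iy for ix, iy, _ in gradients)
    syy = sum(iy * iy for _, iy, _ in gradients)
    sxt = sum(ix * it for ix, _, it in gradients)
    syt = sum(iy * it for _, iy, it in gradients)
    # Solve [[sxx, sxy], [sxy, syy]] (u, v)^T = -(sxt, syt)^T with the 2x2 inverse
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("structure tensor is singular (aperture problem)")
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Temporal derivatives generated from a known flow (u, v) = (1, -2): It = -(Ix*u + Iy*v)
gradients = [(1.0, 0.0, -1.0), (0.0, 1.0, 2.0), (1.0, 1.0, 1.0)]
flow = solve_flow(gradients)
print(flow)
```

    If all gradients in the window point in the same direction the determinant vanishes, which is the aperture problem in matrix form.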

    10.25 SSD Surface - Textured area

    10.26 SSD Surface - Single Edge

    10.27 SSD Surface - Surface Boundary

    11. Recognition

    11.1 Image registration

    11.2 Image Warping

    11.3 Forward Warping


    11.4 Inverse Warping


    11.5 Interpolation

    11.6 Warping

    11.7 Image registration revisited

    11.8 Dense LK Flow

    11.9 Iterative Estimation

    11.10 Iterative Estimation





    11.11 Coarse-to-fine Estimation

    11.12 Is that a good result?

    11.13 Aside: How to get the ground truth?

    11.14 What is the problem?

    11.15 Shift Gears: Object Recognition

    11.16 Example of Recognition & Localization

    11.17 Recognition problems

    11.18 Search and Recognition

    11.19 Naive View-Based Approach

    11.20 From faces and mouths to objects

    11.21 Appearance-Based Instance Recognition

    11.22 Challenges

    11.23 Global Representation

    11.24 View-Based Approaches

    11.25 Feature representation

    11.26 Recognition using Histograms


    11.27 Histogram Comparison




    11.28 Recognition using Histograms

    11.29 Color Histograms

    11.30 Discussion: Color Histograms

    11.31 Generalization of the Idea


    11.32 Multidimensional Histograms

    11.33 Recognition Results


    11.34 Summary

    11.35 “Bag of words” Model

    11.36 Analogy to Documents

    11.37 Visual word distributions

    11.38 Bag of Words

    11.39 Bag-of-Words Model: Overview

    12. bow

    12.1 BoW-1. Feature detection and representation


    12.2 BoW-2. Codeword dictionary formation


    12.3 Vector quantization

    12.4 K-Means




    12.5 BoW-2. Codeword dictionary formation

    12.6 Image patch examples of codewords

    12.5 BoW-3. Image representation

    12.6 Next: Actual Recognition

    12.7 Excursion into Machine Learning

    12.8 Classification vs. Regression


    12.9 General Paradigm

    12.10 Bayesian Decision Theory








    Bayes optimal classifier:

    12.11 Discriminative vs. generative

    12.12 Discriminant Functions

    12.13 Which Hyperplane is Best? and Why?

    12.14 Support Vector Machines

    12.15 Toward Neural Networks

    12.16 Sigmoid

    12.17 Multi-Class Network

    12.18 Learning

    12.19 Gradient Descent


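    When k = l the summation drops out (only that term survives); the derivative is taken with respect to the weight that multiplies \(x_j\), so the result is \(x_j\).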

    12.20 Stochastic Gradient Descent

    12.21 Gradient Descent


    For the derivation, see page 44.

    13. CNN

    13.1 Bag-of-Words Model: Overview

    13.2 Next: Actual Recognition

    13.3 Machine Learning: General Paradigm

    13.4 Discriminative vs. generative

    13.5 Discriminant Functions

    13.6 Toward Neural Networks

    13.7 Sigmoid

    13.8 Multi-Class Network

    13.9 Multi-Class Network for Classification

    13.10 Learning

    13.11 Gradient Descent






    13.12 Stochastic Gradient Descent

    13.13 Multi-Layer Perceptron


    \(z_0 = 1\) is the bias term.

    13.14 Learning with Gradient Descent


    13.15 Backpropagation Algorithm

    13.16 Robot Navigation (a.k.a. “deep driving”)

    13.17 Bag-of-Words Model: Done!

    13.18 Summary & Discussion

    13.19 What about spatial information?

    13.20 Problem with bag-of-words

    13.21 Model: Parts and Structure

    13.22 Part-based Representation

    13.23 The correspondence problem

    13.24 Pyramid match kernel

    13.25 Convolutional Neural Networks

    13.26 CNN Architecture

    13.27 History: Multistage Hubel-Wiesel Architecture

    13.28 Overview of ConvNets

    13.29 Krizhevsky et al. [NIPS 2012]

    13.30 Using Features on Other Datasets

    13.31 Caltech 256


    13.32 Components of Each Layer

    13.33 Filtering

    13.34 Non-Linearity


    13.35 Pooling

    13.36 Role Of Pooling

    13.37 Receptive Field

    13.38 Components of Each Layer

    13.39 Batch Normalization

    13.40 Loss function

    13.41 Initialization

    13.42 How to Choose Architecture

    13.43 AlexNet Architecture





    13.44 Tapping off Features at each Layer

    13.45 Translation (Vertical)

    13.46 Scale Invariance

    13.47 Rotation Invariance

    13.48 Very Deep Models: VGG

    13.49 Very Deep Models: GoogLeNet

    13.50 Residual Networks

    13.51 Training Big ConvNets

    13.52 Annealing of Learning Rate

    13.53 Adversarial Examples

    13.54 What if it does not work?

    13.55 Deep Learning for Computer Vision


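    Author: Rest探路者
    Source: http://www.cnblogs.com/Java-Starter/
    Copyright of this article is shared by the author and cnblogs (博客园). Reposting is welcome, but unless the author agrees otherwise this notice must be kept and a link to the original article must be placed prominently on the page.
    Github: https://github.com/cjy513203427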

  • Original article: https://www.cnblogs.com/Java-Starter/p/13923887.html