PRML 5: Kernel Methods

      A kernel function implicitly maps data points into some high-dimensional feature space and directly evaluates the inner product of two feature vectors there, so that a classification problem that is not linearly separable in the original space may become linearly separable in the feature space. This trick can be applied to many feature-vector-based models such as SVM, which we have introduced in previous articles.

     

      To test the validity of a kernel function, we use Mercer's theorem: a function $k:\mathbb{R}^m\times\mathbb{R}^m\rightarrow\mathbb{R}$ is a Mercer kernel iff for every finite set $\{\vec{x}_1,\vec{x}_2,...,\vec{x}_n\}$, the corresponding kernel matrix is symmetric positive semi-definite.
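      As a quick, hypothetical illustration (Python/NumPy, not from the original post), the sketch below builds the Gram matrix of a candidate kernel on one finite point set and checks symmetry and positive semi-definiteness. Since Mercer's condition must hold for every finite set, such a check can refute a candidate kernel but never prove it valid.

```python
import numpy as np

def gram_matrix(kernel, X):
    # Gram matrix K[m, n] = k(x_m, x_n) on the finite point set X.
    N = len(X)
    return np.array([[kernel(X[m], X[n]) for n in range(N)] for m in range(N)])

def passes_mercer_check(kernel, X, tol=1e-10):
    # Necessary condition on ONE finite set: K must be symmetric and
    # positive semi-definite. Mercer's condition requires this for *all*
    # finite sets, so this test can refute a candidate but never prove it.
    K = gram_matrix(kernel, X)
    symmetric = np.allclose(K, K.T)
    eigvals = np.linalg.eigvalsh((K + K.T) / 2.0)
    return symmetric and bool(np.all(eigvals >= -tol))

X = np.random.randn(20, 3)                                   # 20 points in R^3
print(passes_mercer_check(lambda x, y: np.dot(x, y), X))     # linear kernel: True
print(passes_mercer_check(lambda x, y: -np.dot(x, y), X))    # not a valid kernel: False
```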

      One useful kernel function is the Gaussian kernel $k(\vec{x}_m,\vec{x}_n)=\exp\{-\frac{1}{2\sigma^2}||\vec{x}_m-\vec{x}_n||^2\}$, whose implicit feature space is infinite-dimensional. Another is the polynomial kernel $k(\vec{x}_m,\vec{x}_n)=(\vec{x}_m^T\vec{x}_n+c)^M$ with $c>0$. In practice, we can construct new kernels from simple valid kernels using closure properties: for example, sums, products, and positive scalings of valid kernels are again valid kernels.
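      For concreteness, here is a small Python/NumPy sketch (the parameter values are made up) of the Gaussian and polynomial kernels, plus one composite kernel built from them using the closure properties just mentioned.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)); infinite-dimensional feature space.
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

def poly_kernel(x, y, c=1.0, M=3):
    # Polynomial kernel (x^T y + c)^M with c > 0.
    return (np.dot(x, y) + c) ** M

def combined_kernel(x, y):
    # Closure properties: a positive scaling, a sum, and a product of valid
    # kernels are all valid kernels, so this composite is valid by construction.
    return 0.5 * rbf_kernel(x, y, sigma=2.0) + poly_kernel(x, y, c=1.0, M=2) * rbf_kernel(x, y)

x, y = np.array([1.0, 2.0]), np.array([0.5, -1.0])
print(rbf_kernel(x, y), poly_kernel(x, y), combined_kernel(x, y))
```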

      We can also use a generative model to define kernel functions, such as:

      (1) $k(\vec{x}_m,\vec{x}_n)=\int p(\vec{x}_m\text{ | }\vec{z})\cdot p(\vec{x}_n\text{ | }\vec{z})\cdot p(\vec{z})\cdot d\vec{z}$, where $\vec{z}$ is a latent variable;

      (2) $k(\vec{x}_m,\vec{x}_n)=g(\vec{\theta},\vec{x}_m)^TF^{-1}g(\vec{\theta},\vec{x}_n)$, where $g(\vec{\theta},\vec{x})=\bigtriangledown_{\vec{\theta}}\ln{p(\vec{x}\text{ | }\vec{\theta})}$ is the Fisher score,

       and $F=\frac{1}{N}\sum_{n=1}^N g(\vec{\theta},\vec{x}_n)g(\vec{\theta},\vec{x}_n)^T$ is the (empirical) Fisher information matrix.
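      A toy Fisher-kernel sketch (hypothetical example, not from the original post): the generative model is a univariate Gaussian $N(x\text{ | }\theta,\sigma^2)$ with known $\sigma$, so the Fisher score is the scalar $(x-\theta)/\sigma^2$ and $F$ is estimated from a sample.

```python
import numpy as np

# Toy Fisher kernel: generative model N(x | theta, sigma^2) with known sigma,
# so the parameter vector is just theta and the Fisher score is a scalar.
# (Hypothetical example; parameter values below are made up.)
theta, sigma = 0.0, 1.0

def fisher_score(x):
    # g(theta, x) = d/d theta ln p(x | theta) = (x - theta) / sigma^2
    return (x - theta) / sigma ** 2

def fisher_kernel(x_m, x_n, sample):
    # F approximated by the empirical average of g g^T over the sample.
    g = fisher_score(np.asarray(sample))
    F = np.mean(g * g)                     # 1x1 Fisher information "matrix"
    return fisher_score(x_m) * (1.0 / F) * fisher_score(x_n)

sample = np.random.randn(1000)             # data used to estimate F
print(fisher_kernel(0.5, -1.2, sample))
```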

      A Gaussian process is a probabilistic discriminative model in which we assume that the set of values of $y(\vec{x})$ evaluated at an arbitrary finite set of points $\{\vec{x}_1,\vec{x}_2,...,\vec{x}_N\}$ is jointly Gaussian distributed. Here the kernel matrix determines the covariance of that joint distribution.

      Gaussian Process for Regression:

      Typically, we choose $k(\vec{x}_m,\vec{x}_n)=\theta_0\exp\{-\frac{\theta_1}{2}||\vec{x}_n-\vec{x}_m||^2\}+\theta_2+\theta_3\vec{x}_m^T\vec{x}_n$, and assume that:

      (1) prior distribution $p(\vec{y}_N)=Gauss(\vec{y}_N\text{ | }\vec{0},K_N)$;
      (2) likelihood $p(\vec{t}_N\text{ | }\vec{y}_N)=Gauss(\vec{t}_N\text{ | }\vec{y}_N,\beta^{-1}I_N)$.

      Then, we have $p(\vec{t}_N)=\int p(\vec{t}_N\text{ | }\vec{y}_N)\cdot p(\vec{y}_N)\cdot d\vec{y}_N=Gauss(\vec{t}_N\text{ | }\vec{0},K_N+\beta^{-1}I_N)$. Here, $p(\vec{t}_N)$ is the likelihood of the hyperparameters $\vec{\theta}$, so we can learn $\vec{\theta}$ by maximizing it (MLE); a sketch of this log marginal likelihood follows.
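      A minimal sketch of this hyperparameter likelihood (Python/NumPy, with the exponential-plus-linear kernel from above; the data and parameter values are made up): writing $C_N=K_N+\beta^{-1}I_N$, the log marginal likelihood is $\ln p(\vec{t}_N)=-\frac{1}{2}\ln|C_N|-\frac{1}{2}\vec{t}_N^TC_N^{-1}\vec{t}_N-\frac{N}{2}\ln(2\pi)$, which one would maximize over $\vec{\theta}$ and $\beta$.

```python
import numpy as np

def kernel(xm, xn, th0=1.0, th1=4.0, th2=0.0, th3=0.0):
    # k = th0 * exp(-th1/2 * ||xm - xn||^2) + th2 + th3 * xm^T xn
    return (th0 * np.exp(-0.5 * th1 * np.sum((xm - xn) ** 2))
            + th2 + th3 * np.dot(xm, xn))

def log_marginal_likelihood(X, t, beta, **theta):
    # ln p(t_N) = -1/2 ln|C_N| - 1/2 t^T C_N^{-1} t - N/2 ln(2 pi),
    # with C_N = K_N + beta^{-1} I_N.
    N = len(X)
    K = np.array([[kernel(X[m], X[n], **theta) for n in range(N)] for m in range(N)])
    C = K + np.eye(N) / beta
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * logdet - 0.5 * t @ np.linalg.solve(C, t) - 0.5 * N * np.log(2.0 * np.pi)

rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 10).reshape(-1, 1)
t = np.sin(2.0 * np.pi * X).ravel() + 0.1 * rng.standard_normal(10)
# Compare two hyperparameter settings; MLE would maximize this over theta and beta.
print(log_marginal_likelihood(X, t, beta=100.0, th1=4.0),
      log_marginal_likelihood(X, t, beta=100.0, th1=16.0))
```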

      Also, $p(\vec{t}_{N+1})=Gauss(\vec{t}_{N+1}\text{ | }\vec{0},K_{N+1}+\beta^{-1}I_{N+1})$. Hence, denoting $\vec{k}=[k(\vec{x}_1,\vec{x}_{N+1}),k(\vec{x}_2,\vec{x}_{N+1}),...,k(\vec{x}_N,\vec{x}_{N+1})]^T$, we obtain the conditional Gaussian $p(t_{N+1}\text{ | }\vec{t}_N)=Gauss(\vec{k}^T(K_N+\beta^{-1}I_N)^{-1}\vec{t}_N,\;k(\vec{x}_{N+1},\vec{x}_{N+1})+\beta^{-1}-\vec{k}^T(K_N+\beta^{-1}I_N)^{-1}\vec{k})$.
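      The predictive equations above translate directly into a short GP-regression sketch (again Python/NumPy with toy data; the hyperparameter values are arbitrary).

```python
import numpy as np

def kernel(xm, xn, th0=1.0, th1=4.0, th2=0.0, th3=0.0):
    # k = th0 * exp(-th1/2 * ||xm - xn||^2) + th2 + th3 * xm^T xn  (kernel from the text)
    return (th0 * np.exp(-0.5 * th1 * np.sum((xm - xn) ** 2))
            + th2 + th3 * np.dot(xm, xn))

def gp_predict(X, t, x_new, beta=25.0):
    # Predictive mean and variance of t_{N+1}:
    #   mean = k^T (K_N + beta^{-1} I)^{-1} t_N
    #   var  = k(x_new, x_new) + beta^{-1} - k^T (K_N + beta^{-1} I)^{-1} k
    N = len(X)
    K = np.array([[kernel(X[m], X[n]) for n in range(N)] for m in range(N)])
    C = K + np.eye(N) / beta
    k = np.array([kernel(X[n], x_new) for n in range(N)])
    mean = k @ np.linalg.solve(C, t)
    var = kernel(x_new, x_new) + 1.0 / beta - k @ np.linalg.solve(C, k)
    return mean, var

rng = np.random.default_rng(1)
X = np.linspace(0.0, 1.0, 8).reshape(-1, 1)
t = np.sin(2.0 * np.pi * X).ravel() + 0.2 * rng.standard_normal(8)
print(gp_predict(X, t, np.array([0.35])))
```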

      Gaussian Process for Classification:

      We assume that $p(t_n=1\text{ | }a_n)=\sigma(a_n)$, where $a_n=a(\vec{x}_n)$ is the value of a latent function governed by a Gaussian process prior, and take the following steps:

      (1) Calculate $p(\vec{a}_N\text{ | }\vec{t}_N)$ by the Laplace approximation;

      (2) Since the joint prior $p(\vec{a}_{N+1})$ over $(\vec{a}_N,a_{N+1})$ is Gaussian, $p(a_{N+1}\text{ | }\vec{a}_N)$ is a conditional Gaussian;

      (3) $p(a_{N+1}\text{ | }\vec{t}_N)=\int p(a_{N+1}\text{ | }\vec{a}_N)\cdot p(\vec{a}_N\text{ | }\vec{t}_N)\cdot d\vec{a}_N$;

      (4) $p(t_{N+1}\text{ | }\vec{t}_N)=\int \sigma(a_{N+1})\cdot p(a_{N+1}\text{ | }\vec{t}_N)\cdot da_{N+1}$ (sketched below).
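      A minimal numerical sketch of step (4), assuming the Gaussian approximation $p(a_{N+1}\text{ | }\vec{t}_N)\approx Gauss(a_{N+1}\text{ | }\mu,s^2)$ from steps (1)-(3) has already been computed (the values of $\mu$ and $s^2$ below are placeholders). It also shows the standard probit approximation $\sigma(\kappa(s^2)\mu)$ with $\kappa(s^2)=(1+\pi s^2/8)^{-1/2}$.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def predictive_prob(mu, var, n_grid=2001):
    # Step (4): p(t_{N+1}=1 | t_N) = integral of sigma(a) * N(a | mu, var) da,
    # evaluated by a simple Riemann sum over a grid covering +/- 8 std devs.
    sd = np.sqrt(var)
    a = np.linspace(mu - 8.0 * sd, mu + 8.0 * sd, n_grid)
    gauss = np.exp(-0.5 * (a - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    return float(np.sum(sigmoid(a) * gauss) * (a[1] - a[0]))

def predictive_prob_probit(mu, var):
    # Closed-form probit approximation sigma(kappa(var) * mu),
    # kappa(var) = (1 + pi * var / 8)^(-1/2).
    kappa = 1.0 / np.sqrt(1.0 + np.pi * var / 8.0)
    return sigmoid(kappa * mu)

# mu and var would come from steps (1)-(3); the values here are placeholders.
print(predictive_prob(0.7, 1.5), predictive_prob_probit(0.7, 1.5))
```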

    References:

      1. Bishop, Christopher M. Pattern Recognition and Machine Learning [M]. Singapore: Springer, 2006

     
