zoukankan      html  css  js  c++  java
  • Sherman-Morrison公式及其应用

    Sherman-Morrison公式

      Sherman-Morrison公式以 Jack Sherman 和 Winifred J. Morrison命名,在线性代数中,是求解逆矩阵的一种方法。本篇博客将介绍该公式及其应用,首先我们来看一下该公式的内容及其证明。
      (Sherman-Morrison公式)假设(Ainmathbb{R}^{n imes n})为可逆矩阵,(u,vinmathbb{R}^{n})为列向量,则(A+uv^{T})可逆当且仅当(1+v^{T}A^{-1}u eq 0), 且当(A+uv^{T})可逆时,该逆矩阵由以下公式给出:

    [(A+uv^{T})^{-1}=A^{-1}-{A^{-1}uv^{T}A^{-1} over 1+v^{T}A^{-1}u}. ]

    证明:
    ((Leftarrow))(1+v^{T}A^{-1}u eq 0)时,令(X=A+uv^{T}, Y=A^{-1}-{A^{-1}uv^{T}A^{-1} over 1+v^{T}A^{-1}u}),则只需证明(XY=YX=I)即可,其中(I)为n阶单位矩阵。

    [{displaystyle {egin{aligned} XY&=(A+uv^{T})left(A^{-1}-{A^{-1}uv^{T}A^{-1} over 1+v^{T}A^{-1}u} ight)\ &=AA^{-1}+uv^{T}A^{-1}-{AA^{-1}uv^{T}A^{-1}+uv^{T}A^{-1}uv^{T}A^{-1} over 1+v^{T}A^{-1}u}\ &=I+uv^{T}A^{-1}-{uv^{T}A^{-1}+uv^{T}A^{-1}uv^{T}A^{-1} over 1+v^{T}A^{-1}u}\ &=I+uv^{T}A^{-1}-{u(1+v^{T}A^{-1}u)v^{T}A^{-1} over 1+v^{T}A^{-1}u}\ &=I+uv^{T}A^{-1}-uv^{T}A^{-1}\ &=Iend{aligned}}} ]

    同理,有(YX=I).因此,当(1+v^{T}A^{-1}u eq 0)时,((A+uv^{T})^{-1}=A^{-1}-{A^{-1}uv^{T}A^{-1} over 1+v^{T}A^{-1}u}.)
    ((Rightarrow))(u=0)时,显然有(1+v^{T}A^{-1}u=1 eq 0.)(u eq0)时,用反正法证明该命题成立。假设(A+uv^{T})可逆,但(1+v^{T}A^{-1}u = 0),则有

    [(A+uv^{T})A^{-1}u=u+u(v^{T}A^{-1}u)=(1+v^{T}A^{-1}u)u=0. ]

    因为(A+uv^{T})可逆,故(A^{-1})u=0,又因为(A^{-1})可逆,故(u=0),此与假设(u eq 0)矛盾。因此,当(A+uv^{T})可逆时,有(1+v^{T}A^{-1}u eq 0.)

    Sherman-Morrison公式的应用

    应用1:(A=I)时的Sherman-Morrison公式

      在Sherman-Morrison公式中,令(A=I),则有:(I+uv^{T})可逆当且仅当(1+v^{T}u eq 0), 且当(I+uv^{T})可逆时,该逆矩阵由以下公式给出:

    [(I+uv^{T})^{-1}=I-{uv^{T} over 1+v^{T}u}. ]

      再令(v=u),则(1+u^{T}u > 0), 因此,(I+uu^{T})可逆,且

    [(I+uu^{T})^{-1}=I-{uu^{T} over 1+u^{T}u}. ]

    应用2:BFGS算法

      Sherman-Morrison公式在BFGS算法中的应用,可用来求解BFGS算法中近似Hessian矩阵的逆。本篇博客并不打算给出Sherman-Morrison公式在BFGS算法中的应用,将会再写篇博客介绍BFGS算法,到时再给出该公式的应用,并会在之后补上该博客的链接(因为笔者还没写)。

    应用3:循环三对角线性方程组的求解

      本篇博客将详细讲述Sherman-Morrison公式在循环三对角线性方程组的求解中的应用。
      首先给给出理论知识介绍部分。
      对于(Ainmathbb{R}^{n imes n})为可逆矩阵,(u,vinmathbb{R}^{n})为列向量,(1+v^{T}A^{-1}u eq 0),需要求解方程((A+uv^{T})x=b.)对此,我们可以先求解以下两个方程:

    [Ay=b,qquad Az=u$$. 然后令$x=y-frac{v^{T}y}{1+v^{T}z}z$,该解即为原方程的解,验证如下: ]

    {displaystyle
    {egin{aligned}
    (A+uv^{T})x&=(A+uv^{T})(y-frac{v^{T}y}{1+v^{T}z}z)
    &=Ay+uv^{T}y-frac{v^{T}y}{1+v^{T}z}Az-frac{v^{T}y}{1+v^{T}z}uv^{T}z
    &=b+uv^{T}y-frac{v^{T}yu+v^{T}yuv^{T}z}{1+v^{T}z}
    &=b+uv^{T}y-frac{(1+v^{T}z)v^{T}yu}{1+v^{T}z}
    &=b+uv^{T}y-uv^{T}y
    &=bend{aligned}}}

    [  这样将原方程$ (A+uv^{T})x=b$分成两个方程,可以在一定程度上简化原方程。接下来,我们将介绍循环三对角线性方程组的求解。   所谓循环三对角线性方程组,指的是系数矩阵为如下形式: ]

    A=egin{bmatrix}
    b_1&c_1&0&cdots&0&a_1
    a_2&b_2&c_2&0&vdots&0
    0&ddots&ddots&ddots&0&vdots
    vdots&vdots&a_{n-2}&b_{n-2}&c_{n-2}&0
    0&cdots&cdots&a_{n-1}&b_{n-1}&c_{n-1}
    c_n&0&cdots&0&a_n&b_nend{bmatrix}

    [循环三对角线性方程组可写成$Ax=d$,其中$d=(d_{1},d_2,...,d_{n})^{T}.$   对于此方程的求解,我们令$u=(gamma, 0,0,...,c_{n})^{T}, v=(1,0,0,...,frac{a_1}{gamma})^{T}$, 且$A=A^{'}+uv^{T}$,其中$A^{'}$如下: ]

    A^{'}=egin{bmatrix}
    b_1-gamma&c_1&0&cdots&0&0
    a_2&b_2&c_2&0&vdots&0
    0&ddots&ddots&ddots&0&vdots
    vdots&vdots&a_{n-2}&b_{n-2}&c_{n-2}&0
    0&cdots&cdots&a_{n-1}&b_{n-1}&c_{n-1}
    0&0&cdots&0&a_n&b_n-frac{a_1c_n}{gamma}end{bmatrix}

    [$A^{'}$为三对角矩阵。根据以上的理论知识,我们只需要求解以下两个方程 $$A^{'}y=d,qquad A^{'}z=u,]

    然后,就能根据(y,z)求出(x).而以上两个方程为三对角线性方程组,可以用追赶法(或Thomas法)求解,具体算法可以参考博客:三对角线性方程组(tridiagonal systems of equations)的求解 。
      综上,我们利用Sherman-Morrison公式的思想,可以将循环三对角线性方程组转化为三对角线性方程组求解。我们将会在下面给出该算法的Python语言实现。

    Python实现

      我们要解的循环三对角线性方程组如下:

    [{egin{bmatrix} 4&1&{0}&{0}&{2}\ {1}&{4}&{1}&{0}&{0}\ {0}&{1}&{4}&{1}&{0}\ {0}&{0}&{1}&{4}&{1}\ {3}&{0}&{0}&{1}&{4}\ end{bmatrix}} {egin{bmatrix}{x_{1}}\{x_{2}}\{x_{3}}\{x_{4}} \{x_{5}}\end{bmatrix}}={egin{bmatrix}{7\6\ 6\6\8}\end{bmatrix}} ]

      用Python实现解该方程的Python完整代码如下:

    # use Sherman-Morrison Formula and Thomas Method to solve cyclic tridiagonal linear equation
    
    import numpy as np
    
    # Thomas Method for soling tridiagonal linear equation Ax=d
    # parameter: a,b,c,d are list-like of same length
    # b: main diagonal of matrix A
    # a: main diagonal below of matrix A
    # c: main diagonal upper of matrix A
    # d: Ax=d
    # return: x(type=list), the solution of Ax=d
    def TDMA(a,b,c,d):
    
        try:
            n = len(d)    # order of tridiagonal square matrix
    
            # use a,b,c to create matrix A, which is not necessary in the algorithm
            A = np.array([[0]*n]*n, dtype='float64')
    
            for i in range(n):
                A[i,i] = b[i]
                if i > 0:
                    A[i, i-1] = a[i]
                if i < n-1:
                    A[i, i+1] = c[i]
    
            # new list of modified coefficients
            c_1 = [0]*n
            d_1 = [0]*n
    
            for i in range(n):
                if not i:
                    c_1[i] = c[i]/b[i]
                    d_1[i] = d[i] / b[i]
                else:
                    c_1[i] = c[i]/(b[i]-c_1[i-1]*a[i])
                    d_1[i] = (d[i]-d_1[i-1]*a[i])/(b[i]-c_1[i-1] * a[i])
    
            # x: solution of Ax=d
            x = [0]*n
    
            for i in range(n-1, -1, -1):
                if i == n-1:
                    x[i] = d_1[i]
                else:
                    x[i] = d_1[i]-c_1[i]*x[i+1]
    
            x = [round(_, 4) for _ in x]
    
            return x
    
        except Exception as e:
            return e
    
    # Sherman-Morrison Fomula for soling cyclic tridiagonal linear equation Ax=d
    # parameter: a,b,c,d are list-like of same length
    # b: main diagonal of matrix A
    # a: main diagonal below of matrix A
    # c: main diagonal upper of matrix A
    # d: Ax=d
    # return: x(type=list), the solution of Ax=d
    def Cyclic_Tridiagnoal_Linear_Equation(a,b,c,d):
        try:
            # use a,b,c to create cyclic tridiagonal matrix A
            n = len(d)
            A = np.array([[0] * n] * n, dtype='float64')
    
            for i in range(n):
                A[i, i] = b[i]
                if i > 0:
                    A[i, i - 1] = a[i]
                if i < n - 1:
                    A[i, i + 1] = c[i]
            A[0, n - 1] = a[0]
            A[n - 1, 0] = c[n - 1]
    
            gamma = 1 # gamma can be set freely
            u = [gamma] + [0] * (n - 2) + [c[n - 1]]
            v = [1] + [0] * (n - 2) + [a[0] / gamma]
    
            # modify the coefficient to form A'
            b[0] -= gamma
            b[n - 1] -= a[0] * c[n - 1] / gamma
            a[0] = 0
            c[n - 1] = 0
    
            # solve A'y=d, A'z=u by using Thomas Method
            y = np.array(TDMA(a, b, c, d))
            z = np.array(TDMA(a, b, c, u))
    
            # use y and z to calculate x
            # x = y-(v·y)/(1+v·z) *z
            # x is the solution of Ax=d
            x = y - (np.dot(np.array(v), y)) / (1 + np.dot(np.array(v), z)) * z
    
            x = [round(_, 3) for _ in x]
    
            return x
    
        except Exception as e:
            return e
    
    def main():
        '''
           equation:
           A = [[4,1,0,0,2],
                [1,4,1,0,0],
                [0,1,4,1,0],
                [0,0,1,4,1],
                [3,0,0,1,4]]
           d = [7,6,6,6,8]
           solution x should be [1,1,1,1,1]
        '''
    
        a = [2, 1, 1, 1, 1]
        b = [4, 4, 4, 4, 4]
        c = [1, 1, 1, 1, 3]
        d = [7, 6, 6, 6, 8]
    
        x = Cyclic_Tridiagnoal_Linear_Equation(a,b,c,d)
        print('The solution is %s'%x)
    
    main()
    

    输出结果如下:

    The solution is [1.0, 1.0, 1.0, 1.0, 1.0]

    参考文献

    1. https://en.wikipedia.org/wiki/Sherman–Morrison_formula
    2. http://wwwmayr.in.tum.de/konferenzen/Jass09/courses/2/Soldatenko_paper.pdf
    3. https://scicomp.stackexchange.com/questions/10137/solving-system-of-linear-equations-with-cyclic-tridiagonal-matrix
    4. https://blog.csdn.net/jclian91/article/details/80251244

    注意:本人现已开通两个微信公众号: 用Python做数学(微信号为:python_math)以及轻松学会Python爬虫(微信号为:easy_web_scrape), 欢迎大家关注哦~~

  • 相关阅读:
    android创建自定义对话框
    激光裁剪的商务名片
    60个响应式的Web设计教程–能够手机访问!
    前端性能优化:使用Data URI代替图片SRC
    40个最好的Tumblr主题
    40个超酷的jQuery动画效果教程
    15个最好的jQuery timeline插件
    60+富有创意的宣传册设计
    Eclipse安装SVN
    EXPLAIN 关键字可以 查看 sql执行 的详细过程
  • 原文地址:https://www.cnblogs.com/jclian91/p/9132834.html
Copyright © 2011-2022 走看看