Reference: Matrix calculus - Wikipedia
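The matrix derivative, also referred to as the matrix differential, comes up frequently in formula derivations in machine learning, image processing, optimization, and related fields.
Layout: matrix calculus has two layout conventions, denominator layout and numerator layout. The differentiation rules differ between the two layouts.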
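My own understanding:
Numerator layout: the result is arranged according to the numerator. If the numerator y has m components and the denominator x has n components, the result is an m×n matrix whose m rows correspond to the numerator and whose n columns correspond to the denominator. This layout is fairly commonly used.
Denominator layout: the opposite arrangement. The result is the transpose of the numerator-layout result, that is, an n×m matrix.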
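To make the two layouts concrete, here is a minimal sketch in plain Python that estimates the derivative of a vector function by central differences and arranges it both ways. The helper names (`jacobian_numerator_layout`, `transpose`, `linear_map`) and the matrix `A` are assumptions made up for this illustration; they are not taken from the reference.

```python
def jacobian_numerator_layout(f, x, eps=1e-6):
    """Central-difference estimate of dy/dx in numerator layout.

    Row i belongs to output y_i and column j to input x_j, so the
    result has shape (m, n) for y in R^m and x in R^n.
    """
    m = len(f(x))
    n = len(x)
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        y_plus = f(x_plus)
        y_minus = f(x_minus)
        for i in range(m):
            jac[i][j] = (y_plus[i] - y_minus[i]) / (2 * eps)
    return jac


def transpose(mat):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*mat)]


# Hypothetical example: y = A x with a 2x3 matrix A (m = 2, n = 3).
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]


def linear_map(x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


x0 = [0.5, -1.0, 2.0]
numerator = jacobian_numerator_layout(linear_map, x0)  # m x n, approximately A
denominator = transpose(numerator)                      # n x m, approximately A^T

print(len(numerator), len(numerator[0]))      # prints "2 3"
print(len(denominator), len(denominator[0]))  # prints "3 2"
```

For a linear map the numerator-layout result reproduces A itself, while the denominator-layout result is its transpose, which is why formulas from different sources often differ by a transpose.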
Numerator-layout notation
Using numerator-layout notation, we have:
$
\frac{\partial y}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y}{\partial x_{1}} & \frac{\partial y}{\partial x_{2}} & \cdots & \frac{\partial y}{\partial x_{n}} \end{bmatrix}.
$
$
\frac{\partial \mathbf{y}}{\partial x} = \begin{bmatrix} \frac{\partial y_{1}}{\partial x} \\ \frac{\partial y_{2}}{\partial x} \\ \vdots \\ \frac{\partial y_{m}}{\partial x} \end{bmatrix}.
$
$
\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial y_{1}}{\partial x_{1}} & \frac{\partial y_{1}}{\partial x_{2}} & \cdots & \frac{\partial y_{1}}{\partial x_{n}} \\ \frac{\partial y_{2}}{\partial x_{1}} & \frac{\partial y_{2}}{\partial x_{2}} & \cdots & \frac{\partial y_{2}}{\partial x_{n}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_{m}}{\partial x_{1}} & \frac{\partial y_{m}}{\partial x_{2}} & \cdots & \frac{\partial y_{m}}{\partial x_{n}} \end{bmatrix}.
$
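As a small worked instance of the vector-by-vector definition, take the hypothetical function (chosen only for illustration) with m = 2 and n = 2, $y_{1} = x_{1}^{2} x_{2}$ and $y_{2} = x_{1} + \sin x_{2}$. In numerator layout, row i collects the partial derivatives of $y_{i}$:
$
\frac{\partial \mathbf{y}}{\partial \mathbf{x}} = \begin{bmatrix} 2 x_{1} x_{2} & x_{1}^{2} \\ 1 & \cos x_{2} \end{bmatrix}.
$
In denominator layout the same derivatives would be arranged as the transpose of this matrix.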
$
\frac{\partial y}{\partial \mathbf{X}} = \begin{bmatrix} \frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{21}} & \cdots & \frac{\partial y}{\partial x_{p1}} \\ \frac{\partial y}{\partial x_{12}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{p2}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y}{\partial x_{1q}} & \frac{\partial y}{\partial x_{2q}} & \cdots & \frac{\partial y}{\partial x_{pq}} \end{bmatrix}.
$
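Note that for a p×q matrix X this scalar-by-matrix derivative is q×p, so entry (j, i) holds the derivative with respect to $x_{ij}$. The sketch below checks this numerically for the assumed example y = tr(AX), whose numerator-layout derivative is A. The function names and the data are hypothetical, picked only for this illustration.

```python
def trace_of_product(A, X):
    """Return tr(A X) for A of shape (q, p) and X of shape (p, q)."""
    q, p = len(A), len(A[0])
    return sum(A[i][k] * X[k][i] for i in range(q) for k in range(p))


def scalar_by_matrix_numerator_layout(f, X, eps=1e-6):
    """Central-difference estimate of dy/dX in numerator layout.

    For X of shape (p, q) the result has shape (q, p): entry (j, i)
    holds dy/dx_ij, i.e. the transposed arrangement.
    """
    p, q = len(X), len(X[0])
    grad = [[0.0] * p for _ in range(q)]
    for i in range(p):
        for j in range(q):
            X_plus = [row[:] for row in X]
            X_minus = [row[:] for row in X]
            X_plus[i][j] += eps
            X_minus[i][j] -= eps
            grad[j][i] = (f(X_plus) - f(X_minus)) / (2 * eps)
    return grad


# Hypothetical data: X is 3x2 (p = 3, q = 2) and A is 2x3.
A = [[1.0, -2.0, 0.5],
     [3.0, 4.0, -1.0]]
X = [[0.1, 0.2],
     [0.3, 0.4],
     [0.5, 0.6]]

grad = scalar_by_matrix_numerator_layout(lambda M: trace_of_product(A, M), X)
print(len(grad), len(grad[0]))  # prints "2 3", the shape of A rather than of X
```

Up to finite-difference error, `grad` matches A entry by entry; in denominator layout the result would instead have the shape of X and equal the transpose of A.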
The following definitions are only provided in numerator-layout notation:
$
\frac{\partial \mathbf{Y}}{\partial x} = \begin{bmatrix} \frac{\partial y_{11}}{\partial x} & \frac{\partial y_{12}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x} \\ \frac{\partial y_{21}}{\partial x} & \frac{\partial y_{22}}{\partial x} & \cdots & \frac{\partial y_{2n}}{\partial x} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial y_{m1}}{\partial x} & \frac{\partial y_{m2}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x} \end{bmatrix}.
$
$
d\mathbf{X} = \begin{bmatrix} dx_{11} & dx_{12} & \cdots & dx_{1n} \\ dx_{21} & dx_{22} & \cdots & dx_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ dx_{m1} & dx_{m2} & \cdots & dx_{mn} \end{bmatrix}.
$
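One way to see how the differential connects to the derivative, sketched under the assumption that y is a scalar function of the m×n matrix X and that the n×m numerator-layout arrangement is used for the derivative, is to sum over all entries:
$
dy = \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{\partial y}{\partial x_{ij}} \, dx_{ij} = \operatorname{tr}\left( \frac{\partial y}{\partial \mathbf{X}} \, d\mathbf{X} \right).
$
For the illustrative case $y = \operatorname{tr}(\mathbf{A}\mathbf{X})$ with a constant n×m matrix A, this gives $dy = \operatorname{tr}(\mathbf{A} \, d\mathbf{X})$, so the numerator-layout derivative can be read off as A.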