Matrix Derivatives

Types of Matrix Derivative

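Type ($x$ \ $y$) | Scalar $y$ | Vector $\mathbf{y}$ | Matrix $\mathbf{Y}$
Scalar $x$ | $\partial y / \partial x$ | $\partial \mathbf{y} / \partial x$ | $\partial \mathbf{Y} / \partial x$
Vector $\mathbf{x}$ | $\partial y / \partial \mathbf{x}$ | $\partial \mathbf{y} / \partial \mathbf{x}$ |
Matrix $\mathbf{X}$ | $\partial y / \partial \mathbf{X}$ | |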

Pay attention to the dimensions!

Gradient and Hessian

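  • $\nabla f(x)$ = the gradient of $f : \mathbb{R}^n \to \mathbb{R}$
    • The transpose of the first derivatives of $f$: $\nabla f(x) = Df(x)^T = [\partial f / \partial x_1, \dots, \partial f / \partial x_n]^T$
  • $\nabla^2 f(x)$ = the Hessian of $f$
    • The matrix of second partial derivatives of $f$: $[\nabla^2 f(x)]_{ij} = \partial^2 f / \partial x_i \partial x_j$
    • The Hessian is a symmetric matrix (when the second partial derivatives are continuous)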
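The definitions of the gradient and the Hessian can be checked numerically. The following is a minimal sketch using central finite differences; the test function `f`, the helper names `gradient` and `hessian`, and the step sizes are all assumptions made for illustration, not part of the original notes.

```python
def f(x):
    # Hypothetical test function: f(x) = x0^2 * x1 + 3 * x1^2
    return x[0] ** 2 * x[1] + 3.0 * x[1] ** 2


def gradient(func, x, h=1e-5):
    """Central-difference approximation of the gradient of func at x."""
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad


def hessian(func, x, h=1e-4):
    """Central-difference approximation of the matrix of second partials."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            xpp, xpm, xmp, xmm = list(x), list(x), list(x), list(x)
            xpp[i] += h
            xpp[j] += h
            xpm[i] += h
            xpm[j] -= h
            xmp[i] -= h
            xmp[j] += h
            xmm[i] -= h
            xmm[j] -= h
            H[i][j] = (func(xpp) - func(xpm) - func(xmp) + func(xmm)) / (4 * h * h)
    return H


x0 = [1.0, 2.0]
g = gradient(f, x0)  # analytic gradient: [2*x0*x1, x0^2 + 6*x1] = [4, 13]
H = hessian(f, x0)   # analytic Hessian: [[2*x1, 2*x0], [2*x0, 6]] = [[4, 2], [2, 6]]
print(g)
print(H)
```

The off-diagonal entries `H[0][1]` and `H[1][0]` agree up to rounding, which is the symmetry property stated above.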

Jacobian and Matrix Derivative

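  • Jacobian $\partial f / \partial x$, with entries $\partial f_i / \partial x_j$, when $f : \mathbb{R}^n \to \mathbb{R}^m$
  • Matrix derivative $\partial f / \partial X$, with entries $\partial f / \partial X_{ij}$, when $f : \mathbb{R}^{m \times n} \to \mathbb{R}$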
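To make the shape of a Jacobian concrete, here is a small sketch that approximates it by central differences. The map `f` from R^2 to R^3 and the helper `jacobian` are hypothetical examples chosen for illustration; the layout used here (row i for output i, column j for input j) is one common convention and is an assumption.

```python
def f(x):
    # Hypothetical map f: R^2 -> R^3
    return [x[0] * x[1], x[0] + x[1], x[0] ** 2]


def jacobian(func, x, h=1e-6):
    """Central-difference Jacobian: J[i][j] approximates d f_i / d x_j."""
    m = len(func(x))
    n = len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        fp = func(xp)
        fm = func(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J


J = jacobian(f, [2.0, 3.0])
# Analytic Jacobian at (2, 3): [[x1, x0], [1, 1], [2*x0, 0]] = [[3, 2], [1, 1], [4, 0]]
for row in J:
    print(row)
```

With 3 outputs and 2 inputs the result has 3 rows and 2 columns, which is the kind of dimension bookkeeping these notes ask the reader to watch.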

Useful Matrix Derivative

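For $A \in \mathbb{R}^{n \times n}$ and $x \in \mathbb{R}^n$,

    $\frac{\partial (x^T A x)}{\partial x} = (A + A^T) x$

    • $= 2Ax$ if $A$ is symmetric.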
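The standard quadratic-form identity, that the gradient of x^T A x is (A + A^T) x and reduces to 2Ax for symmetric A, can be verified numerically. This is a minimal sketch; the matrices `A` and `S`, the point `x`, and the helper names are assumptions chosen for illustration.

```python
def quad_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def analytic_grad(A, x):
    """Closed form (A + A^T) x."""
    n = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) for i in range(n)]


def numeric_grad(A, x, h=1e-6):
    """Central-difference gradient of the quadratic form."""
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((quad_form(A, xp) - quad_form(A, xm)) / (2 * h))
    return grad


A = [[1.0, 2.0], [0.0, 3.0]]  # hypothetical non-symmetric matrix
S = [[2.0, 1.0], [1.0, 3.0]]  # hypothetical symmetric matrix
x = [1.0, -1.0]

g_A = analytic_grad(A, x)      # (A + A^T) x = [0, -4]
g_A_num = numeric_grad(A, x)
g_S = analytic_grad(S, x)      # equals 2 S x = [2, -4] because S is symmetric
two_Sx = [2.0 * sum(S[i][j] * x[j] for j in range(2)) for i in range(2)]
print(g_A, g_A_num)
print(g_S, two_Sx)
```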

Chain Rule

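Theorem (Chain Rule): When the vector $x$ in turn depends on another vector $t$, the chain rule for a univariate function can be extended as follows:

  • If $z = f(x)$ and $x = g(t)$ where $x \in \mathbb{R}^n$ and $t \in \mathbb{R}^m$, then

    $\frac{\partial z}{\partial t_i} = \sum_{j=1}^{n} \frac{\partial z}{\partial x_j} \frac{\partial x_j}{\partial t_i}$

(gradients from all possible paths)

  • or in vector notation

    $\nabla_t z = \left( \frac{\partial x}{\partial t} \right)^T \nabla_x z$, where $\partial x / \partial t \in \mathbb{R}^{n \times m}$ is the Jacobian of $g$.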

This is the basis of the backpropagation technique in neural nets.
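The "sum over all paths" form of the multivariate chain rule can be illustrated with a short sketch. The inner map `g`, the outer function `f`, and their hand-written derivatives are hypothetical examples, not taken from the notes; the chain-rule result is compared with a finite-difference derivative of the composition.

```python
def g(t):
    # Hypothetical inner map x = g(t), R^2 -> R^2
    return [t[0] * t[1], t[0] + t[1] ** 2]


def f(x):
    # Hypothetical outer function z = f(x), R^2 -> R
    return x[0] * x[1]


def grad_f(x):
    # dz/dx_j for the outer function
    return [x[1], x[0]]


def jac_g(t):
    # J[j][i] = d x_j / d t_i for the inner map
    return [[t[1], t[0]],
            [1.0, 2.0 * t[1]]]


def chain_rule_grad(t):
    """dz/dt_i = sum_j (dz/dx_j) * (dx_j/dt_i): one term per path through x_j."""
    x = g(t)
    dz_dx = grad_f(x)
    J = jac_g(t)
    return [sum(dz_dx[j] * J[j][i] for j in range(len(x))) for i in range(len(t))]


def numeric_grad(t, h=1e-6):
    """Central-difference gradient of the composition f(g(t))."""
    grad = []
    for i in range(len(t)):
        tp = list(t)
        tm = list(t)
        tp[i] += h
        tm[i] -= h
        grad.append((f(g(tp)) - f(g(tm))) / (2 * h))
    return grad


t = [2.0, 3.0]
by_chain = chain_rule_grad(t)  # analytic value: [39, 58]
by_diff = numeric_grad(t)
print(by_chain, by_diff)
```

Backpropagation applies this same accumulation layer by layer, reusing the intermediate derivatives instead of recomputing them for every path.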

Chain Rule on Level Curve

  • level curve: the set of $x$ satisfying $f(x) = c$.

  • On the level curve $f(x(t)) = c$, $\frac{d}{dt} f(x(t)) = \nabla f(x(t))^T x'(t) = 0$

That is, $\nabla f$ is orthogonal to the level curve and points in the direction in which $f$ increases (the ascent direction).
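The orthogonality of the gradient to a level curve can be checked on a simple case. This sketch assumes the hypothetical function f(x) = x0^2 + x1^2, whose level curves are circles that can be parametrized explicitly; the names `grad_f`, `dots`, and `increase` are illustrative only.

```python
import math


def f(x):
    # Hypothetical example: f(x) = x0^2 + x1^2, whose level curves are circles
    return x[0] ** 2 + x[1] ** 2


def grad_f(x):
    # Analytic gradient of the example function
    return [2.0 * x[0], 2.0 * x[1]]


r = 2.0  # the level curve f(x) = r^2 = 4
dots = []
for k in range(8):
    t = 2.0 * math.pi * k / 8
    x = [r * math.cos(t), r * math.sin(t)]         # point x(t) on the level curve
    tangent = [-r * math.sin(t), r * math.cos(t)]  # x'(t), tangent to the curve
    grad = grad_f(x)
    dots.append(grad[0] * tangent[0] + grad[1] * tangent[1])

# A small step along the gradient leaves the level curve and increases f.
x = [r, 0.0]
grad = grad_f(x)
alpha = 1e-3
x_step = [x[0] + alpha * grad[0], x[1] + alpha * grad[1]]
increase = f(x_step) - f(x)

print(max(abs(d) for d in dots))  # zero up to rounding: gradient is orthogonal to the tangent
print(increase > 0)
```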

Directional Derivatives

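  • If $f$ is continuously differentiable and $d \in \mathbb{R}^n$, the directional derivative of $f$ in the direction of $d$ is given by

    $\frac{\partial f}{\partial d}(x) = \lim_{\alpha \to 0} \frac{f(x + \alpha d) - f(x)}{\alpha} = \nabla f(x)^T d$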
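The directional derivative can be computed in two ways that should agree: as the inner product of the gradient with the direction, and as a difference quotient along the direction. The following sketch assumes a hypothetical function `f`, point `x`, and unit direction `d`.

```python
def f(x):
    # Hypothetical example: f(x) = x0^2 + 3 * x0 * x1
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def grad_f(x):
    # Analytic gradient of the example function
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]


def difference_quotient(func, x, d, alpha=1e-6):
    """One-sided quotient (f(x + alpha * d) - f(x)) / alpha for a small alpha."""
    x_shift = [xi + alpha * di for xi, di in zip(x, d)]
    return (func(x_shift) - func(x)) / alpha


x = [1.0, 2.0]
d = [0.6, 0.8]  # unit-length direction

by_gradient = sum(gi * di for gi, di in zip(grad_f(x), d))  # 8 * 0.6 + 3 * 0.8 = 7.2
by_limit = difference_quotient(f, x, d)
print(by_gradient, by_limit)
```

The quotient differs from the gradient formula only by a term of order `alpha`, which shrinks as the step size goes to zero.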

Taylor Series Expansion

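  • First order: $f(x + \Delta x) = f(x) + \nabla f(x)^T \Delta x + o(\|\Delta x\|)$
  • Second order: $f(x + \Delta x) = f(x) + \nabla f(x)^T \Delta x + \frac{1}{2} \Delta x^T \nabla^2 f(x) \Delta x + o(\|\Delta x\|^2)$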
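To see how the two expansions behave, the sketch below compares the first-order and second-order approximations with the exact value for a small step. The function `f`, its hand-derived gradient and Hessian, and the step `dx` are hypothetical choices for illustration.

```python
import math


def f(x):
    # Hypothetical example: f(x) = exp(x0) + x0 * x1^2
    return math.exp(x[0]) + x[0] * x[1] ** 2


def grad_f(x):
    # Analytic gradient of the example function
    return [math.exp(x[0]) + x[1] ** 2, 2.0 * x[0] * x[1]]


def hess_f(x):
    # Analytic Hessian of the example function
    return [[math.exp(x[0]), 2.0 * x[1]],
            [2.0 * x[1], 2.0 * x[0]]]


def taylor1(x, dx):
    """First-order expansion: f(x) + grad^T dx."""
    return f(x) + sum(gi * di for gi, di in zip(grad_f(x), dx))


def taylor2(x, dx):
    """Second-order expansion: first-order term plus 0.5 * dx^T H dx."""
    H = hess_f(x)
    n = len(x)
    quad = sum(dx[i] * H[i][j] * dx[j] for i in range(n) for j in range(n))
    return taylor1(x, dx) + 0.5 * quad


x = [0.0, 1.0]
dx = [0.1, -0.05]
exact = f([x[0] + dx[0], x[1] + dx[1]])
err1 = abs(exact - taylor1(x, dx))
err2 = abs(exact - taylor2(x, dx))
print(err1, err2)
```

For this step the second-order error is roughly ten times smaller than the first-order error, at the cost of evaluating the Hessian.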

For the general search (or learning) algorithms introduced later, the first-order expansion is sufficient.

Using the Taylor series expansion, it is easy to show that $\nabla f$ is an ascent direction.
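A short version of that argument, written as a sketch under the assumptions that $f$ is continuously differentiable, $\nabla f(x) \neq 0$, and the step size $\alpha > 0$ (a symbol introduced here) is small, takes the step $\Delta x = \alpha \nabla f(x)$ in the first-order expansion:

```latex
\begin{aligned}
f\bigl(x + \alpha \nabla f(x)\bigr)
  &= f(x) + \alpha \, \nabla f(x)^T \nabla f(x) + o(\alpha) \\
  &= f(x) + \alpha \, \|\nabla f(x)\|^2 + o(\alpha) \\
  &> f(x) \quad \text{for all sufficiently small } \alpha > 0 .
\end{aligned}
```

The term $\alpha \|\nabla f(x)\|^2$ is strictly positive and dominates the $o(\alpha)$ remainder, so moving along $\nabla f(x)$ increases $f$; by the same argument, $-\nabla f(x)$ is a descent direction.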