This is the 8th day of my participation in the First Challenge 2022. For details: First Challenge 2022.


preface

Hello, friend! ଘ(੭, ᵕ)੭ Nickname: Haihong. Profile: programmer | C++ player | student. I got into programming through the C language, then transferred to a computer science major, and was fortunate to win some national and provincial awards; my postgraduate place has been confirmed. Currently learning C++/Linux/Python. Learning approach: solid foundations + more notes + more code + more thinking + learn English! I am still a beginner in machine learning; these articles are only my own study notes, written to build a knowledge system, to review, and to understand the why behind things.

Previous articles in this series

Matrix Theory for Machine Learning (1): Sets and Mappings

Matrix Theory for Machine Learning (2): Definitions and Properties of Linear Spaces

Matrix Theory for Machine Learning (3): Bases and Coordinates of Linear Spaces

Matrix Theory for Machine Learning (4): Basis Transformation and Coordinate Transformation

Matrix Theory for Machine Learning (5): Linear Subspaces

Matrix Theory for Machine Learning (6): Intersection and Sum of Subspaces

Matrix Theory for Machine Learning (7): Euclidean Spaces

Matrix Theory for Machine Learning (8): Orthonormal Bases and the Gram-Schmidt Process

Matrix Theory for Machine Learning (9): Orthogonal Complement and the Projection Theorem

Matrix Theory for Machine Learning (10): Definition of Linear Transformations

Matrix Theory for Machine Learning (11): Matrix Representation of Linear Transformations

Matrix Theory for Machine Learning (12): Approximation Theory

Matrix Theory for Machine Learning (13): Hamilton-Cayley Theorem, Minimum Polynomials

Matrix Theory for Machine Learning (14): Vector Norms and Their Properties

Matrix Theory for Machine Learning (15): Matrix Norms

Matrix Theory for Machine Learning (16): Limits of Vectors and Matrices

Matrix Theory for Machine Learning (17): Differentiation and Integration of Matrix Functions

Matrix Theory for Machine Learning (18): Power Series of Square Matrices

Matrix Theory for Machine Learning (19): Indefinite Integrals

5.4 Square matrix function

5.4.1 Definition of the square matrix function $f(A)$

The simplest square matrix function is a matrix polynomial


$$B=f(A)=a_0E+a_1A+\cdots+a_nA^n$$

where $A\in C^{n\times n}$ and $a_i\in C$.

That is, take the ordinary polynomial $f(x) = a_0 + a_1x + \cdots + a_n x^n$ and replace the variable $x$ by the matrix $A$ to get $f(A)=a_0E+a_1A+\cdots+a_nA^n$.
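As a minimal numerical sketch of this definition (assuming NumPy is available; the helper name `matrix_polynomial` is my own), a matrix polynomial can be evaluated with Horner's scheme:

```python
import numpy as np

def matrix_polynomial(coeffs, A):
    """Evaluate f(A) = a_0 E + a_1 A + ... + a_n A^n via Horner's scheme.
    coeffs = [a_0, a_1, ..., a_n]."""
    n = A.shape[0]
    result = np.zeros_like(A, dtype=float)
    for a in reversed(coeffs):
        # Horner: result <- result * A + a * E
        result = result @ A + a * np.eye(n)
    return result

A = np.array([[0.0, 1.0],
              [0.0, -2.0]])
# f(x) = 1 + 2x + 3x^2  ->  f(A) = E + 2A + 3A^2
print(matrix_polynomial([1.0, 2.0, 3.0], A))
```

Horner's scheme avoids forming each power $A^k$ separately, costing one matrix multiplication per coefficient.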

5.4.2 Computing the square matrix function $f(A)$ from the standard form of $A$

Theorem 5.4.1

If the power series $\sum_{k=0}^{\infty}a_kX^{k}$ of a square matrix $X \in C^{n\times n}$ converges, write

$$f(X)=\sum_{k=0}^{\infty}a_kX^{k}$$

Then when

$$X = \mathrm{diag}(X_1,X_2,\ldots,X_t)$$

we have

$$f(X)=f(\mathrm{diag}(X_1,X_2,\ldots,X_t))=\mathrm{diag}(f(X_1),f(X_2),\ldots,f(X_t))$$

Here $\mathrm{diag}(\cdot)$ denotes a (block-)diagonal matrix.

If $X$ can be brought to block-diagonal form, computing $f(X)$ reduces to computing $\mathrm{diag}(f(X_1),f(X_2),\ldots,f(X_t))$ block by block.
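This blockwise property is easy to check numerically, here with $f = \exp$ as a concrete choice of function (a sketch, assuming NumPy and SciPy are available):

```python
import numpy as np
from scipy.linalg import expm, block_diag

# Two small blocks on the diagonal of X
X1 = np.array([[1.0, 2.0],
               [0.0, 3.0]])
X2 = np.array([[-1.0]])
X = block_diag(X1, X2)          # X = diag(X1, X2)

# f = exp: f(diag(X1, X2)) should equal diag(f(X1), f(X2))
lhs = expm(X)
rhs = block_diag(expm(X1), expm(X2))
print(np.allclose(lhs, rhs))    # True
```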

Theorem 5.4.2

Given any complex power series with radius of convergence $R$,


$$f(z)=\sum_{k=0}^{\infty}a_kz^{k}$$

and an $n$-th order Jordan block

$$J_0=\begin{bmatrix} \lambda_0 & 1 & & \\ & \lambda_0 & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_0 \end{bmatrix}$$

When $|\lambda_0| < R$, the series

$$\sum_{k=0}^{\infty}a_kJ_{0}^{k}$$

converges absolutely, and

$$f(J_0)=\begin{bmatrix} f(\lambda_0) & \dfrac{f'(\lambda_0)}{1!} & \cdots & \dfrac{f^{(n-1)}(\lambda_0)}{(n-1)!} \\ & f(\lambda_0) & \ddots & \vdots \\ & & \ddots & \dfrac{f'(\lambda_0)}{1!} \\ & & & f(\lambda_0) \end{bmatrix}$$

If $X$ cannot be diagonalized but can be brought to Jordan form, $f(X)$ can still be computed blockwise using the matrix shown above.

Using the standard form of $A$ to compute the square matrix function $f(A)$

When $A$ is similar to a diagonal matrix, there exists an invertible matrix $P$ such that

$$A=P\,\mathrm{diag}(\lambda_1,\lambda_2,\ldots,\lambda_n)\,P^{-1}$$

For the complex power series

$$f(z)=\sum_{k=0}^{\infty}a_{k}z^{k},\quad |z|<R$$

when $\rho(A)<R$ (the spectral radius of $A$ is less than the radius of convergence), the series

$$f(A)=\sum_{k=0}^{\infty}a_kA^{k}$$

converges, and

$$f(A)=f(P\,\mathrm{diag}(\lambda_1,\lambda_2,\ldots,\lambda_n)\,P^{-1})=P\,\mathrm{diag}(f(\lambda_1),f(\lambda_2),\ldots,f(\lambda_n))\,P^{-1}$$


Example: Let $A=\begin{bmatrix} 0&1 \\ 0&-2 \end{bmatrix}$; find $e^{A}$, $\sin(A)$, $\cos(A)$.
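A numerical sketch of this example via the diagonalization recipe above (assuming NumPy and SciPy; the helper name `f_of_A` is my own):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [0.0, -2.0]])

# A has distinct eigenvalues 0 and -2, so it is diagonalizable:
# A = P diag(lam) P^{-1}
lam, P = np.linalg.eig(A)

def f_of_A(f):
    """Apply a scalar function f through the eigenvalues:
    f(A) = P diag(f(lam)) P^{-1}."""
    # P * f(lam) scales column j of P by f(lam[j]), i.e. P @ diag(f(lam))
    return (P * f(lam)) @ np.linalg.inv(P)

# e^A via the eigen-decomposition matches SciPy's expm
print(np.allclose(f_of_A(np.exp), expm(A)))   # True

# sin(A) and cos(A) follow the same recipe
sinA = f_of_A(np.sin)
cosA = f_of_A(np.cos)
```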


The square matrix functions met in practice are usually not functions of a constant matrix $A$ alone, but functions of a scalar variable $t$ through the matrix $At$, e.g. the square matrix functions $e^{At}$, $\sin(At)$, $\cos(At)$.

When $A$ is not similar to a diagonal matrix, there still exists an invertible matrix $P$ bringing $A$ to Jordan form, $A=P\,\mathrm{diag}(J_1,J_2,\ldots,J_s)\,P^{-1}$, so that $f(A)=P\,\mathrm{diag}(f(J_1),f(J_2),\ldots,f(J_s))\,P^{-1}$.


Example: Let $A=\begin{bmatrix} 0&1&0 \\ 0&0&1 \\ 2&3&0 \end{bmatrix}$; find $e^{A}$.
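This matrix has eigenvalues $-1,-1,2$ and is not diagonalizable, so the Jordan-form route is needed. A simple numerical cross-check of any computed $e^{A}$ is to compare a truncated partial sum of the defining power series against SciPy's `expm` (a sketch, assuming NumPy and SciPy):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [2.0, 3.0, 0.0]])

# Partial sum of the defining power series e^A = sum_k A^k / k!
S = np.zeros_like(A)
term = np.eye(3)                 # current term A^k / k!, starting at k = 0
for k in range(1, 40):
    S += term
    term = term @ A / k          # next term: (A^k / k!) = (A^{k-1}/(k-1)!) A / k

print(np.allclose(S, expm(A)))   # True
```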

If $e^{At}$ is wanted instead, the same procedure applies with $f(z)=e^{zt}$.

5.4.3 Computing the square matrix function $f(A)$ from the values of $f(z)$ on the spectrum of $A$

Let $h(\lambda)$ be a polynomial of finite degree, and let $m(\lambda)$ be the minimum polynomial of the square matrix $A$ (with $\deg[m(\lambda)] = t$). Dividing $h(\lambda)$ by $m(\lambda)$ gives a quotient $g(\lambda)$ and a remainder $r(\lambda)$:


$$h(\lambda)=m(\lambda)g(\lambda)+r(\lambda)$$

with $\deg[r(\lambda)] \leq t-1$, or $r(\lambda)=0$.

$\deg[m(\lambda)]$ is the degree of the polynomial $m(\lambda)$; e.g. if $m(\lambda)=4\lambda^2+\lambda+3$, then $\deg[m(\lambda)]=2$.

Since $m(A)=0$, we have

$$h(A)=m(A)g(A)+r(A)=r(A)$$

Note that any polynomial $h(A)$ of the square matrix $A$ can always be represented as a polynomial $r(A)$ in $A$ of degree at most $t-1$, where $t$ is the degree of the minimum polynomial $m(\lambda)$ of $A$.

In other words, any finite-degree polynomial $h(A)$ of the square matrix $A$ can be linearly represented by $E,A,\ldots,A^{t-1}$, and $E,A,\ldots,A^{t-1}$ are linearly independent.

$r(A)$ is unique.


Because the power series $f(A)=\sum_{k=0}^{\infty}a_kA^{k}$ of the square matrix function converges, every $A^{k}$ $(k\geq t)$ can be expressed as a polynomial in $A$ of degree not exceeding $t-1$.

Then $f(A)$ can be expressed as a square matrix polynomial $T(A)$ in $A$ of degree $t-1$.

Since every $A^k$ can be expressed as a polynomial of degree at most $t-1$, the sum $a_0E+a_1A+\cdots+a_kA^{k}+\cdots$ must also reduce to a polynomial of degree at most $t-1$.
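This reduction can be seen concretely on the earlier $2\times 2$ example matrix. Its minimum polynomial is $m(\lambda)=\lambda^2+2\lambda$, so $t=2$ (this is my own computation, verified numerically below, assuming NumPy):

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, -2.0]])

# Minimal polynomial of A: m(lam) = lam^2 + 2*lam  (t = 2),
# so m(A) = A^2 + 2A = 0, i.e. A^2 = -2A.
print(np.allclose(A @ A + 2 * A, 0))        # True

# Hence every power A^k (k >= t) reduces to a combination of E and A,
# e.g. A^3 = A^2 @ A = -2 A^2 = 4A:
print(np.allclose(np.linalg.matrix_power(A, 3), 4 * A))  # True
```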

Definition 5.9

Let the minimum polynomial of the $n$-th order square matrix $A$ be


$$m(\lambda)=(\lambda-\lambda_1)^{t_1}(\lambda-\lambda_2)^{t_2}\cdots(\lambda-\lambda_s)^{t_s}$$

where $\lambda_1,\lambda_2,\ldots,\lambda_s$ are the distinct characteristic roots of $A$.


If the complex function $f(z)$ and its derivatives $f^{(l)}(z)$ have finite values at $z=\lambda_i$ $(i=1,2,\ldots,s)$, i.e. the values

$$f^{(l)}(\lambda_i)\quad (l=0,1,\ldots,t_i-1;\ i=1,2,\ldots,s)$$

are all finite, then the function $f(z)$ is said to be defined on the spectrum of the square matrix $A$, and these values are called the values of $f(z)$ on the spectrum of $A$.

Theorem 5.4.3

Let the minimum polynomial of $A\in C^{n\times n}$ be


$$m(\lambda)=(\lambda-\lambda_1)^{t_1}(\lambda-\lambda_2)^{t_2}\cdots(\lambda-\lambda_s)^{t_s}$$

with $t_1+t_2+\cdots+t_s=t$ and $\lambda_i\neq\lambda_j$ $(i \neq j;\ i,j = 1,2,\ldots,s)$, where $t = \deg[m(\lambda)]$.

Let the square matrix function $f(A)$ be the sum function of the convergent square matrix power series $\sum_{k=0}^{\infty}a_kA^{k}$:

$$f(A)=\sum_{k=0}^{\infty}a_kA^{k}$$

Set

$$T(\lambda)=b_0+b_1\lambda+\cdots+b_{t-1}\lambda^{t-1}$$

such that $T(\lambda)$ and $f(\lambda)$ agree on the spectrum of $A$, i.e.

$$T^{(l)}(\lambda_i)=f^{(l)}(\lambda_i)\quad (l=0,1,\ldots,t_i-1;\ i=1,2,\ldots,s)$$

Then

$$T(A)=f(A)=\sum_{k=0}^{\infty}a_kA^k$$


Example: Let $A=\begin{bmatrix} 0&1 \\ 0&-2 \end{bmatrix}$; compute $e^{Az}$.

Example: Let $A=\begin{bmatrix} 5&-4 \\ 4&-3 \end{bmatrix}$; compute $A^{100}$.
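A sketch of this example using the spectral-value method (assuming NumPy; the minimum polynomial $(\lambda-1)^2$ is my own computation): with $f(\lambda)=\lambda^{100}$, the conditions $T(1)=f(1)=1$ and $T'(1)=f'(1)=100$ give $T(\lambda)=-99+100\lambda$, so $A^{100}=-99E+100A$:

```python
import numpy as np

A = np.array([[5, -4],
              [4, -3]])
E = np.eye(2, dtype=int)

# Minimum polynomial: (lam - 1)^2, so t = 2 and T(lam) = b0 + b1*lam
# must satisfy T(1) = f(1) and T'(1) = f'(1) for f(lam) = lam^100:
#   b0 + b1 = 1,  b1 = 100  =>  b1 = 100, b0 = -99
A100 = -99 * E + 100 * A

# Cross-check against brute-force repeated squaring
print(np.array_equal(A100, np.linalg.matrix_power(A, 100)))  # True
```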

conclusion

Description:

  • Based on the textbook *Matrix Theory*
  • The book's concepts are combined with some of my own understanding and thinking

This essay is just a study note, recording a process from 0 to 1.

I hope it helps you a little; if there are mistakes, corrections are welcome.