Homogeneous systems
$$b_{11}x_1 + b_{12}x_2 + \dots + b_{1n}x_n = 0 \\
b_{21}x_1 + b_{22}x_2 + \dots + b_{2n}x_n = 0 \\
\vdots \\
b_{p1}x_1 + b_{p2}x_2 + \dots + b_{pn}x_n = 0$$
Properties:
- Has at least one solution, the trivial solution [0, 0, ..., 0].
- If it has a non-zero solution, then it has infinitely many solutions (any scalar multiple of a non-zero solution is again a solution).
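As a quick check of these properties, here is a minimal NumPy sketch. The coefficient matrix `B` below is a made-up rank-2 example; the right-singular vectors with zero singular value solve \(Bx = 0\), and any scalar multiple of such a solution is again a solution.

```python
import numpy as np

# Hypothetical coefficient matrix B (rank 2), so Bx = 0 has non-zero solutions.
B = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

# Right-singular vectors with (numerically) zero singular value span the null space.
_, s, Vt = np.linalg.svd(B)
x = Vt[s < 1e-10][0]

print(np.allclose(B @ x, 0))        # True: a non-zero solution exists
print(np.allclose(B @ (5 * x), 0))  # True: every scalar multiple is also a solution
```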
Inverse of a matrix
A square matrix \(A\) has an inverse if and only if any one of the following equivalent conditions holds:
- \(\det(A)\neq 0\)
- The reduced row echelon form of \(A\) is the identity matrix.
- \(A\) has full rank.
- The homogeneous equation \(Ax=0\) has only the trivial solution \(x = 0\).
- The equation \(Ax = b\) has a unique solution for every \(b\).
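A minimal NumPy sketch of these criteria, using a made-up \(2\times 2\) matrix: the determinant is non-zero, the rank is full, \(Ax = b\) has a unique solution, and the inverse exists.

```python
import numpy as np

# Hypothetical invertible 2x2 matrix.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

print(np.linalg.det(A) != 0)              # non-zero determinant
print(np.linalg.matrix_rank(A) == 2)      # full rank

b = np.array([1.0, 4.0])
x = np.linalg.solve(A, b)                 # the unique solution of Ax = b
print(np.allclose(A @ x, b))              # True

A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))  # A A^{-1} = I
```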
Eigenvalues and eigenvectors
An eigenvector of a matrix is a non-zero vector that the matrix maps to a scalar multiple of itself; the scalar is the corresponding eigenvalue.
The eigenvalue equation for a matrix \(A\) is \(Av-\lambda v = 0\), which is equivalent to \((A - \lambda I)v = 0\).
This equation has non-zero solutions if and only if \(\det(A - \lambda I) = 0\), i.e. \((A - \lambda I)\) is singular (not invertible). An eigenbasis makes for a good coordinate system, because the matrix acts on each eigenvector by simple scaling.
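To make this concrete, here is a small NumPy sketch with a made-up matrix: `np.linalg.eig` returns eigenvalue/eigenvector pairs, each pair satisfies \(Av = \lambda v\), and \(\det(A - \lambda I)\) is (numerically) zero at each eigenvalue.

```python
import numpy as np

# Hypothetical matrix with eigenvalues 5 and 2.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, eigvecs = np.linalg.eig(A)   # eigenvectors are the columns of eigvecs

for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))                        # Av = lambda v
    print(np.isclose(np.linalg.det(A - lam * np.eye(2)), 0))  # det(A - lambda I) = 0
```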
Theorems
Number of eigenvalues of a matrix
Suppose that A is a square matrix of size n with distinct eigenvalues \(\lambda_1, \lambda_2, \lambda_3,\dots,\lambda_k\). Then \(\sum_{i=1}^k\alpha_A(\lambda_i) = n\), where \(\alpha_A(\lambda_i)\) is the algebraic multiplicity of \(\lambda_i\).
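For example (a minimal NumPy sketch with a made-up \(3\times 3\) matrix): `np.linalg.eigvals` returns eigenvalues repeated according to algebraic multiplicity, and their count equals \(n\).

```python
import numpy as np

# Hypothetical matrix: eigenvalue 2 with algebraic multiplicity 2, eigenvalue 3 with multiplicity 1.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])

eigvals = np.linalg.eigvals(A)     # repeated according to algebraic multiplicity
print(np.sort(eigvals))            # [2. 2. 3.]
print(len(eigvals) == A.shape[0])  # True: the multiplicities sum to n = 3
```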
Maximum number of eigenvalues of a matrix
Suppose that A is a square matrix of size n. Then A cannot have more than n distinct eigenvalues.
Spectral theorem
Consider a Hermitian map A on a finite-dimensional real or complex inner product space V endowed with a positive definite Hermitian inner product. The Hermitian condition means that \(\langle Av, w \rangle = \langle v, Aw \rangle\) for all \(v, w \in V\).
An equivalent condition is that \(A^* = A\), where \(A^*\) is the Hermitian conjugate of A. In the case that A is identified with a Hermitian matrix (one which is equal to its own conjugate transpose), the matrix of \(A^*\) can be identified with its conjugate transpose. If A is a real matrix, this is equivalent to \(A^T = A\) (that is, A is a symmetric matrix).
Theorem. There exists an orthonormal basis of V consisting of eigenvectors of A. Each eigenvalue is real.
So in less precise terms, and considering only real numbers: if A is a symmetric matrix, its eigenvalues are real and its eigenvectors can be chosen to form an orthonormal basis.
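A minimal NumPy sketch of the real case, with a made-up symmetric matrix: `np.linalg.eigh` (intended for symmetric/Hermitian matrices) returns real eigenvalues and an orthonormal set of eigenvectors.

```python
import numpy as np

# Hypothetical real symmetric matrix.
S = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

eigvals, Q = np.linalg.eigh(S)                     # eigenvectors are the columns of Q

print(eigvals)                                     # real eigenvalues
print(np.allclose(Q.T @ Q, np.eye(3)))             # True: orthonormal eigenvectors
print(np.allclose(Q @ np.diag(eigvals) @ Q.T, S))  # True: S = Q diag(lambda) Q^T
```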
Principal component analysis
One of the applications involving eigenvalues and eigenvectors is PCA. It transforms a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The first principal component has the largest possible variance (that is, it accounts for as much of the variability in the data as possible).
PCA is used in the eigenface technique for face recognition.
One question one would ask is why the eigenvectors of the covariance matrix \(\textbf{V}\) of the data are the principal components. By definition, the first principal component \(\textbf{w}\) has the largest possible variance. If we project all the data onto this direction, the variance of the resulting data is \(\textbf{w}^T\textbf{Vw}\). So we want to choose a unit vector \(\textbf{w}\) that maximizes this variance. Note that we need to constrain the maximization, otherwise the objective function has no maximum. The constraint is that \(\textbf{w}\) is a unit vector, so that \(\textbf{w}^T\textbf{w} = 1\). To do constrained optimization, we use a Lagrange multiplier. The Lagrange function is thus
\begin{align}L(\textbf{w}, \lambda) &= \textbf{w}^T\textbf{V}\textbf{w} - \lambda(\textbf{w}^T\textbf{w} - 1)\\
\frac{\partial L}{\partial \textbf{w}} &= 2\textbf{Vw} - 2\lambda\textbf{w} = 0 \\
\textbf{Vw} &= \lambda\textbf{w} \\
\textbf{AA}^T\textbf{w} &= \lambda\textbf{w}
\end{align}
This means the maximizing vector is the eigenvector of \(\textbf{V}\) with the largest eigenvalue, since for a unit eigenvector the variance is \(\textbf{w}^T\textbf{Vw} = \lambda\). (In the last line the covariance is written as \(\textbf{AA}^T\), where \(\textbf{A}\) is the centered data matrix, up to a constant factor.) The principal component vector is also a linear combination of the original variables.
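Here is a small NumPy sketch of this result on made-up correlated 2-D data: the variance of the data projected onto the top eigenvector of the covariance matrix equals the largest eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical correlated 2-D data (500 observations x 2 variables), then centered.
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0],
                                          [1.5, 0.5]])
X = X - X.mean(axis=0)

V = np.cov(X, rowvar=False)            # covariance matrix of the two variables
eigvals, eigvecs = np.linalg.eigh(V)   # eigenvalues in ascending order

w = eigvecs[:, -1]                     # unit eigenvector with the largest eigenvalue
projected = X @ w                      # data projected onto the first principal component

print(np.isclose(projected.var(ddof=1), eigvals[-1]))  # projected variance = largest eigenvalue
print(np.isclose(w @ V @ w, eigvals[-1]))              # w^T V w = lambda_max
```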
Singular value decomposition
The singular value decomposition of a matrix \(M\) can be expressed as
$$M = U\Sigma V^*$$
The columns of \(V\) are eigenvectors of \(M^*M\). If \(M\) is positive semi-definite, the eigenvalue decomposition of \(M\) is the same as its singular value decomposition. However, the eigenvalue decomposition and the singular value decomposition differ for all other matrices \(M\).
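A minimal NumPy sketch with a made-up \(3\times 2\) matrix: `np.linalg.svd` recovers \(M = U\Sigma V^*\), and each column of \(V\) is an eigenvector of \(M^*M\) with eigenvalue \(\sigma^2\).

```python
import numpy as np

# Hypothetical 3x2 real matrix.
M = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

U, s, Vt = np.linalg.svd(M, full_matrices=False)  # rows of Vt are the columns of V

print(np.allclose(U @ np.diag(s) @ Vt, M))        # M = U Sigma V^*

MtM = M.T @ M                                     # M^* M (= M^T M for a real matrix)
for sigma, v in zip(s, Vt):
    print(np.allclose(MtM @ v, sigma**2 * v))     # each v is an eigenvector with eigenvalue sigma^2
```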