Theorem. Let $n \in \mathbb{N}$ and let $A$ be an $n\!\times\!n$ matrix. The matrix $A$ is diagonalizable if and only if there exists a basis of $\mathbb{R}^n$ which consists of eigenvectors of $A.$
Recall that the eigenspaces corresponding to these three eigenvalues are \[ \operatorname{Nul}( A - 5 I_4), \quad \operatorname{Nul}( A - 3 I_4), \quad \operatorname{Nul}( A - 1 I_4). \] It follows from item a. in Theorem 7 that \begin{alignat*}{2} 1 &\leq \dim \operatorname{Nul}( A - 5 I_4) & & \leq 2, \\ 1 &\leq \dim \operatorname{Nul}( A - 3 I_4) & & \leq 1, \\ 1 &\leq \dim \operatorname{Nul}( A - 1 I_4) & & \leq 1. \end{alignat*} Consequently, \[ \dim \operatorname{Nul}( A - 3 I_4) = 1, \quad \dim \operatorname{Nul}( A - 1 I_4 ) = 1. \] In other words, each of these two eigenspaces is spanned by a single eigenvector.
For the eigenspace \(\operatorname{Nul}( A - 5 I_4)\) we have two options \[ \dim \operatorname{Nul}( A - 5 I_4) = 1, \quad \text{or} \quad \dim \operatorname{Nul}( A - 5 I_4) = 2. \]
By item b. in Theorem 7: The given matrix \(A\) is diagonalizable if and only if the sum of the dimensions of the eigenspaces equals \(4\).
Since \[ 2 + 1 + 1 = 4, \] the given matrix \(A\) is diagonalizable if and only if \[ \dim \operatorname{Nul}( A - 5 I_4) = 2. \] Recall that the dimension of a null space of a matrix is the number of nonpivot columns in the RREF of that matrix. See Section 4.5, page 230. So, we need to choose \(x \in \mathbb{R}\) such that the Reduced Row Echelon Form of the matrix \[ \begin{bmatrix} 5 & x & -2 & 1 \\ 0 & 3 & x & 2 \\ 0 & 0 & 5 & 3 \\ 0 & 0 & 0 & 1 \end{bmatrix} - 5 \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & x & -2 & 1 \\ 0 & -2 & x & 2 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & -4 \end{bmatrix} \] has two free variables.
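If you want to experiment with specific values of \(x\), here is a minimal SymPy sketch (not part of the original assignment; the value \(x=2\) below is just a hypothetical candidate to test, not the answer). It computes the dimensions of the three eigenspaces and asks SymPy directly whether the matrix is diagonalizable.

```python
import sympy as sp

x = sp.Integer(2)              # hypothetical candidate value of x to test
A = sp.Matrix([[5, x, -2, 1],
               [0, 3,  x, 2],
               [0, 0,  5, 3],
               [0, 0,  0, 1]])
for lam in (5, 3, 1):
    dim = len((A - lam * sp.eye(4)).nullspace())   # dim Nul(A - lam I_4)
    print("dim Nul(A -", lam, "I_4) =", dim)
print("A is diagonalizable:", A.is_diagonalizable())
```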
A simpler problem in the same spirit is as follows: Consider the matrix \[ \begin{bmatrix} 1 & x & 4 \\ 0 & 2 & x \\ 0 & 0 & 1 \end{bmatrix} \] where \(x \in \mathbb{R}\). Find all \(x \in \mathbb{R}\) such that this matrix is diagonalizable.
For this matrix, you can choose a specific value of \(x\), such as \(x=0\), and calculate the eigenvalues and eigenvectors to assess whether the matrix is diagonalizable. Next, you can determine the values of \(x\) that make the matrix diagonalizable and compute its diagonalization. While these tasks can also be performed by hand for the given \(4\times 4\) matrices, they are significantly more time-consuming.
Below I want to present a change of coordinates matrix in a vector space of polynomials which requires only the Binomial Theorem. The Binomial Theorem is the theorem that you might have seen in a college algebra class: \begin{align*} (u+v)^1 & = u+v, \\ (u+v)^2 & = u^2+2\mkern 2mu u v + v^2, \\ (u+v)^3 & = u^3+ 3\mkern 2mu u^2 v + 3\mkern 2mu u v^2 + v^3,\\ (u+v)^4 & = u^4+ 4\mkern 2mu u^3 v + 6\mkern 2mu u^2 v^2 + 4\mkern 2mu u v^3 + v^4,\\ (u+v)^5 & = u^5+ 5\mkern 2mu u^4 v + 10\mkern 2mu u^3 v^2 + 10\mkern 2mu u^2 v^3 + 5\mkern 2mu u v^4 + v^5, \end{align*} and so on.
We do not need the general version of the Binomial Theorem here. But, since we mentioned it, I want to introduce you to the important concepts related to the Binomial Theorem: the concept of factorial, the concept of a binomial coefficient, and, most importantly, the concept of recursion. I write more about these in the last items in today's post.
In general, if $n \in \mathbb{N}$ we have \begin{align*} (u+v)^n & = \sum_{k=0}^n \binom{n}{k} \mkern 2mu u^{n-k} v^k \\ & = u^n + n \mkern 2mu u^{n-1} v + \frac{n(n-1)}{2}\mkern 2mu u^{n-2} v^2 + \cdots + \frac{n(n-1)}{2}\mkern 2mu u^{2} v^{n-2} + n\mkern 2mu u v^{n-1} + v^n. \end{align*}
In the above formula, for \( n, k \in \{0\}\cup\mathbb{N}\) with \(k \leq n \), the symbol \( \displaystyle \binom{n}{k} \) (read as "n choose k") denotes the binomial coefficient. The definition is:
\[ \binom{n}{k} = \frac{n!}{k! \, (n-k)!}, \]
where for \( m \in \mathbb{N} \), \( m! \) (read as "m factorial") is the product of all positive integers up to \( m \). By convention \( 0! = 1 \).
The Base Case: \(0!=1\)
The Recursive Step: For all \(m\in\mathbb{N}\) we set \(m! = \bigl( (m-1)! \bigr) \mkern 2px m\).
For more details, see Factorial.
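The base case and the recursive step translate directly into a short program. Here is a minimal Python sketch (my own illustration, not from the post):

```python
def factorial(m):
    """Factorial defined by the recursion above."""
    if m == 0:                    # the base case: 0! = 1
        return 1
    return factorial(m - 1) * m   # the recursive step: m! = ((m-1)!) * m

print([factorial(m) for m in range(6)])   # [1, 1, 2, 6, 24, 120]
```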
The recursive definition of the binomial coefficients is as follows:
The Base Case: \begin{equation*} \text{For all} \ \ n \in \{0\}\cup\mathbb{N} \quad \text{we set} \quad \binom{n}{0} = 1 \quad \text{and} \quad \binom{n}{n} = 1. \end{equation*} The Recursive Step: \begin{equation*} \text{For all} \ \ n \in \mathbb{N} \ \ \text{and} \ \ k \in \{1,\ldots,n\} \quad \text{we set} \quad \binom{n+1}{k} = \binom{n}{k-1} + \binom{n}{k}. \end{equation*} In each line below, applying the recursive step with specific values of \(n\) and \(k\), together with the previously evaluated binomial coefficients (that is why it is called a recursion; see the next item below), yields:
\begin{alignat*}{2} &\text{For } n=2, \ k=1 \qquad &&\binom{2}{1} = \binom{1}{0} + \binom{1}{1} = 1 + 1 = 2, \\ &\text{For } n=3, \ k=1 &&\binom{3}{1} = \binom{2}{0} + \binom{2}{1} = 1 + 2 = 3, \\ &\text{For } n=3, \ k=2 &&\binom{3}{2} = \binom{2}{1} + \binom{2}{2} = 2 + 1 = 3, \\ &\text{For } n=4, \ k=1 &&\binom{4}{1} = \binom{3}{0} + \binom{3}{1} = 1 + 3 = 4, \\ &\text{For } n=4, \ k=2 &&\binom{4}{2} = \binom{3}{1} + \binom{3}{2} = 3 + 3 = 6, \\ &\text{For } n=4, \ k=3 &&\binom{4}{3} = \binom{3}{2} + \binom{3}{3} = 3 + 1 = 4, \\ &\text{For } n=5, \ k=1 &&\binom{5}{1} = \binom{4}{0} + \binom{4}{1} = 1 + 4 = 5, \\ &\text{For } n=5, \ k=2 &&\binom{5}{2} = \binom{4}{1} + \binom{4}{2} = 4 + 6 = 10, \\ &\text{For } n=5, \ k=3 &&\binom{5}{3} = \binom{4}{2} + \binom{4}{3} = 6 + 4 = 10, \\ &\text{For } n=5, \ k=4 &&\binom{5}{4} = \binom{4}{3} + \binom{4}{4} = 4 + 1 = 5, \\ & & & \quad \quad \mkern 12px \vdots \end{alignat*}
For more details about this recursion, see Pascal's triangle.
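The same recursion can be coded directly. The sketch below (my own illustration) uses the equivalent form \(\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}\) of the recursive step:

```python
def binom(n, k):
    """Binomial coefficient via Pascal's recursion."""
    if k == 0 or k == n:                           # the base case
        return 1
    return binom(n - 1, k - 1) + binom(n - 1, k)   # the recursive step

print([binom(5, k) for k in range(6)])   # [1, 5, 10, 10, 5, 1]
```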
The most important tool when working with finite-dimensional abstract vector spaces is the concept of a coordinate mapping introduced in Section 4.4 on page 221. Theorem 8 on page 221 and Problems 23-26 on page 225 provide theoretical background on how a coordinate mapping works. How to use a coordinate mapping is explained in Examples 5 and 6.
To use a coordinate mapping on a vector space we need to know a basis for that vector space.
The standard basis for the vector space $\mathbb{P}_3$ of polynomials is the set of all monomials: \[ \mathcal{M} =\bigl\{ 1, \ x, \ x^2, \ x^3 \bigr\}. \] The corresponding coordinate mapping is \[ \bigl[a_0 + a_1 x + a_2 x^2 + a_3 x^3 \bigr]_{\mathcal{M}} = \left[\!\begin{array}{c} a_0 \\ a_1 \\ a_2 \\ a_3 \end{array}\!\right] \in \mathbb{R}^4. \]
The standard basis for the vector space $\mathbb{R}^{2\times 2}$ of $2\!\times\!2$ matrices is the set of matrices: \[ \mathcal{S} = \left\{ \left[\!\begin{array}{cc} 1 & 0 \\ 0 & 0 \end{array}\!\right], \left[\!\begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array}\!\right], \left[\!\begin{array}{cc} 0 & 0 \\ 1 & 0 \end{array}\!\right], \left[\!\begin{array}{cc} 0 & 0 \\ 0 & 1 \end{array}\!\right] \right\}. \] The corresponding coordinate mapping is \[ \Biggl[ \left[\!\begin{array}{cc} a & b \\ c & d \end{array}\!\right] \Biggr]_{\mathcal{S}} = \left[\!\begin{array}{c} a \\ b \\ c \\ d \end{array}\!\right] \in \mathbb{R}^4. \]
Here we found out that the identity matrix $I_2$ commutes with $A$, which is trivial. The identity matrix commutes with any matrix. Not only that, any scaled identity matrix commutes with any matrix. We also "discovered" that the matrix \(A\) commutes with the matrix \(A\). This is nothing new: every square matrix commutes with itself.
The novelty here is that we discovered that every matrix which commutes with \( \left[\!\begin{array}{cc} 0 & 1 \\ 2 & 3 \end{array}\!\right]\) is a linear combination of \( \left[\!\begin{array}{cc} 0 & 1 \\ 2 & 3 \end{array}\!\right]\) and \( \left[\!\begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array}\!\right]\).
I conjecture that what we discovered for the specific matrix \( \left[\!\begin{array}{cc} 0 & 1 \\ 2 & 3 \end{array}\!\right]\) is true for every nonzero \(2\times 2\) matrix \(A\) which is not a multiple of identity.
Conjecture. If \(A\) is a nonzero \(2\times 2\) matrix such that \(A \neq a I_2\) for all \(a\in \mathbb{R}\), then \[ \mathcal{C}_A = \bigl\{X \in \mathbb{R}^{2\times 2} : AX = XA \bigr\} = \operatorname{Span}\bigl\{A, I_2\bigr\} \]
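For the specific matrix \(\left[\!\begin{array}{cc} 0 & 1 \\ 2 & 3 \end{array}\!\right]\) the claim can be checked symbolically. Here is a SymPy sketch (my own check, not part of the original post):

```python
import sympy as sp

A = sp.Matrix([[0, 1], [2, 3]])
a, b, c, d = sp.symbols('a b c d')
X = sp.Matrix([[a, b], [c, d]])
eqs = list(A * X - X * A)                 # the four entries of AX - XA
print(sp.solve(eqs, [a, b, c, d], dict=True))
# The solution is a two-parameter family: c = 2*b, d = a + 3*b,
# that is, X = a*I_2 + b*A, in agreement with Span{A, I_2}.
```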
In Example 5 above, let \(\mathbb{D} = \mathbb{R}\), the set of real numbers. That is, let \(\mathcal{V}\) be the set of all real-valued functions defined on \(\mathbb{R}\). (Notice that my notation for a vector space is the calligraphic uppercase \(\mathcal{V}\). I choose calligraphic uppercase letters for vector spaces since uppercase letters are reserved for matrices and transformations.)
Problem. Consider the following subset of \(\mathcal{V}\): \[ \mathcal{S}_1 = \Bigl\{ \mathbf{f} \in \mathcal{V} : \text{for some} \ \ a,b \in \mathbb{R} \ \ \text{we have} \ \ \mathbf{f}(t) = a \sin(t + b) \Bigr\}. \] Prove that \(\mathcal{S}_1\) is a subspace and determine its dimension.
Which functions are in $\mathcal{S}_1?$ For example, with $a=0$ and $b=0$, the function $\mathbf{f}(t) = 0$ for all \(t\in \mathbb{R}\) is in $\mathcal{S}_1$. With $a=1$ and $b=0$, the function $\sin(t)$ is in the set $\mathcal{S}_1$. With $a=1$ and $b=\pi/2$, the function $\sin(t+\pi/2) = \cos(t)$ is in the set $\mathcal{S}_1$. One can continue with specific values of $a$ and $b$ and plot a few individual functions. However, using technology one can plot many functions in $\mathcal{S}_1$.
Below I present 180 functions from $\mathcal{S}_1$ with the coefficients \begin{align*} a & \in \left\{\frac{1}{6}, \frac{1}{3}, \frac{1}{2}, \frac{2}{3}, \frac{5}{6}, 1, \frac{7}{6}, \frac{4}{3}, \frac{3}{2}, \frac{5}{3}, \frac{11}{6},2, \frac{13}{6}, \frac{7}{3}, \frac{5}{2} \right\}, \\ b & \in \left\{ 0, \frac{\pi}{6},\frac{\pi}{3},\frac{\pi}{2},\frac{2\pi}{3}, \frac{5\pi}{6}, \pi, \frac{7\pi}{6},\frac{4\pi}{3},\frac{3\pi}{2},\frac{5\pi}{3}, \frac{11\pi}{6} \right\}. \end{align*}
Place the cursor over the image to see individual functions.
The inclusion \(\mathcal{S}_1 \subseteq \operatorname{Span}\Bigl\{ \sin(t), \cos(t) \Bigr\}\) is proved by proving that every element of the set on the left is an element of the set on the right.
Let \(\mathbf{f} \in \mathcal{S}_1\) be arbitrary. By the definition of \(\mathcal{S}_1\) there exist \(a,b \in \mathbb{R}\) such that \[ \mathbf{f}(t) = a \sin(t + b). \] Recall Angle sum and difference identities on Wikipedia, specifically \[ \sin(x+y) = \sin(x) \cos(y) + \cos(x) \sin(y). \] Using this identity we have \begin{align*} \mathbf{f}(t) & = a \sin(t + b) \\ & = a \bigl( \sin(t) \cos(b) + \cos(t) \sin(b) \bigr) \\ & = \bigl(a \cos(b) \bigr) \sin(t) + \bigl(a \sin(b) \bigr) \cos(t). \end{align*}
Setting \(\alpha = a \cos(b)\) and \(\beta = a \sin(b)\) we get \[ \mathbf{f}(t) = \alpha \sin(t) + \beta \cos(t); \] that is, \(\mathbf{f}(t)\) is a linear combination of \(\sin(t)\) and \(\cos(t)\). This proves that \[ \mathbf{f} \in \operatorname{Span}\Bigl\{ \sin(t), \cos(t) \Bigr\}. \] Since \(\mathbf{f} \in \mathcal{S}_1\) was arbitrary, this proves the inclusion \[ \mathcal{S}_1 \subseteq \operatorname{Span}\Bigl\{ \sin(t), \cos(t) \Bigr\}. \]
Next we prove the inclusion: \[ \operatorname{Span}\Bigl\{ \sin(t), \cos(t) \Bigr\} \subseteq \mathcal{S}_1. \]
Let \(\mathbf{f}(t)\) be an arbitrary element in \(\operatorname{Span}\Bigl\{ \sin(t), \cos(t) \Bigr\}\). Then there exist real numbers \(\alpha\) and \(\beta\) such that \[ \mathbf{f}(t) = \alpha \sin(t) + \beta \cos(t). \] If \(\alpha = 0\) and \(\beta = 0\), then we can take \(a = 0\) and \(b=0\) and we have \[ \mathbf{f}(t) = 0 \sin(t) + 0 \cos(t) = 0 \sin(t + 0). \] Therefore \(\mathbf{f} \in \mathcal{S}_1\) in this case.
Now we assume that \(\alpha \neq 0\) or \(\beta \neq 0\). Then \(\alpha^2 + \beta^2 \gt 0\).
At this point the proof uses the unit circle definition of sine and cosine which states: If \(x\) and \(y\) are real numbers such that \(x^2 + y^2 = 1\), then there exists a real number \(\theta\) such that \[ x = \cos(\theta), \quad y = \sin(\theta). \] See Unit circle definition of sine and cosine on Wikipedia.
We use the preceding definition of sine and cosine with \[ x = \frac{\alpha}{\sqrt{\alpha^2 + \beta^2}}, \quad y = \frac{\beta}{\sqrt{\alpha^2 + \beta^2}}. \] Then, \[ x^2 + y^2 = \left(\frac{\alpha}{\sqrt{\alpha^2 + \beta^2}}\right)^2 + \left(\frac{\beta}{\sqrt{\alpha^2 + \beta^2}}\right)^2 = \frac{\alpha^2}{\alpha^2 + \beta^2} + \frac{\beta^2}{\alpha^2 + \beta^2} = 1. \] Consequently, there exists \(\theta \in \mathbb{R}\) such that \[ \cos(\theta) = \frac{\alpha}{\sqrt{\alpha^2 + \beta^2}}, \quad \sin(\theta) = \frac{\beta}{\sqrt{\alpha^2 + \beta^2}}. \]
Using the preceding paragraph we have \begin{align*} \mathbf{f}(t) & = \alpha \sin(t) + \beta \cos(t) \\ & = \sqrt{\alpha^2 + \beta^2} \left( \frac{\alpha}{\sqrt{\alpha^2 + \beta^2}} \sin(t) + \frac{\beta}{\sqrt{\alpha^2 + \beta^2}} \cos(t) \right) \\ & = \sqrt{\alpha^2 + \beta^2} \Bigl( \cos(\theta) \sin(t) + \sin(\theta) \cos(t) \Bigr) \\ & = \sqrt{\alpha^2 + \beta^2} \ \sin(t+\theta). \end{align*} Setting \(a = \sqrt{\alpha^2 + \beta^2}\) and \(b = \theta\) we proved that \[ \mathbf{f}(t) = a \sin(t + b). \] Thus we proved that \(\mathbf{f} \in \mathcal{S}_1\) and this proves that \[ \operatorname{Span}\Bigl\{ \sin(t), \cos(t) \Bigr\} \subseteq \mathcal{S}_1. \]
To prove that \(\Bigl\{ \sin(t), \cos(t) \Bigr\}\) is a basis for \(\mathcal{S}_1\), we need to prove that \(\sin(t)\) and \(\cos(t)\) are linearly independent. For that we need to prove the implication: \[ \alpha \sin(t) + \beta \cos(t) = 0 \quad \text{for all} \quad t \in \mathbb{R} \] implies \(\alpha = 0\) and \(\beta = 0\).
To prove the last implication, assume \[ \alpha \sin(t) + \beta \cos(t) = 0 \quad \text{for all} \quad t \in \mathbb{R}. \] Setting \(t= 0\) we get \[ 0 = \alpha \sin(0) + \beta \cos(0) = \alpha \, 0 + \beta \, 1 = \beta, \] proving that \(\beta = 0\). Setting \(t = \pi/2\) we get \[ 0 = \alpha \sin(\pi/2) + \beta \cos(\pi/2) = \alpha \, 1 + \beta \, 0 = \alpha, \] proving that \(\alpha = 0\). Thus we proved that \(\alpha = 0\) and \(\beta = 0\). This proves that \(\sin(t)\) and \(\cos(t)\) are linearly independent. Therefore \(\Bigl\{ \sin(t), \cos(t) \Bigr\}\) is a basis for \(\mathcal{S}_1\). Thus, \[ \dim \mathcal{S}_1 = 2. \]
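A quick numerical sanity check of the identity \(\alpha \sin(t) + \beta \cos(t) = \sqrt{\alpha^2+\beta^2}\,\sin(t+\theta)\) used above (a NumPy sketch of my own; the coefficients are arbitrary samples):

```python
import numpy as np

alpha, beta = 3.0, -2.0                  # arbitrary sample coefficients
r = np.hypot(alpha, beta)                # sqrt(alpha^2 + beta^2)
theta = np.arctan2(beta, alpha)          # cos(theta) = alpha/r, sin(theta) = beta/r
t = np.linspace(0, 2 * np.pi, 9)
lhs = alpha * np.sin(t) + beta * np.cos(t)
rhs = r * np.sin(t + theta)
print(np.allclose(lhs, rhs))             # True
```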
Assume that $\alpha_0,$ $\alpha_1,$ $\alpha_2,$ and $\alpha_3$ are scalars in $\mathbb{R}$ such that
\begin{equation} \tag{G1}
\require{bbox}
\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_0\cdot 1 + \alpha_1 x + \alpha_2 x^2+ \alpha_3 x^3 =0 \quad \text{for all} \quad x \in \mathbb{R}}.
\end{equation}
The objective here is to prove
\[
\bbox[5px, #FF6666, border: 1pt solid red]{\alpha_0 = 0, \quad \alpha_1 = 0, \quad \alpha_2 =0, \quad \alpha_3 = 0}.
\]
Here is a Proof: Step 1. The green identity labeled (G1) holds for all real numbers $x$ in \(\mathbb{R}\). Therefore we can substitute \(x=0\) in the green identity (G1) and we get \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_0 = 0}\). Since we proved \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_0 = 0}\), we can substitute \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_0 = 0}\) in the green identity (G1) and we get
\begin{equation} \tag{G2}
\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_1 x + \alpha_2 x^2+ \alpha_3 x^3 =0 \quad \text{for all} \quad x \in \mathbb{R}}.
\end{equation}
Step 2. Take the derivative of both sides of the equality in the identity (G2) and we get
\begin{equation} \tag{G3}
\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_1 + 2 \alpha_2 x+ 3 \alpha_3 x^2 =0 \quad \text{for all} \quad x \in \mathbb{R}}.
\end{equation}
The green identity (G3) holds for all real numbers $x$ in \(\mathbb{R}\). Therefore we can substitute \(x=0\) in the green identity (G3) and we get \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_1 = 0}\). Since we proved \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_1 = 0}\), we can substitute \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_1 = 0}\) in the green identity (G3) and we get
\begin{equation} \tag{G4}
\bbox[5px, #88FF88, border: 1pt solid green]{2\alpha_2 x + 3 \alpha_3 x^2 =0 \quad \text{for all} \quad x \in \mathbb{R}}.
\end{equation}
Step 3. Take the derivative of both sides of the equality in the identity (G4) and we get
\begin{equation} \tag{G5}
\bbox[5px, #88FF88, border: 1pt solid green]{2 \alpha_2 + 6 \alpha_3 x = 0 \quad \text{for all} \quad x \in \mathbb{R}}.
\end{equation}
The green identity (G5) holds for all real numbers $x$ in \(\mathbb{R}\). Therefore we can substitute \(x=0\) in the green identity (G5) and we get \(\bbox[5px, #88FF88, border: 1pt solid green]{2\alpha_2 = 0}\). Multiplying both sides by \(1/2\) we get: \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_2 = 0}\). Since we proved \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_2 = 0}\), we can substitute \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_2 = 0}\) in the green identity (G5) and we get
\begin{equation} \tag{G6}
\bbox[5px, #88FF88, border: 1pt solid green]{6\alpha_3 x = 0 \quad \text{for all} \quad x \in \mathbb{R}}.
\end{equation}
Step 4. The green identity (G6) holds for all real numbers $x$ in \(\mathbb{R}\). Therefore we can substitute \(x=1\) in the green identity (G6) and we get \(\bbox[5px, #88FF88, border: 1pt solid green]{6\alpha_3 = 0}\). Multiplying both sides by \(1/6\) we get: \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_3 = 0}\).
Conclusion. Using repeated differentiation and substitution, we proved that the green identity (G1) implies
\[
\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_0 = 0, \quad \alpha_1 = 0, \quad \alpha_2 =0, \quad \alpha_3 = 0}.
\]
In other words, we have greenified the red objective of the proof. This completes the proof.
Assume that $\alpha_0,$ $\alpha_1,$ $\alpha_2,$ and $\alpha_3$ are scalars in $\mathbb{R}$ such that
\begin{equation} \tag{G1}
\require{bbox}
\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_0\cdot 1 + \alpha_1 x + \alpha_2 x^2+ \alpha_3 x^3 =0 \quad \text{for all} \quad x \in \mathbb{R}}.
\end{equation}
The objective here is to prove
\[
\bbox[5px, #FF6666, border: 1pt solid red]{\alpha_0 = 0, \quad \alpha_1 = 0, \quad \alpha_2 =0, \quad \alpha_3 = 0}.
\]
Here is a Proof: The green identity labeled (G1) holds for all real numbers $x$ in \(\mathbb{R}\). Therefore we can substitute the following four values for \(x\): \(x = 0, 1, -1, 2\) in the green identity (G1). Then we get the following four linear equations for the unknowns \(\alpha_0, \alpha_1, \alpha_2, \alpha_3\):
\[
\bbox[5px, #88FF88, border: 1pt solid green]{
\begin{array}{lr}
\alpha_0 & = 0 \\
\alpha_0 + \alpha_1 + \alpha_2 + \alpha_3 &=0 \\
\alpha_0 - \alpha_1 + \alpha_2 - \alpha_3 &=0 \\
\alpha_0 + 2\alpha_1 + 4\alpha_2 +8\alpha_3 &=0
\end{array}
}
\]
The last green box contains a homogeneous system of four linear equations with four unknowns. The first equation gives \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_0 = 0}\). Substituting \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_0 = 0}\) in the remaining three equations yields
\[
\bbox[5px, #88FF88, border: 1pt solid green]{
\begin{array}{lr}
\alpha_1 + \alpha_2 + \alpha_3 &=0 \\
- \alpha_1 + \alpha_2 - \alpha_3 &=0 \\
2\alpha_1 + 4\alpha_2 +8\alpha_3 &=0
\end{array}
}
\]
Replacing the second equation with the sum of the first two equations and replacing the third equation with the sum of the third equation and the first equation multiplied by \(-2\) yields the equivalent system
\[
\bbox[5px, #88FF88, border: 1pt solid green]{
\begin{array}{lr}
\alpha_1 + \alpha_2 + \alpha_3 & = 0 \\
\phantom{\alpha_1 +} 2 \alpha_2 & = 0 \\
\phantom{\alpha_1 +} 2\alpha_2 +6\alpha_3 &=0
\end{array}
}
\]
Multiplying the second equation by \(1/2\) yields \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_2 = 0}\). Substituting \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_2 = 0}\) in the first and third equations yields the equivalent system
\[
\bbox[5px, #88FF88, border: 1pt solid green]{
\begin{array}{lr}
\alpha_1 + \alpha_3 & = 0 \\
\phantom{\alpha_1 +\,} 6\alpha_3 & = 0
\end{array}
}
\]
Multiplying the second equation by \(1/6\) yields \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_3 = 0}\). Substituting \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_3 = 0}\) in the first equation we get \(\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_1 = 0}\).
In conclusion, we proved:
\[
\bbox[5px, #88FF88, border: 1pt solid green]{\alpha_0 = 0, \quad\alpha_1 = 0, \quad \alpha_2 =0, \quad \alpha_3 = 0}.
\]
In this way we have greenified the red statement. That is, we proved it.
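The linear system obtained by substituting \(x = 0, 1, -1, 2\) can also be handed to a computer algebra system. A short SymPy sketch (my own check, not from the post):

```python
import sympy as sp

a0, a1, a2, a3 = sp.symbols('alpha0 alpha1 alpha2 alpha3')
p = lambda x: a0 + a1 * x + a2 * x**2 + a3 * x**3
eqs = [p(v) for v in (0, 1, -1, 2)]       # substitute x = 0, 1, -1, 2
print(sp.solve(eqs, [a0, a1, a2, a3]))
# {alpha0: 0, alpha1: 0, alpha2: 0, alpha3: 0}
```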
Now that we know that $\mathcal{M}_3 = \bigl\{ 1, x, x^2, x^3 \bigr\}$ is a basis for $\mathbb{P}_3$, we can use the coordinate mapping relative to this basis.
The concept of a coordinate mapping is introduced in Section 4.4 on page 221. The most important fact is in Theorem 8 on page 221. This theorem states that a coordinate mapping is a bijective linear transformation. A proof of Theorem 8 is outlined in Problems 23-26 on page 225. Pay attention to the paragraph before Example 5. How to use a coordinate mapping is explained in Examples 5 and 6.
The standard basis for the vector space $\mathbb{P}_3$ of polynomials is the set of all monomials: \[ \mathcal{M}_3 =\bigl\{ 1, \ x, \ x^2, \ x^3 \bigr\}. \] The corresponding coordinate mapping is \[ \Bigl[a_0 + a_1 x + a_2 x^2 + a_3 x^3 \Bigr]_{\mathcal{M}_3} = \left[\!\begin{array}{c} a_0 \\ a_1 \\ a_2 \\ a_3 \end{array}\!\right] \in \mathbb{R}^4. \] For example, \[ \Bigl[(x-1)^3 \Bigr]_{\mathcal{M}_3} = \left[\!\begin{array}{r} -1 \\ 3 \\ -3 \\ 1 \end{array}\!\right] \in \mathbb{R}^4. \]
Above we expressed $\mathcal{Z}_1$ as a span of three polynomials. By Theorem 1 in Section 4.1 each span is a subspace. Therefore, this is an alternative proof that $\mathcal{Z}_1$ is a subspace.
Now we will prove that the polynomials $x-1, x^2-1, x^3 - 1$ are linearly independent. For that we will use the coordinate mapping relative to the standard basis \(\mathcal{M}_3 = \bigl\{1,x,x^2,x^3\bigr\}\). We have \[ \Bigl[ x- 1 \Bigr]_{\mathcal{M}_3} = \left[\!\begin{array}{r} -1 \\ 1 \\ 0 \\ 0 \end{array}\!\right], \quad \Bigl[ x^2 - 1 \Bigr]_{\mathcal{M}_3} = \left[\!\begin{array}{r} -1 \\ 0 \\ 1 \\ 0 \end{array}\!\right], \quad \Bigl[ x^3 - 1 \Bigr]_{\mathcal{M}_3} = \left[\!\begin{array}{r} -1 \\ 0 \\ 0 \\ 1 \end{array}\!\right]. \] The polynomials $x-1, x^2-1, x^3 - 1$ are linearly independent if and only if their coordinate vectors \(\Bigl[ x- 1 \Bigr]_{\mathcal{M}_3}\), \(\Bigl[ x^2- 1 \Bigr]_{\mathcal{M}_3}\), \(\Bigl[ x^3 - 1 \Bigr]_{\mathcal{M}_3}\) are linearly independent (see Example 6 in Section 4.4). Row reduction gives \[ \left[\!\begin{array}{rrr} -1 & -1 & -1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\!\right] \ \sim \ \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\-1 & -1 & -1 \end{array}\!\right] \ \sim \ \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{array}\!\right]. \] Since all the columns in the above RREF are pivot columns, the coordinate vectors \(\Bigl[ x- 1 \Bigr]_{\mathcal{M}_3}\), \(\Bigl[ x^2- 1 \Bigr]_{\mathcal{M}_3}\), \(\Bigl[ x^3 - 1 \Bigr]_{\mathcal{M}_3}\) are linearly independent. Consequently, the polynomials $x-1, x^2-1, x^3 - 1$ are linearly independent. Therefore, \[ \bigl\{ x-1, x^2-1, x^3 - 1 \bigr\} \] is a basis for \(\mathcal{Z}_1\). Hence \[ \dim \mathcal{Z}_1 = 3. \]
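The row reduction above can be verified with SymPy (a sketch of my own, not from the post):

```python
import sympy as sp

# Coordinate vectors of x-1, x^2-1, x^3-1 relative to the basis {1, x, x^2, x^3}
M = sp.Matrix([[-1, -1, -1],
               [ 1,  0,  0],
               [ 0,  1,  0],
               [ 0,  0,  1]])
rref, pivots = M.rref()
print(pivots)   # (0, 1, 2): every column is a pivot column, so the vectors are independent
```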
Definition. A nonempty set $\mathcal{V}$ is said to be a vector space over $\mathbb R$ if it satisfies the following ten axioms.
Explanation of the abbreviations: AE--addition exists, AA--addition is associative, AC--addition is commutative, AZ--addition has zero, AO--addition has opposites, SE-- scaling exists, SA--scaling is associative, SD--scaling distributes over addition of real numbers, SD--scaling distributes over addition of vectors, SO--scaling with one.
You:
Can you please write a complete LaTeX file with instructions on using basic mathematical operations, like fractions, sums, integrals, basic functions, like cosine, sine, and exponential function, and how to structure a document and similar features? Please explain the difference between the inline and displayed mathematical formulas. Please include examples of different ways of formatting displayed mathematical formulas. Please include what you think would be useful to a mathematics student. Also, can you please include your favorite somewhat complicated mathematical formula as an example of the power of LaTeX? I emphasize I want a complete file that I can copy into a LaTeX compiler and compile into a pdf file. Please ensure that your document contains the code for the formulas you are writing, which displays both as code separately from compiled formulas. Also, please double-check that your code compiles correctly. Remember that I am a beginner and cannot fix the errors. Please act as a concerned teacher would do.
This is the LaTeX document that ChatGPT produced based on the above prompt. Here is the compiled PDF document.
You can ask ChatGPT for specific LaTeX advice. To get a good response, think carefully about your prompt. Also, you can offer ChatGPT a sample of short mathematical writing from the web or a book as a PNG file, and it can convert that writing to LaTeX. You can even try with neat handwriting. The results will of course depend on the clarity of the file, and ChatGPT makes mistakes, but I found it incredibly useful.
Question. Let $f_n$, with $n \in \{0\}\cup\mathbb{N},$ be the sequence of Fibonacci numbers. Does there exist a continuous function $g:[0,+\infty)\to \mathbb{R}$ such that for every $n \in \{0\}\cup\mathbb{N}$ we have $g(n) = f_n$? Here it is expected that we give a reasonably simple formula for $g(x)$.
A closed form expression for the Fibonacci numbers. In the preceding items we used eigenvectors of the matrix $\left[\!\begin{array}{cc} 0 & 1 \\ 1 & 1 \end{array}\!\right]$ to deduce the following closed form expression for the Fibonacci numbers: \[ \text{for all} \quad n \in \mathbb{N} \qquad f_{n} = \frac{1}{\sqrt{5}}\Biggl( \biggl(\frac{1+\sqrt{5}}{2}\biggr)^n - \biggl(\frac{1-\sqrt{5}}{2}\biggr)^n \Biggr). \] The difficulty with the recursive formula for the Fibonacci numbers is that we have to calculate all the numbers preceding $f_n$ in order to calculate $f_n.$ The difficulty with the closed form expression for the Fibonacci numbers is that calculating accurate powers \[ \biggl(\frac{1+\sqrt{5}}{2}\biggr)^n \quad \text{and} \quad \biggl(\frac{1-\sqrt{5}}{2}\biggr)^n \] for large values for $n \in\mathbb{N}$, like $n=100$, is difficult.
It is important to mention that the irrational number \[ \varphi = \frac{1+\sqrt{5}}{2} \] is the famous number called the Golden Ratio.
We have that \[ \frac{1-\sqrt{5}}{2} = \frac{\bigl(1-\sqrt{5}\bigr)\bigl(1+\sqrt{5}\bigr)}{2\bigl(1+\sqrt{5}\bigr)} = \frac{1-5}{2\bigl(1+\sqrt{5}\bigr)} = - \frac{2}{1+\sqrt{5}} = - \frac{1}{\varphi} = - \varphi^{-1}. \] Therefore, the closed form expression for the Fibonacci numbers can be written as \[ \text{for all} \quad n \in \mathbb{N} \qquad f_{n} = \frac{1}{\sqrt{5}}\Bigl( \varphi^n - (-1)^n \varphi^{-n} \Bigr) \quad \text{where} \quad \varphi = \frac{1+\sqrt{5}}{2}. \]
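To compare the two formulas in practice, here is a small Python sketch (my own illustration): the recursion is computed with exact integers, while the closed form uses floating point arithmetic, which is where the accuracy issue for large \(n\) shows up.

```python
import math

def fib_iter(n):
    """Exact Fibonacci number via the recursion f_{n+1} = f_n + f_{n-1}."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

phi = (1 + math.sqrt(5)) / 2
def fib_closed(n):
    """Closed form (phi^n - (-1)^n phi^(-n)) / sqrt(5), rounded to an integer."""
    return round((phi**n - (-1)**n * phi**(-n)) / math.sqrt(5))

n = 30
print(fib_iter(n), fib_closed(n))   # both print 832040
```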
Theorem. Let $n \in \mathbb{N}$ and let $A$ be an $n\!\times\!n$ matrix. The matrix $A$ is diagonalizable if and only if there exists a basis of $\mathbb{R}^n$ which consists of eigenvectors of $A.$
Theorem. Let $n \in \mathbb{N}$ and let $A$ be an $n\!\times\!n$ matrix. The following two statements are equivalent:
(a) There exist an invertible $n\!\times\!n$ matrix $P$ and a diagonal $n\!\times\!n$ matrix $D$ such that $A= PDP^{-1}.$
(b) There exist linearly independent vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ in $\mathbb{R}^n$ and real numbers $\lambda_1, \lambda_2,\ldots,\lambda_n$ such that $A \mathbf{v}_k = \lambda_k \mathbf{v}_k$ for all $k\in \{1,\ldots,n\}.$
Place the cursor over the image to start the animation.
Let \(n \in \mathbb{N}\). Let \(i,j \in \mathbb{N}\) be such that \(i \lt j \leq n\). In this item I want to prove that the determinant of the elementary matrix obtained from the \(n\!\times\!n\) identity matrix by exchanging the positions of the \(i\)-th and \(j\)-th rows equals \(-1\).
Below is a "click-by-click" proof of the fact that the determinant of the elementary matrix obtained from the identity matrix by exchanging two rows (or, equivalently two columns) equals to $-1.$ There are nine steps in this proof. I describe each step below.All entries left blank in the determinant below are zeros.
Click on the image for a step by step proof.
This fact follows from the cofactor expansion calculation of a determinant, Theorem 1 in Section 3.1.
Today, I will share several important theorems and aim to explain each proof in full detail. I view mathematical proofs as consisting of three parts: the claim, the background knowledge needed, and the proof itself, which I usually break down into several steps.
As I mentioned earlier, each theorem is an implication that can be summarized as: "If \( p \), then \(q\)," where \( p \) represents the assumptions of the theorem and \(q\) represents the conclusion. To help clarify the structure of proofs, I often color-code the assumptions and relevant background facts in green, a color symbolizing approachability and friendliness, and I color the conclusion in red. This color scheme highlights the initial mystery in the connection between the assumptions (green) and the conclusion (red), emphasizing that the proofer’s task is to bridge this gap by constructing a logical path. In this language of colors, the proofer’s task is to create a green path of logical steps, starting with the assumptions, using the background knowledge as individual stepping stones, and ultimately reaching the red conclusion. In a way, the goal of the proof is to "greenify" the red, completing the logical connection.
Theorem 1. Let $n, p, q \in \mathbb{N}.$ Let $F$ be an $n\!\times\!p$ matrix and let $G$ be an $n\!\times\!q$ matrix. If \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{\operatorname{Col}(F) \subseteq \operatorname{Col}(G)} \quad \text{and} \quad \bbox[lightgreen, 6px, border:3px solid green]{\text{the columns of} \ \ F \ \ \text{are linearly independent}}, \end{equation*} then \begin{equation*} \bbox[#FFC0C0, 6px, border:3px solid red]{p \leq q}. \end{equation*}
Background Knowledge is as follows.
BK0. The definition of matrix-vector multiplication, the definition of matrix multiplication.
BK1. Let \(\mathbf{y} \in \mathbb{R}^n\). Then \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{\mathbf{y} \in \operatorname{Col}(G) \quad \text{if and only if} \quad \text{there exists} \ \ \mathbf{x} \in \mathbb{R}^q \ \ \text{such that} \ \ \mathbf{y} = G \mathbf{x}}. \end{equation*}
BK2. Let \(H\) be a \(q\times p\) matrix and let \(\mathbf{x} \in \mathbb{R}^p\). Then \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{ \begin{array}{c} \text{the columns of} \ \ H \ \ \text{are linearly independent} \quad \\ \quad\quad\quad\quad \quad \text{if and only if}\quad \quad H\mathbf{x} = \mathbf{0}_q \ \Rightarrow \ \mathbf{x} = \mathbf{0}_p. \end{array}} \end{equation*} Notice that the implication \(H\mathbf{x} = \mathbf{0}_q \ \Rightarrow \ \mathbf{x} = \mathbf{0}_p\) is stated in English as: The homogeneous equation \(H\mathbf{x} = \mathbf{0}_q\) has only the trivial solution.
BK3. Let \(H\) be a \(q\times p\) matrix. The following implication holds: \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{ \text{If the columns of} \ H \ \text{are linearly independent, then} \ \ p \leq q. } \end{equation*} In English, this implication can be stated as: If the columns of a matrix are linearly independent, then the number of rows is greater than or equal to the number of columns.
Proof. Let \(\mathbf{f}_1,\ldots, \mathbf{f}_p \in \mathbb{R}^n\) be the columns of \(F\) and let \(\mathbf{g}_1,\ldots, \mathbf{g}_q \in \mathbb{R}^n\) be the columns of \(G\). That is \[ F = \bigl[ \mathbf{f}_1 \ \cdots \ \mathbf{f}_p \bigr], \qquad G = \bigl[ \mathbf{g}_1 \ \cdots \ \mathbf{g}_q \bigr]. \]
Step 1. Since for every \(j\in\{1,\ldots,p\}\) we have \(\mathbf{f}_j \in \operatorname{Col}(F)\), the assumption \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{\operatorname{Col}(F) \subseteq \operatorname{Col}(G)} \end{equation*} implies that \(\mathbf{f}_j \in \operatorname{Col}(G)\). Since for every \(j\in\{1,\ldots,p\}\) we have \(\mathbf{f}_j \in \operatorname{Col}(G)\), by background knowledge BK1 we conclude that for every \(j\in\{1,\ldots,p\}\) there exists \(\mathbf{h}_j \in \mathbb{R}^q\) such that \(\mathbf{f}_j = G\mathbf{h}_j\). Set \[ H = \bigl[ \mathbf{h}_1 \ \cdots \ \mathbf{h}_p \bigr]. \] Then \(H\) is a \(q\times p\) matrix and by background knowledge BK0 we have \begin{align*} GH & = G\bigl[ \mathbf{h}_1 \ \cdots \ \mathbf{h}_p \bigr] \\ & = \bigl[ G\mathbf{h}_1 \ \cdots \ G \mathbf{h}_p \bigr] \\ & = \bigl[ \mathbf{f}_1 \ \cdots \ \mathbf{f}_p \bigr] \\ & = F. \end{align*} Thus, there exists a \(q\times p\) matrix \(H\) such that \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{F = GH}. \end{equation*}
Step 2. In this step we will prove that the columns of \(H\) are linearly independent. For that proof we use background knowledge BK2. We will prove that for \(\mathbf{x} \in \mathbb{R}^p\) the following implication holds: \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{H\mathbf{x} = \mathbf{0}_q} \quad \Rightarrow \quad \bbox[#FFC0C0, 6px, border:3px solid red]{\mathbf{x} = \mathbf{0}_p}. \end{equation*} Here is the proof of this implication. (Notice how I start from green and use only known green stuff to arrive at red.) Assume \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{H\mathbf{x} = \mathbf{0}_q}. \end{equation*} Apply matrix \(G\) to both sides of the preceding green equality to get \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{GH\mathbf{x} = G\mathbf{0}_q}. \end{equation*} By Step 1 we have \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{F = GH}. \end{equation*} and by BK0 we have \(G\mathbf{0}_q = \mathbf{0}_n\). Hence \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{F\mathbf{x} = \mathbf{0}_n}. \end{equation*} By assumption \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{\text{the columns of} \ \ F \ \ \text{are linearly independent}}. \end{equation*} By background knowledge BK2 we deduce that \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{F\mathbf{x} = \mathbf{0}_n \quad \Rightarrow \quad \mathbf{x} = \mathbf{0}_p}. \end{equation*} In conclusion, the implication \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{H\mathbf{x} = \mathbf{0}_q} \quad \Rightarrow \quad \bbox[#FFC0C0, 6px, border:3px solid red]{\mathbf{x} = \mathbf{0}_p} \end{equation*} is proved. By background knowledge BK2 the last implication yields that \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{\text{the columns of} \ \ H \ \ \text{are linearly independent}}. \end{equation*}
Step 3. By the final result of Step 2 \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{\text{the columns of} \ \ H \ \ \text{are linearly independent}} \end{equation*} and by background knowledge BK3 \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{ \text{If the columns of} \ H \ \text{are linearly independent, then} \ \ p \leq q. } \end{equation*} we conclude \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{p \leq q}. \end{equation*}
This completes the proof.
Theorem 1. If two matrices have the same column space and one of them has linearly independent columns, then the matrix with linearly independent columns has no more columns than the other matrix. In particular, if two matrices have the same column space and both have linearly independent columns, then the matrices have the same size.
Theorem 2. With \(n\in\mathbb{N}\), any two bases of a subspace of \(\mathbb{R}^n\) have the same number of elements.
Theorem 2. Let $n, m, k \in \mathbb{N},$ and let $\mathcal{H}$ be a subspace of $\mathbb{R}^n$ such that \(\mathcal{H} \neq \{\mathbf{0}_n\}\). If the following two assumptions are satisfied:
A1. Vectors $\mathbf{a}_1, \ldots, \mathbf{a}_k$ form a basis for $\mathcal{H}$, (notice that there are $k$ vectors in this basis),
A2. Vectors $\mathbf{b}_1, \ldots, \mathbf{b}_m$ form a basis for $\mathcal{H}$, (notice that there are $m$ vectors in this basis),
then the following claim is true:
C. $m=k$.
Background Knowledge is as follows.
BK0. The definition of a basis for a subspace, the definition of a column space.
BK1. Theorem 1.
Proof. Introduce an \(n\times k\) matrix \(A\) whose columns are $\mathbf{a}_1, \ldots, \mathbf{a}_k$ and an \(n\times m\) matrix \(B\) whose columns are $\mathbf{b}_1, \ldots, \mathbf{b}_m$. That is, \[ A = \bigl[ \mathbf{a}_1 \ \cdots \ \mathbf{a}_k \bigr], \qquad B = \bigl[ \mathbf{b}_1 \ \cdots \ \mathbf{b}_m \bigr]. \]
Step 1. By background knowledge BK0 we have that \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{\mathcal{H} = \operatorname{Col}(A) = \operatorname{Col}(B)}, \end{equation*} \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{\text{the columns of} \ \ A \ \ \text{are linearly independent}}, \end{equation*} and \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{\text{the columns of} \ \ B \ \ \text{are linearly independent}}. \end{equation*}
Step 2. In this step we apply BK1, that is Theorem 1 to the matrix \(A\) in the role of \(F\) and \(B\) in the role of \(G\). Since by Step 1 we have \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{\operatorname{Col}(A) = \operatorname{Col}(B)} \quad \text{and} \quad \bbox[lightgreen, 6px, border:3px solid green]{\text{the columns of} \ \ A \ \ \text{are linearly independent}}, \end{equation*} we deduce \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{k \leq m}. \end{equation*}
Step 3. In this step we apply BK1, that is Theorem 1 to the matrix \(B\) in the role of \(F\) and \(A\) in the role of \(G\). Since by Step 1 we have \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{\operatorname{Col}(B) = \operatorname{Col}(A)} \quad \text{and} \quad \bbox[lightgreen, 6px, border:3px solid green]{\text{the columns of} \ \ B \ \ \text{are linearly independent}}, \end{equation*} we deduce \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{m \leq k}. \end{equation*}
Step 4. In Step 2 and Step 3 we proved \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{k \leq m} \quad \text{and} \quad \bbox[lightgreen, 6px, border:3px solid green]{m \leq k}. \end{equation*} Consequently \begin{equation*} \bbox[lightgreen, 6px, border:3px solid green]{k = m}. \end{equation*}
This completes the proof.
The concept of column space of a matrix is introduced in Section 2.8. The concept of row space of a matrix is introduced in Section 4.6 in the subsection entitled The Row Space. You can learn how to find a basis of the row space in An Ode to Reduced Row Echelon Form.
The rank theorem is covered both in the subsection Dimension of a Subspace in Section 2.9 and in the subsection The Rank Theorem in Section 4.6. Read both.
Suggested problems related to $\operatorname{Col}(A)$ and $\operatorname{Nul}(A)$ of a given matrix $A$ from Section 4.5 are: 3, 6, 12, 13, 15, 18.
More problems about $\operatorname{Col}(A)$, $\operatorname{Row}(A)$ and $\operatorname{Nul}(A)$ of a given matrix $A$ are in Section 4.6: 3, 4, 5, 6, 7, 8, 9, 11, 13, 15, 17, 18, 27, 28, 29.
Suggested problems for Section 2.9: 1, 2, 3, 5, 6, 7, 9, 11, 12, 14, 15, 19, 20, 21, 23, 24, and in particular 27 and 28.
Today we started Section 2.8 Subspaces of $\mathbb{R}^n.$ Suggested problems for Section 2.8: 5, 6, 8, 9, 10, 11-20, 24, 25, 26, 30, 31-36.
Pay attention to the definitions of
Let me summarize the concepts of column space and row space.
Let $m, n \in \mathbb{N}.$ Let $A$ be an $n\!\times\!m$ matrix. The matrix $A$ has $n$ rows and $m$ columns. The columns of $A$ are vectors in $\mathbb{R}^n.$ The transpose of $A$, denoted by \(A^\top\), is an \(m\!\times\!n\) matrix. The matrix \(A^\top\) has $m$ rows and $n$ columns. The columns of \(A^\top\) are vectors in \(\mathbb{R}^m\). The columns of $A^\top$ have the identical entries as the rows of \(A\), just instead of being in rows they are written as columns. Two fundamental subspaces associated with the matrix $A$ are \begin{alignat*}{2} \operatorname{Col}(A) & = \operatorname{Row}(A^\top) \quad & & \left\{ \begin{array}{l} \text{this space is the span of the columns of} \ \ A \\ \text{this space is a subspace of} \ \ \mathbb{R}^n, \\ \text{since each column of $A$ is a vector in} \ \mathbb{R}^n \end{array} \right. \\ \operatorname{Row}(A) &= \operatorname{Col}(A^\top) \quad & & \left\{ \begin{array}{l} \text{this space is the span of the columns of} \ A^\top \\ \text{this space is a subspace of} \ \ \mathbb{R}^m, \\ \text{since each column of $A^\top$ is a vector in} \ \mathbb{R}^m \end{array} \right. \end{alignat*}
Theorem 13 (page 152). Let $m$ and $n$ be positive integers, and let $A$ be an $n\!\times\!m$ matrix. The column space of the matrix \(A\), denoted by \(\operatorname{Col}(A)\), is a subspace of \(\mathbb{R}^n\). The pivot columns of the matrix \(A\) form a basis for the column space of \(A\).
Theorem 13 (page 233). Let $m$ and $n$ be positive integers, and let $A$ be an $n\!\times\!m$ matrix. The row space of the matrix \(A\), denoted by \(\operatorname{Row}(A)\), is a subspace of \(\mathbb{R}^m\). The row space of the matrix \(A\) and the row space of its Reduced Row Echelon Form are the same. The nonzero rows of the Reduced Row Echelon Form of \(A\) form a basis for the row space of \(A\).
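Both theorems are easy to illustrate with SymPy (a sketch of my own; the matrix below is a hypothetical example, not one from the textbook):

```python
import sympy as sp

A = sp.Matrix([[1, 2, 0, 1],
               [2, 4, 1, 1],
               [3, 6, 1, 2]])      # a 3x4 example matrix of rank 2
print(A.columnspace())   # a basis of Col(A): the pivot columns of A (vectors in R^3)
print(A.rowspace())      # a basis of Row(A): rows taken from an echelon form (vectors in R^4)
print(A.rref()[0])       # the RREF, for comparison with Theorem 13 on page 233
```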
I posted a proof of this implication on Saturday. I post it again because of its importance and since I think that the proof here is better than the proof given in the textbook; see Theorem 7.
The proof presented below uses only two facts: elementary matrices are invertible and matrix multiplication is associative. I also like that the proof below proves the invertibility by using the definition of invertibility. The proof below was suggested to me by a student during Fall Quarter 2021. Unfortunately I forgot who it was. In any case, please think on your own about each proof. You can come up with new proofs.
In the proof below we need to prove that a matrix is invertible. For that it is useful to recall the definition of invertibility.
Step | the row operation | the elementary matrix | the inverse of the elementary matrix |
---|---|---|---|
1st | The second row is replaced by the sum of the second row and the third row multiplied by (-1) | $E_1 = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{array}\right]$ | $E_1^{-1} = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{array}\right]$ |
2nd | The second row scaled (multiplied) by (-1) | $E_2 = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{array}\right]$ | $E_2^{-1} = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{array}\right]$ |
3rd | The third row is replaced by the sum of the third row and the first row multiplied by $(-2)$ | $E_3 = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{array}\right]$ | $E_3^{-1} = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{array}\right]$ |
4th | The first row is replaced by the sum of the first row and the second row multiplied by (-1) | $E_4 = \left[\!\begin{array}{rrr} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right]$ | $E_4^{-1} = \left[\!\begin{array}{rrr} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right]$ |
5th | The first row and the second row are interchanged | $E_5 = \left[\!\begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array}\right]$ | $E_5^{-1} = \left[\!\begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array}\right]$ |
In the preceding items we established the equality: \[ A = E_1^{-1} E_2^{-1} E_3^{-1} E_4^{-1} E_5^{-1}. \] This is interesting: we represented \(A\) as a product of elementary matrices.
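This factorization is easy to cross-check numerically. Here is a NumPy sketch (my own verification, not part of the original post):

```python
import numpy as np

# The inverses of the elementary matrices from the table above
E1i = np.array([[1, 0, 0], [0, 1, 1], [0, 0, 1]])
E2i = np.array([[1, 0, 0], [0, -1, 0], [0, 0, 1]])
E3i = np.array([[1, 0, 0], [0, 1, 0], [2, 0, 1]])
E4i = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 1]])
E5i = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])
print(E1i @ E2i @ E3i @ E4i @ E5i)
# [[1 1 0]
#  [1 2 1]
#  [2 2 1]]   -- the matrix A
```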
Let us verify this claim: \begin{align*} E_5^{-1} & = \left[\!\begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array}\right], \\ E_4^{-1} E_5^{-1} & = \left[\!\begin{array}{rrr} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right] \left[\!\begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array}\right] = \left[\!\begin{array}{rrr} 1 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array}\right], \\ E_3^{-1} \bigl(E_4^{-1} E_5^{-1}\bigr) & =\left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{array}\right] \left[\!\begin{array}{rrr} 1 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array}\right] = \left[\begin{array}{rrr} 1 & 1 & 0 \\ 1 & 0 & 0 \\ 2 & 2 & 1 \end{array}\right], \\ E_2^{-1} \bigl(E_3^{-1} E_4^{-1} E_5^{-1}\bigr) & = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{array}\right] \left[\begin{array}{rrr} 1 & 1 & 0 \\ 1 & 0 & 0 \\ 2 & 2 & 1 \end{array}\right] = \left[\begin{array}{rrr} 1 & 1 & 0 \\ -1 & 0 & 0 \\ 2 & 2 & 1 \end{array}\right], \\ E_1^{-1} \bigl(E_2^{-1} E_3^{-1} E_4^{-1} E_5^{-1}\bigr) & = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{array}\right] \left[\begin{array}{rrr} 1 & 1 & 0 \\ -1 & 0 & 0 \\ 2 & 2 & 1 \end{array}\right] = \left[\begin{array}{rrr} 1 & 1 & 0 \\ 1 & 2 & 1 \\ 2 & 2 & 1 \end{array}\right]. \end{align*} Thus, we confirmed that \[ A = \left[\begin{array}{rrr} 1 & 1 & 0 \\ 1 & 2 & 1 \\ 2 & 2 & 1 \end{array}\right] = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{array}\right] \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{array}\right] \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{array}\right] \left[\!\begin{array}{rrr} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right] \left[\!\begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array}\right]. \]
After reading this post you should be able to solve a problem stated as follows:
Consider the matrix $M = \left[\begin{array}{rrr} 3 & 3 & 2 \\ 3 & 2 & 1 \\ 2 & 1 & 0 \end{array}\right]$.
This implication is proved in Theorem 7 in the textbook, but I prefer to give another proof which uses only the facts that elementary matrices are invertible and that matrix multiplication is associative. I also like that the proof below proves the invertibility by using the definition of invertibility. The proof below was suggested to me by a student during Fall Quarter 2021. Unfortunately I forgot who it was. In any case, please think on your own about each proof. You can come up with new proofs.
In the proof below we need to prove that a matrix is invertible. For that it is useful to recall the definition of invertibility.
Step | the row operation | the elementary matrix | the inverse of the elementary matrix |
---|---|---|---|
1st | The third row is replaced by the sum of the first row and the third row | $E_1 = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{array}\right]$ | $E_1^{-1} = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{array}\right]$ |
2nd | The third row and the second row are interchanged | $E_2 = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{array}\right]$ | $E_2^{-1} = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{array}\right]$ |
3rd | The third row is replaced by the sum of the third row and the second row multiplied by $(-2)$ | $E_3 = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{array}\right]$ | $E_3^{-1} = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{array}\right]$ |
4th | The first row is replaced by the sum of the first row and the third row | $E_4 = \left[\!\begin{array}{rrr} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right]$ | $E_4^{-1} = \left[\!\begin{array}{rrr} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right]$ |
If the Reduced Row Echelon Form of $A$ is $I_n$, then $A$ is invertible.
This implication is proved in Theorem 7 in Section 2.2. If the RREF of $A$ is $I_3$, then $A$ is invertible.
This implication is proved in Theorem 7 in Section 2.2. This proof is important! Please pay special attention to Problems 27 and 28. The pattern that you are asked to explore in these problems is one we encountered in explorations related to Matrix Multiplication. But here that pattern is put in the context of vectors from the same space.
Here is my attempt to improve Figures 2 and 3 in Section 2.1: Matrix Multiplication (page 96).
At the bottom of the snippet that I present below is a proof that for all vectors \(\mathbf{x} \in \mathbb{R}^n\) we have \((AB)\mathbf{x} = A(B\mathbf{x})\). In the proof, I use the definition of the matrix product \(AB\), the definition of matrix-vector multiplication, the linearity property of matrix-vector multiplication, and, again, the definition of matrix-vector multiplication.
This proof is given in the book below Figures 2 and 3 on pages 96 and 97. I present it here in the hope that you will enjoy it and understand it better in color.
In the following examples, I demonstrate the geometric interpretation of various matrices. The textbook provides additional illustrations for a broader range of matrices.
In the examples below, the happy face is drawn using the heads of the following vectors. For the face I used the circle centered at the head of the vector \(\left[\! \begin{array}{c} 1 \\ 1 \end{array} \!\right]\) and with radius \(4/5\). That is, the set of the following vectors \[ \Biggl\{ \left[\! \begin{array}{c} 1 \\ 1 \end{array} \!\right] + \frac{4}{5} \left[\! \begin{array}{c} \cos t \\ \sin t \end{array} \!\right] \, : \, t \in [0, 2 \pi) \Biggr\}. \] The navy eyes are at the heads of the following two vectors \[ \frac{1}{5} \left[\! \begin{array}{c} 4 \\ 7 \end{array} \!\right], \quad \frac{1}{5} \left[\! \begin{array}{c} 6 \\ 7 \end{array} \!\right]. \] For the red smile, I used the set of the following vectors \[ \Biggl\{ \left[\! \begin{array}{c} 1 \\ 1/2 \end{array} \!\right] + \frac{1}{2} \left[\! \begin{array}{c} 2 t \\ 3 t^2 \end{array} \!\right] \, : \, t \in \left[-\frac{1}{3},\frac{1}{3}\right] \Biggr\}. \]
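If you would like to reproduce such pictures yourself, here is a rough matplotlib sketch (my own, not the code used for the post; the shear matrix \(A\) and the plot colors are hypothetical choices made just to show the effect of a linear transformation on the face):

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 2 * np.pi, 200)
face = np.array([1 + 0.8 * np.cos(t), 1 + 0.8 * np.sin(t)])   # circle of radius 4/5 centered at (1,1)
eyes = np.array([[4/5, 6/5], [7/5, 7/5]])                      # heads of (1/5)(4,7) and (1/5)(6,7)
s = np.linspace(-1/3, 1/3, 100)
smile = np.array([1 + s, 0.5 + 1.5 * s**2])                    # the parabolic arc described above

A = np.array([[1.0, 0.5], [0.0, 1.0]])                         # a hypothetical shear matrix
for pts, style in [(face, "k-"), (eyes, "o"), (smile, "r-")]:
    plt.plot(pts[0], pts[1], style)
    img = A @ pts                                              # image of each set of vectors under A
    plt.plot(img[0], img[1], style, alpha=0.4)
plt.axis("equal")
plt.show()
```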
Problems 41, 42, 43, 44 in Section 1.7: Linear independence are very interesting and important. The matrices in these problems are not easy to row reduce by hand, so the textbook recommends that we use a calculator. Below I calculated RREFs for the matrices given in Problems 41 and 42. Based on these RREFs you should be able to answer Problems 41, 42, 43, 44.
Problem 41 \[ \left[ \begin{array}{rrrrrr} 8 & -3 & 0 & -7 & 2 \\ -9 & 4 & 5 & 11 & -7 \\ 6 & -2 & 2 & -4 & 4 \\ 5 & -1 & 7 & 0 & 10 \\ \end{array} \right] \sim \quad \cdots \quad \sim \left[ \begin{array}{ccccc} 1 & 0 & 3 & 1 & 0 \\ 0 & 1 & 8 & 5 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ \end{array} \right] \]
Problem 42 \[ \left[ \begin{array}{rrrrrr} 12 & 10 & -6 & -3 & 7 & 10 \\ -7 & -6 & 4 & 7 & -9 & 5 \\ 9 & 9 & -9 & -5 & 5 & -1 \\ -4 & -3 & 1 & 6 & -8 & 9 \\ 8 & 7 & -5 & -9 & 11 & -8 \\ \end{array} \right] \sim \quad \cdots \quad \sim \left[ \begin{array}{rrrrrr} 1 & 0 & 2 & 0 & 2 & 0 \\ 0 & 1 & -3 & 0 & -2 & 0 \\ 0 & 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ \end{array} \right] \]
I did not try it, but I believe that Mathematica would solve the corresponding system of \(600\) equations with \(600\) unknowns relatively quickly, and that would give a good approximation of the heat distribution in this plate. The advantage of this system is that it has a lot of zeros. The matrix of this system is of the size \(600\times 600\). Thus it has \(360,000\) entries. Based on the patterns that we observed for small matrices above, I estimate that this \(600\times 600\) matrix has fewer than \(3,000\) nonzero entries. That is, fewer than 1 in 120 entries is nonzero; that is, less than \(0.84\%\).
Matrices with a small percentage of nonzero entries are called sparse matrices. Mathematicians have developed super efficient methods of doing calculations with sparse matrices of huge size.
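As a taste of this, here is a small SciPy sketch (my own illustration; the matrix is a simplified one-dimensional stand-in with the same flavor as the plate matrix, not the actual \(600\times 600\) system discussed above):

```python
import numpy as np
import scipy.sparse as sps
import scipy.sparse.linalg as spla

n = 600
main = 4.0 * np.ones(n)
off = -1.0 * np.ones(n - 1)
A = sps.diags([off, main, off], offsets=[-1, 0, 1], format="csr")  # sparse tridiagonal matrix
b = np.ones(n)
x = spla.spsolve(A, b)                            # direct solve that exploits the sparsity
print(A.nnz, "nonzero entries out of", n * n)     # 1798 out of 360000
```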
My favorite application of vectors: COLORS. In fact, I love this application so much that I wrote a webpage to celebrate it: Color Cube.
One exercise in this context would be to ask you to find three colors between teal and yellow: one in the middle between teal and yellow, one in the middle between teal and that mid-color, and one in the middle between the mid-color and yellow.
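Since colors are vectors, "in the middle" just means averaging. A minimal NumPy sketch (my own illustration):

```python
import numpy as np

teal = np.array([0.0, 0.5, 0.5])     # rgb fractions for teal
yellow = np.array([1.0, 1.0, 0.0])   # rgb fractions for yellow
mid = (teal + yellow) / 2            # the color half-way between teal and yellow
first = (teal + mid) / 2             # half-way between teal and the mid-color
third = (mid + yellow) / 2           # half-way between the mid-color and yellow
print(first, mid, third)
```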
Section 1.5 talks about writing solution sets of linear systems in parametric vector form.
We explained the relationship between the solution set of the homogeneous equation \[ \color{green}{A}\color{red}{\mathbf{x}} = \mathbf{0} \] and the solution set of a consistent nonhomogeneous equation \[ \color{green}{A}\color{red}{\mathbf{x}} = \color{green}{\mathbf{b}}. \] This is explained in Theorem 6 in the book. Please recognize how this theorem is reflected when the solution of $A \mathbf x = \mathbf b$ is written in parametric vector form.
Suggested problems for Section 1.5: 1, 3, 5, 6, 7, 9, 11, 12, 13-16, 19, 21, 23, 24, 26, 29, 32, 35, 37-40. When you write a formula for the solution of a nonhomogeneous equation in parametric form, try to recognize a particular solution $\color{purple}{\mathbf{p}}$ of the nonhomogeneous equation and a span of one, or two, or three vectors which is the solution of the corresponding homogeneous equation.
At the beginning of class today you asked me about my favorite color. The answer is TEAL. I love using colors on my websites. On my websites, or in Mathematica, to choose colors I always use RGB (red-green-blue) color encoding. That is an encoding which represents each color as a triple of numbers. In HTML the triple for TEAL is written as #008080. Here HTML uses three two-digit numbers in the hexadecimal number system. The hexadecimal digits are 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F. HTML encodes colors with triples of integers between 0, which is written as 00, and 255, which in the hexadecimal number system is FF. In the hexadecimal number system, the number 80 stands for 128, half-way between 0 and 255.
It is hard to see a vector in #008080. However, in CSS, the triple for TEAL can be written as rgb(0%,50%,50%). This is getting closer to a vector when you view percentages as their fractional values. So, the vector (0%,50%,50%) is (0,1/2,1/2). In this way, each color can be identified by a vector whose components are real numbers between 0 and 1.
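Here is a tiny Python sketch (my own) that converts an HTML hex color such as #008080 into the corresponding vector with components between 0 and 1:

```python
def hex_to_vector(hex_color):
    """Convert '#RRGGBB' to a triple of fractions in [0, 1]."""
    h = hex_color.lstrip('#')
    return tuple(int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))

print(hex_to_vector("#008080"))   # (0.0, 0.50196..., 0.50196...) -- teal
```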
So I started the class by talking about the colors in relation to linear algebra. I love the application of vectors to COLORS so much that I wrote a webpage to celebrate it: Color Cube.
Did you work on Problem 32 in Section 1.3? I find this problem interesting. I like it so much that I decided to rewrite it and give more information in the associated pictures. I do it in the next item.
An interesting feature of the Problem in the next item is that you do not need to know the specific coordinates of the vectors in the picture to answer the questions. You only need to record the linear relationships among vectors that are clear from the given grids: the vectors $\color{green}{\mathbf{a}_3}$ and $\color{#00FF00}{\mathbf{b}}$ are linear combinations of $\color{green}{\mathbf{a}_1}$ and $\color{green}{\mathbf{a}_2}.$ At this point you can solve items (i) and (iii) in the Problem in the next item. After you have solved item (iii), you can reconstruct (ii). But you will have more information to solve (ii) as we learn more about RREF. An important note: The Problem in the next item will be on the final assignment. So solving it early, and asking for clarifications if something is not clear, assures success.
The content of Section 1.5 is very useful for the Problem below.
System 1 | System 2 | System 3 |
---|---|---|
\begin{alignat*}{8} &x_1 & & - 4 &x_2 & &=&& 2\\ -3 &x_1 & & + &x_2 & & =&& 1 \\ &x_1 & & + 2 &x_2 & & =&& -4 \end{alignat*} | \begin{alignat*}{8} &x_1 & & - 4 &x_2 & &=&& 6\\ -3 &x_1 & & + &x_2 & & =&&-7 \\ &x_1 & & + 2 &x_2 & & =&& 0 \end{alignat*} | \begin{alignat*}{8} &x_1 & & - 4 &x_2 & &=&& -7\\ -3 &x_1 & & + &x_2 & & =&& -1 \\ &x_1 & & + 2 &x_2 & & =&& 5 \end{alignat*} |
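All three systems share the same coefficient matrix and differ only in the right-hand side, so they can be row reduced together. Here is a SymPy sketch of my own for checking consistency of each system:

```python
import sympy as sp

A = sp.Matrix([[1, -4], [-3, 1], [1, 2]])            # the shared coefficient matrix
for b in ([2, 1, -4], [6, -7, 0], [-7, -1, 5]):
    aug = A.row_join(sp.Matrix(b))                   # the augmented matrix [A | b]
    rref, pivots = aug.rref()
    # the system is consistent exactly when the augmented column (index 2) is not a pivot column
    print(b, "consistent" if 2 not in pivots else "inconsistent")
```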
Above, we focused primarily on practical calculations. However, the key takeaway from today's presentation is that three concepts—a system of linear equations, a linear vector equation, and a matrix equation—are mathematically equivalent.
In the following items, I will provide all the details behind the reasoning for this claim.
It is important to note that the above augmented matrix \eqref{eq:AM} is a completely green matrix with $m$ rows and $n+1$ columns. The last column is the augmented column.
Given System | Systems 2 and 3 | "Row Reduced" System |
---|---|---|
\begin{alignat*}{8} &x_1 & & + &&x_2 & & - &&2 &&x_3 &&= -&&5\\ 2 &x_1 & & - &&x_2 & & + && &&x_3 &&= &&8 \\ 3 &x_1 & & && & & - && && x_3 &&= &&3 \end{alignat*} | \begin{alignat*}{8} &x_1 & & + &&x_2 & & - &&\phantom{5/} 2&&x_3 &&= -&&5\\ & & & &&x_2 & & - &&5/3 &&x_3 &&= -&&6 \\ & & & && & & && && && && \end{alignat*} | \begin{alignat*}{8} &x_1 & & && & & - &&1/3 &&x_3 &&= &&1\\ & & & \phantom{+} &&x_2 & & - &&5/3 &&x_3 &&= -&&6 \\ & & & && & & && && && && \end{alignat*} |
\begin{alignat*}{8} &x_1 & & + &&x_2 & & - &&2 &&x_3 &&= &&5\\ 2 &x_1 & & - &&x_2 & & + && &&x_3 &&= &&8 \\ 3 &x_1 & & && & & - && && x_3 &&= &&3 \end{alignat*} | \begin{alignat*}{8} &x_1 & & && & & - 1/3 && &&x_3 &&= &&0\\ & & & &&x_2 & & - 5/3 && &&x_3 &&= &&0 \\ 0 &x_1 & & + 0 & & x_2 & & + \ 0 && && x_3 &&= &&1 \end{alignat*} |