The most important tool when working with finite-dimensional abstract vector spaces is the concept of a coordinate mapping introduced in Section 4.4 on page 221. Theorem 8 on page 221 and Problems 23-26 on page 225 provide theoretical background on how a coordinate mapping works. How to use a coordinate mapping is explained in Examples 5 and 6.
To use a coordinate mapping on a vector space, you need to know a basis for that vector space.
The standard basis for the vector space $\mathbb{P}_3$ of polynomials is the set of all monomials: \[ \mathcal{M} =\bigl\{ 1, \ x, \ x^2, \ x^3 \bigr\}. \] The corresponding coordinate mapping is \[ \bigl[a_0 + a_1 x + a_2 x^2 + a_3 x^3 \bigr]_{\mathcal{M}} = \left[\!\begin{array}{c} a_0 \\ a_1 \\ a_2 \\ a_3 \end{array}\!\right] \in \mathbb{R}^4. \]
The standard basis for the vector space $\mathbb{R}^{2\times 2}$ of $2\!\times\!2$ matrices is the set of matrices: \[ \mathcal{S} = \left\{ \left[\!\begin{array}{cc} 1 & 0 \\ 0 & 0 \end{array}\!\right], \left[\!\begin{array}{cc} 0 & 1 \\ 0 & 0 \end{array}\!\right], \left[\!\begin{array}{cc} 0 & 0 \\ 1 & 0 \end{array}\!\right], \left[\!\begin{array}{cc} 0 & 0 \\ 0 & 1 \end{array}\!\right] \right\}. \] The corresponding coordinate mapping is \[ \Biggl[ \left[\!\begin{array}{cc} a & b \\ c & d \end{array}\!\right] \Biggr]_{\mathcal{S}} = \left[\!\begin{array}{c} a \\ b \\ c \\ d \end{array}\!\right] \in \mathbb{R}^4. \]
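Both coordinate mappings above do nothing more than list the coefficients of a vector relative to the chosen basis. Here is a minimal Python sketch of this idea (the code and the function names are my own illustration, not from the textbook):

```python
import numpy as np

# Coordinates of a0 + a1*x + a2*x^2 + a3*x^3 relative to the monomial
# basis M = {1, x, x^2, x^3}: just list the coefficients.
def poly_coords(a0, a1, a2, a3):
    return np.array([a0, a1, a2, a3])

# Coordinates of a 2x2 matrix relative to the standard basis S:
# read the entries row by row.
def matrix_coords(A):
    return np.asarray(A).reshape(4)

print(poly_coords(5, 0, -2, 7))         # [ 5  0 -2  7]
print(matrix_coords([[1, 2], [3, 4]]))  # [1 2 3 4]
```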
Assume that $\alpha_1,$ $\alpha_2,$ and $\alpha_3$ are scalars in $\mathbb{R}$ such that \[ \require{bbox} \bbox[5px, #88FF88, border: 1pt solid green]{\alpha_1\cdot 1 + \alpha_2 x + \alpha_3 x^2 =0 \quad \text{for all} \quad x \in \mathbb{R}}. \] The objective here is to prove \[ \bbox[5px, #FF4444, border: 1pt solid red]{\alpha_1 = 0, \quad \alpha_2 =0, \quad \alpha_3 = 0}. \] Consider the left-hand side of the above green identity as a function of $x$ and take the derivative with respect to $x$. We obtain \[ \bbox[5px, #88FF88, border: 1pt solid green]{\alpha_2 + 2 \alpha_3 x =0 \quad \text{for all} \quad x \in \mathbb{R}}. \] Again, consider the left-hand side of the above green identity as a function of $x$ and take the derivative with respect to $x$. We obtain \[ \bbox[5px, #88FF88, border: 1pt solid green]{2 \alpha_3 =0 \quad \text{for all} \quad x \in \mathbb{R}}. \] Substituting $x=0$ in the first two green identities and dividing the third green equality by $2$, we obtain \[ \bbox[5px, #88FF88, border: 1pt solid green]{\alpha_1 = 0, \quad \alpha_2 =0, \quad \alpha_3 = 0}. \] In this way we have greenified the red statement. That is, we proved it.
Here is another proof of the same fact. Assume that $\alpha_1,$ $\alpha_2,$ and $\alpha_3$ are scalars in $\mathbb{R}$ such that \[ \require{bbox} \bbox[5px, #88FF88, border: 1pt solid green]{\alpha_1\cdot 1 + \alpha_2 x + \alpha_3 x^2 =0 \quad \text{for all} \quad x \in \mathbb{R}}. \] The objective here is to prove \[ \bbox[5px, #FF4444, border: 1pt solid red]{\alpha_1 = 0, \quad \alpha_2 =0, \quad \alpha_3 = 0}. \] The above green identity holds for all $x\in\mathbb{R}.$ In particular, it holds for the specific values $x=-1,$ $x=0,$ and $x=1.$ That is, we have \[ \bbox[5px, #88FF88, border: 1pt solid green]{ \begin{array}{lr} \alpha_1 - \alpha_2 +\alpha_3 &=0 \\ \alpha_1 &=0 \\ \alpha_1 + \alpha_2 +\alpha_3 &=0 \\ \end{array} } \] The last green box contains a homogeneous system of linear equations which can be written in matrix form as \[ \bbox[5px, #88FF88, border: 1pt solid green]{ \left[\!\begin{array}{rrr} 1 & -1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \end{array}\!\right] \left[\!\begin{array}{c} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{array}\!\right] = \left[\!\begin{array}{c} 0 \\ 0 \\ 0 \end{array}\!\right] } \] Since the determinant of the above $3\!\times\!3$ matrix is $2$, the above homogeneous equation has only the trivial solution. That is, \[ \bbox[5px, #88FF88, border: 1pt solid green]{\alpha_1 = 0, \quad \alpha_2 =0, \quad \alpha_3 = 0}. \] In this way we have greenified the red statement. That is, we proved it.
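The determinant computation in the second proof is easy to confirm with a computer algebra system. A quick sympy check (my addition):

```python
import sympy as sp

# The matrix obtained by evaluating 1, x, x^2 at x = -1, 0, 1.
V = sp.Matrix([[1, -1, 1],
               [1,  0, 0],
               [1,  1, 1]])
print(V.det())        # 2, nonzero
print(V.nullspace())  # [], so the homogeneous system has only the
                      # trivial solution alpha_1 = alpha_2 = alpha_3 = 0
```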
Definition. A nonempty set $\mathcal{V}$ is said to be a vector space over $\mathbb R$ if it satisfies the following ten axioms.
Explanation of the abbreviations: AE--addition exists, AA--addition is associative, AC--addition is commutative, AZ--addition has zero, AO--addition has opposites, SE--scaling exists, SA--scaling is associative, SD--scaling distributes over addition of real numbers, SD--scaling distributes over addition of vectors, SO--scaling with one.
Theorem. Let $n \in \mathbb{N}$ and let $A$ be an $n\!\times\!n$ matrix. The matrix $A$ is diagonalizable if and only if there exists a basis of $\mathbb{R}^n$ which consists of eigenvectors of $A.$
Theorem. Let $n \in \mathbb{N}$ and let $A$ be an $n\!\times\!n$ matrix. The following two statements are equivalent:
(a) There exist an invertible $n\!\times\!n$ matrix $P$ and a diagonal $n\!\times\!n$ matrix $D$ such that $A= PDP^{-1}.$
(b) There exist linearly independent vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ in $\mathbb{R}^n$ and real numbers $\lambda_1, \lambda_2,\ldots,\lambda_n$ such that $A \mathbf{v}_k = \lambda_k \mathbf{v}_k$ for all $k\in \{1,\ldots,n\}.$
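A numerical illustration of this equivalence (my addition; the sample matrix is the Fibonacci matrix used below): numpy's `eig` returns eigenvalues and a matrix $P$ whose columns are eigenvectors, and when those columns form a basis of $\mathbb{R}^n$ we can confirm $A = PDP^{-1}$.

```python
import numpy as np

# The Fibonacci matrix from the next item; its eigenvalues are
# (1 + sqrt(5))/2 and (1 - sqrt(5))/2.
A = np.array([[0.0, 1.0],
              [1.0, 1.0]])
eigenvalues, P = np.linalg.eig(A)   # columns of P are eigenvectors
D = np.diag(eigenvalues)
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True: A = P D P^{-1}
```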
A closed form expression for the Fibonacci numbers. In the preceding items we used eigenvectors of the matrix $\left[\!\begin{array}{cc} 0 & 1 \\ 1 & 1 \end{array}\!\right]$ to deduce the following closed form expression for the Fibonacci numbers: \[ \text{for all} \quad n \in \mathbb{N} \qquad f_{n} = \frac{1}{\sqrt{5}}\Biggl( \biggl(\frac{1+\sqrt{5}}{2}\biggr)^n - \biggl(\frac{1-\sqrt{5}}{2}\biggr)^n \Biggr). \] The difficulty with the recursive formula for the Fibonacci numbers is that we have to calculate all the numbers preceding $f_n$ in order to calculate $f_n.$ The difficulty with the closed form expression for the Fibonacci numbers is that calculating accurate powers \[ \biggl(\frac{1+\sqrt{5}}{2}\biggr)^n \quad \text{and} \quad \biggl(\frac{1-\sqrt{5}}{2}\biggr)^n \] for large values of $n \in\mathbb{N}$, like $n=100$, is difficult.
It is important to mention that the irrational number \[ \varphi = \frac{1+\sqrt{5}}{2} \] is the famous number called the Golden Ratio.
We have that \[ \frac{1-\sqrt{5}}{2} = \frac{\bigl(1-\sqrt{5}\bigr)\bigl(1+\sqrt{5}\bigr)}{2\bigl(1+\sqrt{5}\bigr)} = \frac{1-5}{2\bigl(1+\sqrt{5}\bigr)} = - \frac{2}{1+\sqrt{5}} = - \frac{1}{\varphi} = - \varphi^{-1}. \] Therefore, the closed form expression for the Fibonacci numbers can be written as \[ \text{for all} \quad n \in \mathbb{N} \qquad f_{n} = \frac{1}{\sqrt{5}}\Bigl( \varphi^n - (-1)^n \varphi^{-n} \Bigr) \quad \text{where} \quad \varphi = \frac{1+\sqrt{5}}{2}. \]
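To see the numerical difficulty concretely, here is a short comparison (my addition): the recursion uses exact integer arithmetic, while the closed form uses floating point powers of $\varphi$ and loses the trailing digits for large $n$.

```python
from math import sqrt

def fib_recursive(n):
    # Exact integer arithmetic: f_0 = 0, f_1 = 1, f_{k+1} = f_k + f_{k-1}.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_closed(n):
    # Floating point version of f_n = (phi^n - (-1)^n phi^{-n}) / sqrt(5).
    phi = (1 + sqrt(5)) / 2
    return (phi**n - (-1)**n * phi**(-n)) / sqrt(5)

print(fib_recursive(100))  # 354224848179261915075 (exact)
print(fib_closed(100))     # approximately 3.5422e+20; floats carry only ~16 digits
```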
First the eigenvalue $1.$ We need to find the nullspace of \[ \left[ \begin{array}{cc} \frac{4}{5} - 1 & \frac{3}{10} \\[7pt] \frac{1}{5} & \frac{7}{10} -1 \end{array} \right] = \left[ \begin{array}{cc} -\frac{1}{5} & \frac{3}{10} \\[7pt] \frac{1}{5} & -\frac{3}{10} \end{array} \right] = \frac{1}{10} \left[ \begin{array}{cc} -2 & 3 \\[5pt] 2 & - 3 \end{array} \right]. \] Clearly, the nullspace of the last matrix is one-dimensional and a basis vector for the nullspace is $\left[\!\begin{array}{c} 3 \\ 2\end{array} \right].$ Thus $\left[\!\begin{array}{c} 3 \\ 2\end{array}\!\right]$ is an eigenvector corresponding to the eigenvalue $1.$
Now, find an eigenvector corresponding to the eigenvalue $1/2.$ We need to find the nullspace of \[ \left[ \begin{array}{cc} \frac{4}{5} - \frac{1}{2} & \frac{3}{10} \\[7pt] \frac{1}{5} & \frac{7}{10} - \frac{1}{2} \end{array} \right] = \left[ \begin{array}{cc} \frac{3}{10} & \frac{3}{10} \\[7pt] \frac{1}{5} & \frac{2}{10} \end{array} \right] = \frac{1}{10} \left[ \begin{array}{cc} 3 & 3 \\[5pt] 2 & 2 \end{array} \right]. \] Clearly, the nullspace of the last matrix is one-dimensional and a basis vector for the nullspace is $\left[\!\begin{array}{c} 1 \\ -1 \end{array} \right].$ Thus $\left[\!\begin{array}{c} 1 \\ -1\end{array}\!\right]$ is an eigenvector corresponding to the eigenvalue $1/2.$
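It is always worth verifying eigenvector calculations. A quick numerical check (my addition):

```python
import numpy as np

A = np.array([[4/5, 3/10],
              [1/5, 7/10]])
v1 = np.array([3, 2])    # claimed eigenvector for the eigenvalue 1
v2 = np.array([1, -1])   # claimed eigenvector for the eigenvalue 1/2
print(np.allclose(A @ v1, 1.0 * v1))  # True
print(np.allclose(A @ v2, 0.5 * v2))  # True
```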
To calculate the inverse $A^{-1}$, we row reduce the $3\times 6$ matrix $[A | I_3]$.
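Since the worked example is not reproduced here, the following sympy sketch shows the procedure on a sample invertible matrix of my choosing (an assumption, not the matrix from class): if the left block of the RREF of $[A | I_3]$ is $I_3$, then the right block is $A^{-1}$.

```python
import sympy as sp

A = sp.Matrix([[2, 0, 1],
               [1, 1, 0],
               [0, 1, 1]])           # sample matrix, det(A) = 3
augmented = A.row_join(sp.eye(3))    # the 3x6 matrix [A | I3]
rref, _ = augmented.rref()
A_inv = rref[:, 3:]                  # right block of the RREF
print(A * A_inv == sp.eye(3))        # True
```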
This fact follows from the cofactor expansion calculation of a determinant, Theorem 1 in Section 3.1.
For example, with the third row being the third row of the identity matrix $I_4:$ \[ \left| \begin{array}{cccc} a & b & c & d \\ e & f & g & h \\ 0 & 0 & 1 & 0 \\ i & j & k & l \end{array} \right| = \left| \begin{array}{ccc} a & b & d \\ e & f & h \\ i & j & l \end{array} \right|. \] This procedure can be repeated as many times as there are rows or columns of the identity matrix in the square matrix.

In Section 2.8 we introduced the concept of the column space of a matrix. Today we discussed the concept of the row space of a matrix. You can read about the row space of a matrix in Section 4.6, in the subsection entitled The Row Space. You can learn how to find a basis of the row space in An Ode to Reduced Row Echelon Form, as sketched below.
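Here is a minimal sketch of that recipe (my addition; the sample matrix is my choice): the nonzero rows of the RREF of $A$ form a basis of the row space of $A$.

```python
import sympy as sp

A = sp.Matrix([[1, 2, 3],
               [2, 4, 6],
               [1, 1, 1]])   # sample matrix; the second row is twice the first
rref, _ = A.rref()
# Keep the nonzero rows of the RREF: they form a basis of the row space.
basis = [rref.row(i) for i in range(rref.rows) if not rref.row(i).is_zero_matrix]
print(basis)   # [Matrix([[1, 0, -1]]), Matrix([[0, 1, 2]])]
```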
The rank theorem is covered both in the subsection Dimension of a Subspace in Section 2.9 and in the subsection The Rank Theorem in Section 4.6. Read both.
Suggested problems for Section 4.6: 3-9, 11, 13, 15, 17.
If for all $\mathbf{b} \in \mathbb{R}^n$ the matrix equation $A\mathbf{x} = \mathbf{b}$ is consistent, then $A$ has $n$ pivot columns.
I will prove the contrapositive: If $A$ has $m$ pivot columns and $m \lt n$, then there exists $\mathbf{b} \in \mathbb{R}^n$ such that $A\mathbf{x} = \mathbf{b}$ is not consistent.
Let $n\in\mathbb{N}$ and let $A$ be an $n\!\times\!n$ matrix. If the RREF of $A$ is $I_n$, then $A$ is invertible.
This implication is proved in Theorem 7 in the textbook, but I prefer to give another proof which uses only the facts that elementary matrices are invertible and that matrix multiplication is associative. I also like that the proof below establishes invertibility directly from the definition of invertibility. The proof below was suggested to me by a student during Fall Quarter 2021. Unfortunately, I forgot who it was. In any case, please think about each proof on your own. You can come up with new proofs.
In the proof below we need to prove that a matrix is invertible. For that it is useful to recall the definition of invertibility.
If the RREF of $A$ is $I_3$, then $A$ is invertible.
This implication is proved in Theorem 7 in Section 2.2. This proof is important!

Step | the row operation | the elementary matrix | the inverse of the elementary matrix |
---|---|---|---|
1st | The second row is replaced by the sum of the second row and the third row multiplied by $(-1)$ | $E_1 = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{array}\right]$ | $E_1^{-1} = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{array}\right]$ |
2nd | The second row is scaled (multiplied) by $(-1)$ | $E_2 = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{array}\right]$ | $E_2^{-1} = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{array}\right]$ |
3rd | The third row is replaced by the sum of the third row and the first row multiplied by $(-2)$ | $E_3 = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{array}\right]$ | $E_3^{-1} = \left[\!\begin{array}{rrr} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{array}\right]$ |
4th | The first row is replaced by the sum of the first row and the second row multiplied by $(-1)$ | $E_4 = \left[\!\begin{array}{rrr} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right]$ | $E_4^{-1} = \left[\!\begin{array}{rrr} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right]$ |
5th | The first row and the second row are interchanged | $E_5 = \left[\!\begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array}\right]$ | $E_5^{-1} = \left[\!\begin{array}{rrr} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{array}\right]$ |
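As a check (my addition), sympy can confirm that each matrix in the last column is indeed the inverse of the corresponding elementary matrix, and it can reconstruct the matrix $A$ that these five row operations reduce to $I_3$: from $E_5E_4E_3E_2E_1A = I_3$ we get $A = E_1^{-1}E_2^{-1}E_3^{-1}E_4^{-1}E_5^{-1}$.

```python
import sympy as sp

E1 = sp.Matrix([[1, 0, 0], [0, 1, -1], [0, 0, 1]])
E2 = sp.Matrix([[1, 0, 0], [0, -1, 0], [0, 0, 1]])
E3 = sp.Matrix([[1, 0, 0], [0, 1, 0], [-2, 0, 1]])
E4 = sp.Matrix([[1, -1, 0], [0, 1, 0], [0, 0, 1]])
E5 = sp.Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]])

for E in (E1, E2, E3, E4, E5):
    assert E * E.inv() == sp.eye(3)   # each pair in the table checks out

A = E1.inv() * E2.inv() * E3.inv() * E4.inv() * E5.inv()
print(A)                              # the matrix the five steps reduce to I3
print(E5 * E4 * E3 * E2 * E1 * A == sp.eye(3))  # True
```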
If the Reduced Row Echelon Form of $A$ is $I_n$, then $A$ is invertible.
This implication is proved in Theorem 7 in Section 2.2.

After reading this post you should be able to solve a problem stated as follows:
Consider the matrix $M = \left[\begin{array}{rrr} 3 & 3 & 2 \\ 3 & 2 & 1 \\ 2 & 1 & 0 \end{array}\right]$.
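The full problem statement is not reproduced here, but assuming the task is the one discussed above, deciding invertibility of $M$ from its RREF, here is a sympy sketch (my addition):

```python
import sympy as sp

M = sp.Matrix([[3, 3, 2],
               [3, 2, 1],
               [2, 1, 0]])
rref, pivots = M.rref()
print(rref)     # the identity matrix I3
print(pivots)   # (0, 1, 2): a pivot in every column
print(M.inv())  # exists, since the RREF of M is I3
```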
It is important to note that the concepts of injection and surjection are defined for functions in general, not only for linear transformations as in our textbook.
I wrote my own webpage about functions from the point of view of sets. I did try to include all standard examples of inverse functions from precalculus here.
Problems 41, 42, 43, and 44 in Section 1.7: Linear independence are very interesting and important. The matrices in these problems are not easy to row reduce by hand, so the textbook recommends that we use a calculator. Below I calculated the RREFs for the matrices given in Problems 41 and 42. Based on these RREFs you should be able to answer Problems 41, 42, 43, and 44.
Problem 41 \[ \left[ \begin{array}{rrrrr} 8 & -3 & 0 & -7 & 2 \\ -9 & 4 & 5 & 11 & -7 \\ 6 & -2 & 2 & -4 & 4 \\ 5 & -1 & 7 & 0 & 10 \\ \end{array} \right] \sim \quad \cdots \quad \sim \left[ \begin{array}{ccccc} 1 & 0 & 3 & 1 & 0 \\ 0 & 1 & 8 & 5 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ \end{array} \right] \]
Problem 42 \[ \left[ \begin{array}{rrrrrr} 12 & 10 & -6 & -3 & 7 & 10 \\ -7 & -6 & 4 & 7 & -9 & 5 \\ 9 & 9 & -9 & -5 & 5 & -1 \\ -4 & -3 & 1 & 6 & -8 & 9 \\ 8 & 7 & -5 & -9 & 11 & -8 \\ \end{array} \right] \sim \quad \cdots \quad \sim \left[ \begin{array}{rrrrrr} 1 & 0 & 2 & 0 & 2 & 0 \\ 0 & 1 & -3 & 0 & -2 & 0 \\ 0 & 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ \end{array} \right] \]
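Instead of a calculator, one can reproduce these RREFs with a computer algebra system. A sketch for Problem 41 (my addition; Problem 42 is analogous):

```python
import sympy as sp

A41 = sp.Matrix([[ 8, -3, 0, -7,  2],
                 [-9,  4, 5, 11, -7],
                 [ 6, -2, 2, -4,  4],
                 [ 5, -1, 7,  0, 10]])
rref, pivots = A41.rref()
print(rref)    # matches the RREF displayed above
print(pivots)  # (0, 1, 4): the pivot columns
```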
System 1 | System 2 | System 3 |
---|---|---|
\begin{alignat*}{8} &x_1 & & - 4 &x_2 & &=&& 2\\ -3 &x_1 & & + &x_2 & & =&& 1 \\ &x_1 & & + 2 &x_2 & & =&& -4 \end{alignat*} | \begin{alignat*}{8} &x_1 & & - 4 &x_2 & &=&& 6\\ -3 &x_1 & & + &x_2 & & =&&-7 \\ &x_1 & & + 2 &x_2 & & =&& 0 \end{alignat*} | \begin{alignat*}{8} &x_1 & & - 4 &x_2 & &=&& -7\\ -3 &x_1 & & + &x_2 & & =&& -1 \\ &x_1 & & + 2 &x_2 & & =&& 5 \end{alignat*} |
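All three systems share the same coefficient matrix, so one row reduction of the multi-augmented matrix $[A | \mathbf{b}_1\ \mathbf{b}_2\ \mathbf{b}_3]$ answers all three at once. A sympy sketch (my addition):

```python
import sympy as sp

A  = sp.Matrix([[ 1, -4],
                [-3,  1],
                [ 1,  2]])
b1 = sp.Matrix([2, 1, -4])
b2 = sp.Matrix([6, -7, 0])
b3 = sp.Matrix([-7, -1, 5])
print(A.row_join(b1).row_join(b2).row_join(b3).rref()[0])
# A pivot appears in the b1 column, so System 1 is inconsistent;
# Systems 2 and 3 are consistent, with solutions (2, -1) and (1, 2).
```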
Given System | Intermediate System | "Row Reduced" System |
---|---|---|
\begin{alignat*}{8} &x_1 & & + &&x_2 & & - &&2 &&x_3 &&= -&&5\\ 2 &x_1 & & - &&x_2 & & + && &&x_3 &&= &&8 \\ 3 &x_1 & & && & & - && && x_3 &&= &&3 \end{alignat*} | \begin{alignat*}{8} &x_1 & & + &&x_2 & & - &&\phantom{5/} 2&&x_3 &&= -&&5\\ & & & &&x_2 & & - &&5/3 &&x_3 &&= -&&6 \\ & & & && & & && && && && \end{alignat*} | \begin{alignat*}{8} &x_1 & & && & & - &&1/3 &&x_3 &&= &&1\\ & & & \phantom{+} &&x_2 & & - &&5/3 &&x_3 &&= -&&6 \\ & & & && & & && && && && \end{alignat*} |
\begin{alignat*}{8} &x_1 & & + &&x_2 & & - &&2 &&x_3 &&= &&5\\ 2 &x_1 & & - &&x_2 & & + && &&x_3 &&= &&8 \\ 3 &x_1 & & && & & - && && x_3 &&= &&3 \end{alignat*} | | \begin{alignat*}{8} &x_1 & & && & & - &&1/3 &&x_3 &&= &&0\\ & & & \phantom{+} &&x_2 & & - &&5/3 &&x_3 &&= &&0 \\ 0 &x_1 & & + 0 && x_2 & & + &&0 && x_3 &&= &&1 \end{alignat*} |
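Both eliminations in the table are easy to confirm by row reducing the two augmented matrices (a sympy check, my addition); note how changing $-5$ to $5$ on the right-hand side produces the impossible equation $0 = 1$:

```python
import sympy as sp

A = sp.Matrix([[1,  1, -2],
               [2, -1,  1],
               [3,  0, -1]])
print(A.row_join(sp.Matrix([-5, 8, 3])).rref()[0])  # consistent, x3 free
print(A.row_join(sp.Matrix([ 5, 8, 3])).rref()[0])  # last row reads 0 = 1
```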