
Linear Algebra: Theory & Applications
Document information
Author | Kenneth Kuttler |
School | Brigham Young University |
Major | Linear Algebra |
Document type | Textbook |
Language | English |
Format | |
Size | 8.01 MB |
Summary
I. Functions and Fields in Linear Algebra
This section introduces the fundamental concept of a function in mathematics, defining it as a rule assigning a unique output to every input. It then establishes the importance of fields (like real and complex numbers) as the sets from which scalars are drawn, crucial for linear algebra operations. The properties of injective and surjective functions are also defined.
1.1. Definition and Properties of Functions
The section begins by defining a function as a rule that assigns to each element of a set D (the domain) a unique output in a set R (the codomain). The notation f: D → R is introduced to represent this relationship, where f denotes the function itself. A crucial point is that the image of f (the set of all outputs) is a subset of the codomain R. The text distinguishes between the terms 'range' (sometimes used synonymously with image) and 'codomain', emphasizing that the codomain is the set where the outputs could lie, while the image is the set of actual outputs. Two key properties of functions are defined: a function is called 'onto' or 'surjective' if its image equals its codomain, meaning every element of R is the output for some input; a function is called 'one-to-one' or 'injective' if distinct inputs always produce distinct outputs (f(x) ≠ f(y) whenever x ≠ y). These definitions are fundamental to understanding the behavior of functions and form the foundation for numerous advanced concepts in linear algebra and other areas of mathematics.
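As a quick illustration of these two properties, the following is a minimal sketch (an assumed example, not taken from the text) that tests a function given as a Python dictionary from a finite domain to a finite codomain.

```python
# Assumed illustrative example: injectivity and surjectivity for a function
# represented as a dict from a finite domain D to a finite codomain R.

def is_injective(f: dict) -> bool:
    # Distinct inputs must give distinct outputs.
    return len(set(f.values())) == len(f)

def is_surjective(f: dict, codomain: set) -> bool:
    # The image (set of actual outputs) must equal the codomain.
    return set(f.values()) == codomain

f = {1: "a", 2: "b", 3: "a"}              # inputs 1 and 3 collide, so not injective
print(is_injective(f))                     # False
print(is_surjective(f, {"a", "b"}))        # True: the image {"a", "b"} is the whole codomain
print(is_surjective(f, {"a", "b", "c"}))   # False: "c" is never an output
```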
1.2. Fields and Scalars in Linear Algebra
The section then transitions to the concept of a field in the context of linear algebra. It explains that the real and complex numbers form fields, and the elements of these fields are called scalars. This foundational concept is crucial because linear algebra operations such as addition, subtraction, multiplication, and division are carried out with these scalars. The text emphasizes that while real and complex numbers are the most commonly used scalars, the principles of linear algebra apply to any field. For greater generality, the symbol F is used to represent the field of scalars; where there is ambiguity, the field of complex numbers is assumed. The significance of algebraically complete fields, specifically the field of complex numbers, is noted: every nonconstant polynomial equation with complex coefficients has at least one complex root, a property essential to many theorems and applications in linear algebra. This general approach ensures that the concepts and theorems of linear algebra are not limited to specific number systems but apply to a wide range of mathematical structures, extending their utility to various branches of mathematics and science.
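The algebraic completeness of the complex numbers can be illustrated numerically; the short sketch below is an assumed example using NumPy's polynomial root finder, showing that x² + 1, which has no real root, has the complex roots ±i.

```python
# Assumed illustrative example: every nonconstant polynomial with complex
# coefficients has a complex root, even when it has no real root.
import numpy as np

coeffs = [1, 0, 1]        # coefficients of x^2 + 1, highest degree first
print(np.roots(coeffs))   # approximately [1j, -1j]
```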
II. Systems of Equations and Row Operations
The core of this section focuses on solving linear systems of equations. It explains how to represent these systems using augmented matrices and manipulate them using row operations to find solutions. It highlights that row operations—including switching rows, multiplying rows by scalars, and adding multiples of one row to another—do not alter the solution set of the system. The implications of these operations for solving systems with different numbers of equations and variables are also discussed.
2.1 Augmented Matrices and Row Operations
This section introduces the use of augmented matrices to represent systems of linear equations. An augmented matrix is a matrix formed by combining the coefficient matrix of a system of linear equations with its constant vector. The core idea is that manipulating the augmented matrix using row operations does not change the solution set of the corresponding system of equations. The text explains three fundamental row operations: switching two rows, multiplying a row by a non-zero scalar, and adding a multiple of one row to another row. Each of these operations, when performed on the augmented matrix, corresponds to a valid transformation of the original system of equations without altering its solutions. This method provides a systematic and efficient way to solve systems of linear equations, reducing the complexity of calculations and providing a clear path to a solution. The text emphasizes that this approach is applicable to systems with any number of equations and variables; the procedure remains consistent regardless of the size or structure of the system.
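The three row operations are easy to express concretely; the following is a minimal sketch (an assumed example, with a small system chosen purely for illustration) that applies them to an augmented matrix stored as a NumPy array.

```python
# Assumed illustrative example: the three row operations applied to an augmented
# matrix [A | b]. Each operation leaves the solution set of the system unchanged.
import numpy as np

def swap_rows(M, i, j):
    M[[i, j]] = M[[j, i]]

def scale_row(M, i, c):
    assert c != 0, "the scalar must be non-zero"
    M[i] = c * M[i]

def add_multiple(M, i, j, c):
    # Add c times row j to row i.
    M[i] = M[i] + c * M[j]

# Augmented matrix for  x + 2y = 5,  3x + 4y = 6.
M = np.array([[1.0, 2.0, 5.0],
              [3.0, 4.0, 6.0]])
add_multiple(M, 1, 0, -3.0)   # eliminate x from the second equation
scale_row(M, 1, -0.5)         # make the leading coefficient of y equal to 1
print(M)                      # now simple enough to read off y and back-substitute
```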
2.2. Reversible Row Operations and Solution Uniqueness
Building upon the introduction of row operations, this section demonstrates that every row operation has an inverse. This means that any transformation done to the augmented matrix using a row operation can be undone by applying a corresponding inverse operation. For example, switching rows is its own inverse, while multiplying a row by a scalar is reversed by multiplying the same row by the inverse of the scalar. Adding a multiple of one row to another is reversed by subtracting that same multiple. This reversibility is key because it ensures that the process of using row operations to simplify the augmented matrix is not a destructive one. The reversibility guarantees that the original and simplified systems are equivalent, having precisely the same set of solutions. This concept of reversible operations is significant because it implies that the solution obtained is unique (if a unique solution exists); the sequence of row operations used may vary, but the resulting solution will always be the same. Thus, any simplification using row operations preserves the integrity of the initial system, providing reliable solutions.
2.3. Possible Solution Types for Linear Systems
The section explains that linear systems can have three possible solution types: a unique solution, no solution, or infinitely many solutions. This trichotomy is a crucial aspect of the theory of linear equations. By systematically applying row operations to simplify the augmented matrix, one can determine which of these solution types applies to a given linear system. The process of using row operations to reach a simpler form from which the solution(s) can be read directly is detailed. The key observation is that the row operations, being reversible, do not alter the solution set. Regardless of the relative number of equations and variables in a linear system (more equations than variables, fewer equations than variables, or an equal number), the process of setting up the augmented matrix and applying row operations remains the same. The consistency of this approach highlights the elegance and power of the method of representing and solving linear systems with augmented matrices.
III. Vectors and the Inner Product in Fⁿ
This section introduces vectors in Fⁿ (where F is typically the field of real or complex numbers), defining them as ordered lists of n elements. The concept of an inner product is introduced along with its axioms. The significance of the Cauchy-Schwarz inequality related to the inner product is presented, connecting it to the concept of scalar multiples of vectors.
3.1 Vectors in Fⁿ
This section defines vectors in Fⁿ as ordered lists of n elements, where each element (coordinate) belongs to a field F, typically the real (R) or complex (C) numbers. The notation Fⁿ represents the set of all such vectors. Equality of vectors is defined: two vectors are equal if and only if their corresponding coordinates are equal. For convenience, a single boldface letter (e.g., x) is used to represent a vector (x₁, x₂, ..., xₙ). This consistent notation simplifies the representation and manipulation of vectors in subsequent mathematical discussions and calculations. The introduction of Fⁿ expands the concept of vectors beyond the typical real-valued vectors, acknowledging the broader applicability of these principles across various fields within mathematics and related sciences. This generalization lays the foundation for a more extensive exploration of vector properties, operations, and their uses within the framework of linear algebra.
3.2 The Inner Product in Fⁿ
The section introduces the inner product on Fⁿ, a crucial concept in linear algebra that extends the dot product to vector spaces over more general fields. The inner product is a function that takes two vectors as input and returns a scalar. It must satisfy several axioms: linearity in the first argument, conjugate symmetry (ordinary symmetry in the real case), and positive definiteness. These axioms define the fundamental properties of the inner product. One of the most important consequences of these axioms is the Cauchy-Schwarz inequality, which states that the absolute value of the inner product of two vectors is at most the product of their norms. The equality condition is also explored: equality holds if and only if one vector is a scalar multiple of the other. This condition provides valuable insight into the geometric relationship between vectors in the vector space. The careful exploration of the inner product provides essential tools and relationships needed for many important parts of linear algebra.
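A numerical check of the Cauchy-Schwarz inequality is straightforward; the sketch below assumes the standard inner product on Cⁿ with conjugation on the second argument (conventions differ) and uses purely illustrative vectors.

```python
# Assumed illustrative example: the standard inner product <x, y> = sum_i x_i conj(y_i)
# and the Cauchy-Schwarz inequality |<x, y>| <= ||x|| ||y||, with equality when one
# vector is a scalar multiple of the other.
import numpy as np

def inner(x, y):
    return np.sum(x * np.conj(y))

x = np.array([1 + 1j, 2.0, -1j])
y = np.array([0.5, 1 - 2j, 3.0])

lhs = abs(inner(x, y))
rhs = np.linalg.norm(x) * np.linalg.norm(y)
print(lhs <= rhs + 1e-12)   # True: the inequality holds

z = (2 - 1j) * x            # a scalar multiple of x
print(np.isclose(abs(inner(x, z)),
                 np.linalg.norm(x) * np.linalg.norm(z)))   # True: equality case
```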
IV. Matrices and Linear Transformations
This section dives into matrices, emphasizing their use in representing and solving linear systems. The crucial operation of matrix multiplication is defined along with the important note that the order of multiplication matters significantly. The concepts of row and column vectors, along with their interpretations are explained. The relation between matrices and linear transformations is established.
4.1 Introduction to Matrices
This section introduces matrices as rectangular arrays of numbers (scalars), emphasizing their importance beyond solving linear systems of equations. Matrices are described as rectangular arrays of numbers, also called scalars, which are elements of a field (often real or complex numbers). The text notes that the convention for identifying elements within a matrix is to list the rows first, followed by the columns, drawing an analogy to columns in a Greek temple. The notation (aij) is introduced to refer to a matrix where 'i' represents the row and 'j' represents the column. Specific examples are given to illustrate how to locate elements within a matrix using this row-column indexing system. This section serves as a foundational introduction to the structure and representation of matrices, setting the stage for later discussions of matrix operations, linear transformations, and other key concepts in linear algebra.
4.2 Matrix Multiplication
The section focuses on matrix multiplication, a crucial operation in linear algebra. It emphasizes the importance of checking the dimensions of matrices before attempting multiplication: the number of columns in the first matrix must equal the number of rows in the second matrix. This dimensionality requirement stems from the underlying definition of matrix multiplication. A clear explanation is given for why incompatible dimensions result in an impossible multiplication, emphasizing the necessity of verifying compatibility before attempting any calculations. This concept underscores a fundamental aspect of matrix operations, highlighting the importance of careful consideration of matrix dimensions in calculations. The text further illustrates the non-commutative nature of matrix multiplication—the order in which matrices are multiplied affects the result—through an example where switching the order of multiplication yields a completely different outcome. This non-commutative property is a key distinction between matrix multiplication and scalar multiplication.
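The two points made here, dimension compatibility and non-commutativity, can be seen directly in a short computation; the example below is an assumed illustration, not one taken from the text.

```python
# Assumed illustrative example: dimension compatibility and non-commutativity of
# matrix multiplication.
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(A @ B)          # [[2, 1], [4, 3]]
print(B @ A)          # [[3, 4], [1, 2]]  -- a different matrix, so AB != BA

C = np.ones((2, 3))   # 2 x 3
D = np.ones((2, 2))   # 2 x 2
try:
    C @ D             # columns of C (3) != rows of D (2), so the product is undefined
except ValueError as e:
    print("incompatible dimensions:", e)
```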
4.3 Matrices and the Coriolis Effect: An Illustrative Example
This section turns to a practical example illustrating the use of matrices and their connection to physical phenomena. It describes an amusement park ride in which riders experience centrifugal and Coriolis forces. This example helps visualize how different coordinate systems and their respective accelerations are represented and calculated using matrices and linear algebra. The text then shifts to a more complex scenario: the Coriolis effect on the rotating Earth. The Coriolis acceleration on the rotating Earth is explained and illustrated using the pendulum experiment performed by Foucault in 1851, which shows that the Earth's rotation affects a freely moving object in predictable ways that can be calculated using matrices. While the focus is on this specific application, it subtly highlights the broader relevance of matrices and linear algebra across scientific disciplines, connecting theoretical concepts to practical, real-world situations.
V. Determinants and Eigenvalues
This section explains the concept of the determinant of a square matrix, highlighting its computational complexity for larger matrices. The concept of eigenvalues and eigenvectors of a matrix is introduced. A method for finding these using the equation det(λI - A) = 0, where A is the matrix and λ represents eigenvalues, is described. It emphasizes the difficulty of finding eigenvalues, particularly for large matrices.
5.1 The Determinant: Computational Challenges
This section introduces the determinant of a square matrix, highlighting the computational difficulties involved in calculating it, especially for large matrices. The text points out that for an n x n matrix, the number of terms required to compute the determinant using a standard method grows factorially (n!). This factorial growth makes the calculation computationally expensive, even for moderately sized matrices. For instance, a 10 x 10 matrix requires 10! = 3,628,800 terms. The text also raises questions about the well-defined nature of the determinant; it's not immediately obvious why expanding along any row or column should yield the same result. This point emphasizes the non-intuitive nature of the determinant calculation. While the method for calculation is presented, the section's focus is on the computational challenges and the underlying theoretical implications. This setup motivates the need for more sophisticated methods and a deeper understanding of the properties of determinants.
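The factorial growth becomes obvious if one writes the cofactor expansion as a recursion; the following is an assumed naive implementation (expansion along the first row) meant only to show the cost, with np.linalg.det, which uses an LU-based method, as a cross-check.

```python
# Assumed illustrative example: a naive determinant by cofactor expansion along the
# first row. The recursion generates on the order of n! terms, which is why this
# approach is hopeless for large matrices.
import numpy as np

def det_cofactor(A):
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0.0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det_cofactor(minor)
    return total

A = [[2.0, 1.0, 3.0],
     [0.0, -1.0, 4.0],
     [5.0, 2.0, 1.0]]
print(det_cofactor(A))                # 17.0
print(np.linalg.det(np.array(A)))     # the same value, computed far more efficiently
```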
5.2 Eigenvalues and Eigenvectors: Definition and Properties
The section introduces the concepts of eigenvalues and eigenvectors. An eigenvector of a square matrix A is defined as a non-zero vector x such that Ax = λx for some scalar λ (eigenvalue). The text highlights that not all scalars are eigenvalues of a given matrix. The key insight provided is that if Ax = λx holds, then (λI - A)x = 0, where I is the identity matrix. This implies that (λI - A) is a singular matrix (not one-to-one and not onto), leading to the crucial condition det(λI - A) = 0. This equation, the characteristic equation, is a polynomial equation of degree n (for an n x n matrix) and its roots are the eigenvalues of the matrix. This equation is a cornerstone of the eigenvalue problem, linking the concept of eigenvalues to the determinant. The text underscores the significance of this connection, emphasizing the difficulty in solving this polynomial equation to find eigenvalues, particularly for large matrices. This challenge is presented as the driving force behind further exploration of numerical and theoretical methods for dealing with eigenvalues.
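For a small matrix the characteristic-equation route can be carried out directly; the sketch below is an assumed example comparing the roots of det(λI − A) = 0 with the eigenvalues returned by NumPy's solver.

```python
# Assumed illustrative example: eigenvalues as roots of the characteristic polynomial
# det(lambda*I - A) = 0, compared with NumPy's eigenvalue routine.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

char_poly = np.poly(A)        # coefficients of the characteristic polynomial of A
print(np.roots(char_poly))    # [3., 1.]
print(np.linalg.eigvals(A))   # the same eigenvalues

# For each eigenvalue, an eigenvector is a non-zero vector in the null space of
# (lambda*I - A).
vals, vecs = np.linalg.eig(A)
print(vecs[:, np.isclose(vals, 3.0)])   # eigenvector for lambda = 3, up to scaling
```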
VI. Row Reduced Echelon Form and Matrix Factorization
This part details the row reduced echelon form of a matrix, an important form used in solving linear systems. The process of achieving this form through row operations is described. The concept of row equivalence and its implications are presented. The section also discusses LU factorization, a method for decomposing a matrix into lower and upper triangular matrices (L and U respectively). The method is described for matrices that can be row reduced without row swapping.
6.1 Row Reduced Echelon Form: Algorithm and Uniqueness
This section describes the algorithm for obtaining the row reduced echelon form (RREF) of a matrix using row operations. The algorithm involves systematically selecting a pivot column (the first non-zero column from the left), creating a leading 1 in that column using row operations, and then using that leading 1 to zero out all other entries in that column. This process is repeated on the submatrix to the right and below the leading 1. The algorithm continues until all columns have been processed. The resulting matrix is the row reduced echelon form. The text highlights that although there are choices involved in performing row operations at each step, the final row reduced echelon form is unique for a given matrix. This uniqueness is crucial because it guarantees consistency in results regardless of the particular steps taken in the reduction process. The method of achieving RREF ensures that the final form is standardized, making it a fundamental tool for various matrix manipulations and analyses. Row equivalence of matrices is also defined, stating that two matrices are row equivalent if one can be obtained from the other through a sequence of row operations.
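A compact version of the algorithm just described might look like the following; this is an assumed implementation sketch (the function name rref and the tolerance are choices made here), not code from the text.

```python
# Assumed sketch of row reduction to reduced echelon form: pick the leftmost column
# with a non-zero entry at or below the current pivot row, create a leading 1, clear
# the rest of that column, and move on.
import numpy as np

def rref(A, tol=1e-12):
    R = A.astype(float).copy()
    rows, cols = R.shape
    pivot_row = 0
    for col in range(cols):
        if pivot_row >= rows:
            break
        candidates = np.where(np.abs(R[pivot_row:, col]) > tol)[0]
        if candidates.size == 0:
            continue                                  # no pivot in this column
        r = pivot_row + candidates[0]
        R[[pivot_row, r]] = R[[r, pivot_row]]         # switch rows
        R[pivot_row] /= R[pivot_row, col]             # create a leading 1
        for i in range(rows):                         # zero out the rest of the column
            if i != pivot_row:
                R[i] -= R[i, col] * R[pivot_row]
        pivot_row += 1
    return R

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 7.0],
              [1.0, 2.0, 4.0]])
print(rref(A))   # the unique row reduced echelon form of A
```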
6.2 LU Factorization: The Multiplier Method
The section explores LU factorization, a method for decomposing a matrix into a lower triangular matrix (L) and an upper triangular matrix (U). The text focuses on the 'multiplier method,' which involves using row operations to reduce the matrix A to an upper triangular matrix U. This reduction process implicitly generates the lower triangular matrix L. It emphasizes that the multiplier method functions effectively when the row reduced echelon form can be obtained using only row operations that involve adding multiples of one row to another, without row switching. The section explains how the sequence of elementary matrices that reduce A to U can be structured in a way that facilitates the construction of L, particularly highlighting the importance of the sequence of row operations for obtaining the factors. It further delves into a more rigorous understanding of why this method works, relating it to the properties of elementary matrices and showing the relationship between the row operations, the resulting upper triangular matrix and the implied lower triangular matrix. This understanding is key for efficiently and correctly constructing LU factorizations.
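The multiplier method can be sketched in a few lines: each multiplier used to zero an entry below a pivot is recorded in L, and the reduced matrix becomes U. The implementation below is an assumed sketch that, as in the text, presumes no row switching is required.

```python
# Assumed sketch of the multiplier method: reduce A to upper triangular U by adding
# multiples of rows to the rows below, storing each multiplier in L so that A = LU.
import numpy as np

def lu_no_pivot(A):
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    for k in range(n - 1):
        if U[k, k] == 0:
            raise ValueError("zero pivot: this matrix needs row switching")
        for i in range(k + 1, n):
            m = U[i, k] / U[k, k]       # the multiplier
            L[i, k] = m                 # recorded in the lower triangular factor
            U[i, k:] -= m * U[k, k:]    # row operation: row_i <- row_i - m * row_k
    return L, U

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
L, U = lu_no_pivot(A)
print(np.allclose(L @ U, A))   # True: A = LU
```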
VII. Linear Programming and the Simplex Method
This section introduces linear programming, focusing on maximizing or minimizing a linear objective function subject to linear constraints. It presents a geometric interpretation for problems with few variables and then describes the simplex method, an algorithm for solving linear programming problems with many variables. The concepts of basic feasible solutions, simple and non-simple columns, and pivot elements are explained, emphasizing the method's importance for higher-dimensional problems where drawing pictures is not practical.
7.1 Simple Geometric Considerations in Linear Programming
This section introduces linear programming problems, which involve maximizing or minimizing a linear function subject to linear inequality constraints. A simple example of a hamburger store with two types of burgers (big stack and basic burger) and limited patties is used to illustrate the concept. The problem is represented graphically in two dimensions, visualizing the feasible region (the set of points satisfying all constraints). The objective function is represented by a line, and the optimal solution is found by identifying the point within the feasible region that maximizes the objective function (in this case, profit). This graphical method works well with two variables but is not practical for higher-dimensional problems, which motivates the need for the simplex method. The two-dimensional illustration clarifies the geometric intuition behind linear programming, while simultaneously highlighting its limitations in more complex scenarios, setting the stage for the introduction of a more general and efficient solution method.
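For a concrete two-variable problem of this kind, an LP solver reaches the same vertex of the feasible region that the graphical method would identify. The sketch below uses made-up numbers (the text's hamburger data are not reproduced here) and SciPy's linprog; since linprog minimizes, the objective is negated to maximize.

```python
# Assumed illustrative example: maximize 2x + y subject to x + 2y <= 8, 3x + y <= 9,
# x, y >= 0. The optimum sits at a vertex of the feasible region, here (2, 3).
from scipy.optimize import linprog

c = [-2.0, -1.0]              # maximize 2x + y  <=>  minimize -(2x + y)
A_ub = [[1.0, 2.0],
        [3.0, 1.0]]
b_ub = [8.0, 9.0]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)        # optimal point and maximum objective value
```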
7.2 The Simplex Method: Basic Feasible Solutions and Tableaus
The section introduces the simplex method, an algorithm for solving linear programming problems. It introduces the concept of a simplex tableau, a tabular representation of the linear programming problem. Basic feasible solutions are easily identified in a simplex tableau by setting variables corresponding to non-simple columns to zero. Variables corresponding to simple columns are determined by non-negative entries in the rightmost column. The text explains how the entry in the bottom-right corner of a simplex tableau represents the value of the objective function for the obvious basic feasible solution. It emphasizes the process of pivoting—selecting a pivot element and performing row operations—to move from one basic feasible solution to another, iteratively improving the objective function value. The manipulation of the simplex tableau, including the identification of simple and non-simple columns, serves as the core of the algorithm, leading to the optimal solution. This systematic approach makes the simplex method suitable for solving high-dimensional problems where the graphical method is infeasible.
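A single pivot step is the heart of the tableau manipulation; the sketch below is an assumed example (the tableau layout and sign conventions may differ from the text's) showing how row operations turn the pivot column into a simple column and update the entry in the bottom-right corner.

```python
# Assumed illustrative example: one pivot step on a simplex tableau. The pivot column
# is chosen from the bottom (objective) row, the pivot row by the ratio test, and row
# operations make the pivot column simple.
import numpy as np

def pivot(T, prow, pcol):
    T = T.astype(float).copy()
    T[prow] /= T[prow, pcol]                 # leading 1 in the pivot position
    for i in range(T.shape[0]):
        if i != prow:
            T[i] -= T[i, pcol] * T[prow]     # zero out the rest of the pivot column
    return T

# Tableau for: maximize 2x + y subject to x + y <= 4, x <= 3 (slack variables s1, s2).
#              x     y    s1   s2   rhs
T = np.array([[1.0,  1.0, 1.0, 0.0, 4.0],
              [1.0,  0.0, 0.0, 1.0, 3.0],
              [-2.0, -1.0, 0.0, 0.0, 0.0]])   # bottom row: the objective
T1 = pivot(T, prow=1, pcol=0)   # bring x into the basis (ratio test: 3/1 < 4/1)
print(T1)
print("objective value so far:", T1[-1, -1])   # 6, the value of 2x + y at (3, 0)
```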
7.3 Finding a Basic Feasible Solution: The Method of Artificial Variables
This section addresses the challenge of finding an initial basic feasible solution, crucial for starting the simplex method. The method of artificial variables is introduced as a systematic approach to overcome this challenge. The method involves adding artificial variables to the constraints to ensure a readily available initial basic feasible solution. The text shows that finding a feasible solution to the original problem is equivalent to minimizing the sum of the artificial variables subject to the augmented constraints. This transformation allows the simplex method to begin with an easily found initial solution. The minimization of artificial variables provides a pathway to finding a basic feasible solution for the original linear programming problem, establishing a link between this method and the initial setup for the simplex method. This section makes linear programming practical in situations where an initial basic feasible solution is difficult to identify directly.
VIII. Applications of Eigenvalues and Advanced Theorems
This section explores applications of eigenvalues and eigenvectors, particularly in areas such as continuum mechanics and the deformation of materials. It discusses the deformation gradient, relating it to the concepts presented. The section concludes with advanced theorems (like Gerschgorin's theorem) providing insights into eigenvalue properties and the behavior of eigenvalues under small changes in the matrix. The continuity of eigenvalues is presented.
8.1 Applications of Eigenvalues and Eigenvectors in Mechanics
This section illustrates the application of eigenvalues and eigenvectors in continuum mechanics. It introduces the deformation gradient (F), a linear transformation describing the local deformation of a material. This deformation is analyzed by decomposing F into two parts: one representing rotation and the other representing stretching/compression. The matrix U, responsible for stretching and compression, is known as the right Cauchy-Green strain tensor. In this context, eigenvectors of U determine the principal directions—directions of maximum stretching or compression. The application to mechanics highlights the importance of eigenvectors in identifying principal directions within a deformed material, providing valuable insights into the material's behavior under stress. The text suggests that this decomposition of a matrix into components related to rotation and distortion is also applicable in geometric measure theory and the study of quadratic forms, emphasizing the wide reach of linear algebra principles across mathematical fields.
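Numerically, the principal directions and stretches can be read off from an eigendecomposition; the sketch below is an assumed example that forms the symmetric matrix C = FᵀF = U² for an illustrative deformation gradient F.

```python
# Assumed illustrative example: for a deformation gradient F, the symmetric positive
# definite matrix C = F^T F equals U^2, so the eigenvectors of C (equivalently of U)
# give the principal directions and the square roots of its eigenvalues give the
# principal stretches.
import numpy as np

F = np.array([[1.2, 0.3],
              [0.0, 0.9]])

C = F.T @ F
stretches_sq, directions = np.linalg.eigh(C)   # eigh: for symmetric matrices
print(np.sqrt(stretches_sq))                   # principal stretches (eigenvalues of U)
print(directions)                              # principal directions (columns)
```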
8.2 Eigenvalues and Systems of Differential Equations
This section demonstrates the application of eigenvalues and eigenvectors in solving systems of differential equations. It shows that eigenvectors corresponding to different eigenvalues can be used to construct solutions for systems of linear differential equations. By finding the eigenvalues and eigenvectors of the coefficient matrix, one can obtain closed-form solutions, described as oscillatory in nature, for the system. This method is presented as a way to obtain precise solutions using known functions, although it involves transforming a relatively easy problem in analysis (solving differential equations) into a more challenging algebraic problem (finding eigenvalues and eigenvectors). The application illustrates a specific method for constructing oscillatory solutions, emphasizing the utility of the eigenvalue problem in providing exact solutions within a specific context. This method hints at broader implications of linear algebra in analytical problem-solving across various domains.
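The construction described here, with each eigenpair (λ, v) contributing a solution e^(λt)v, can be checked on a small system; the example below is an assumed illustration in which purely imaginary eigenvalues produce the oscillatory behavior mentioned above.

```python
# Assumed illustrative example: solving x' = Ax by eigendecomposition. Each eigenpair
# (lambda, v) gives a solution exp(lambda*t) v, and a general solution is a linear
# combination of these.
import numpy as np

A = np.array([[0.0, 1.0],
              [-4.0, 0.0]])            # x'' = -4x rewritten as a first-order system

vals, vecs = np.linalg.eig(A)
print(vals)                            # +-2j: pure oscillation at angular frequency 2

def solution(t, c):
    # x(t) = sum_k c_k exp(lambda_k t) v_k, with c chosen to match the initial condition
    return sum(c[k] * np.exp(vals[k] * t) * vecs[:, k] for k in range(len(vals)))

c = np.linalg.solve(vecs, np.array([1.0, 0.0]))   # match x(0) = [1, 0]
print(solution(0.0, c).real)           # [1, 0]
print(solution(np.pi / 2, c).real)     # approximately [-1, 0]: cos(2t) at t = pi/2
```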
8.3 Advanced Theorems: Gerschgorin's Theorem and Eigenvalue Continuity
The section introduces advanced theorems related to eigenvalues. Gerschgorin's theorem provides a way to estimate the location of eigenvalues by using circular regions in the complex plane. The text points out that while direct calculation of eigenvalues might be feasible for small matrices, this approach becomes computationally impractical for large matrices, which highlights the importance of estimation techniques. Gerschgorin's theorem, therefore, acts as a tool for quickly estimating the range of eigenvalues. The section further delves into a discussion of eigenvalue continuity, exploring how eigenvalues behave as a function of continuous changes in the matrix elements. This theorem guarantees that small changes in a matrix result in small changes in the eigenvalues, provided certain conditions are met. This theorem implies that even if direct calculation of eigenvalues is computationally challenging, it’s still possible to determine with some accuracy the general location and behavior of the eigenvalues without performing the full computation, making it a crucial result for analysis.
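Gerschgorin discs are cheap to compute; the sketch below is an assumed example that lists the disc centers and radii for a small matrix and compares them with the actual eigenvalues.

```python
# Assumed illustrative example: by Gerschgorin's theorem, every eigenvalue of A lies
# in at least one disc centered at a diagonal entry a_ii with radius equal to the sum
# of the absolute values of the off-diagonal entries in row i.
import numpy as np

A = np.array([[4.0, 1.0, 0.5],
              [0.2, -3.0, 0.3],
              [0.1, 0.4, 1.0]])

centers = np.diag(A)
radii = np.sum(np.abs(A), axis=1) - np.abs(centers)
for c, r in zip(centers, radii):
    print(f"disc: center {c}, radius {r}")

print(np.linalg.eigvals(A))   # each eigenvalue falls inside one of the discs above
```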