The Linear Algebra View of Calculus: Taking a Derivative with a Matrix
September 4, 2019
Most people think of linear algebra as a tool for solving systems of linear equations. While it definitely helps with that, the theory of linear algebra goes much deeper, providing powerful insights into many other areas of math.
In this post I’ll explain a surprising application of linear algebra to another field of mathematics: calculus. I’ll show how the fundamental calculus operations of differentiation and integration can be understood as linear transformations. This is the “linear algebra” view of basic calculus.
Taking Derivatives as a Linear Transformation
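In linear algebra, the concept of a vector space is very general. For our purposes, any collection of objects that can be added together and multiplied by constants is a vector space as long as it follows two rules. (The formal definition also requires the usual laws of arithmetic, which hold automatically for the polynomials in this post.)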
The first rule is that if u and v are in the space, then u + v must also be in the space. Mathematicians call this “closed under addition.” Second, if u is in the space and c is a constant, then cu must also be in the space. This is known as “closed under scalar multiplication.” Any collection of objects that follows those two rules — they can be vectors, functions, matrices and more — qualifies as a vector space.
One of the more interesting vector spaces is the set of polynomials of degree less than or equal to n. This is the set of all functions that have the following form:
$$p(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n$$
where $a_0, \ldots, a_n$ are constants.
Is this really a vector space? To check, we can verify that it follows our two rules from above. First, if p(t) and q(t) are both polynomials of degree at most n, then p(t) + q(t) is also a polynomial of degree at most n, since adding never creates a higher power of t. That shows it’s closed under addition. Second, if p(t) is a polynomial of degree at most n, so is c times p(t), where c is a constant. That shows it’s closed under scalar multiplication. So the set of polynomials of degree at most n is indeed a vector space.
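One way to make the two closure rules concrete is to store a polynomial as its list of coefficients. The sketch below is only an illustration: it assumes polynomials of degree at most 2 are written as Python lists `[a0, a1, a2]`, and the helper names `add_poly` and `scale_poly` are made up for this example.

```python
def add_poly(p, q):
    """Add two polynomials given as coefficient lists [a0, a1, ..., an]."""
    # Both lists are assumed to have the same length n + 1.
    return [a + b for a, b in zip(p, q)]


def scale_poly(c, p):
    """Multiply a polynomial by the constant c."""
    return [c * a for a in p]


p = [1, 0, 2]   # 1 + 2t^2
q = [0, 3, -2]  # 3t - 2t^2

total = add_poly(p, q)     # 1 + 3t
scaled = scale_poly(5, p)  # 5 + 10t^2
print(total, scaled)
```

Note that the sum here has degree 1, not 2. That is why the space is defined as polynomials of degree at most n rather than exactly n.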
Now let’s think about calculus. One of the first methods we learn is taking derivatives of polynomials. It’s easy. If our polynomial is at^2 + 3t, then our first derivative is 2at + 3. The same power rule applies term by term to every polynomial. So the first derivative of a general polynomial p(t) of degree n is given by:
$$p'(t) = a_1 + 2a_2 t + 3a_3 t^2 + \cdots + n a_n t^{n-1}$$
The question is: is this also a vector space? To answer that, we check to see that it follows our two rules above. First, if we add any two derivatives together, the result will still be the derivative of some polynomial, because p'(t) + q'(t) is the derivative of p(t) + q(t). Second, if we multiply any derivative by a constant c, this will still be the derivative of some polynomial, because c times p'(t) is the derivative of c times p(t). So the set of first derivatives of polynomials is also a vector space.
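Now that we know polynomials and their first derivatives are both vector spaces, we can think of the operation “take the derivative” as a rule that maps “things in the first vector space” to “things in the second vector space.” This rule respects both vector space operations: the derivative of p(t) + q(t) is the derivative of p(t) plus the derivative of q(t), and the derivative of c times p(t) is c times the derivative of p(t). A map with those two properties is called a “linear transformation.” So taking the derivative of a polynomial is a linear transformation that maps one vector space (the set of all polynomials of degree at most n) into another vector space (the set of all first derivatives of polynomials of degree at most n).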
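A map is linear when it respects addition and scalar multiplication. The sketch below checks both properties for differentiation on two sample polynomials. It assumes the coefficient-list representation `[a0, a1, ..., an]`, and the function name `derivative` is illustrative. A check on examples is not a proof, but it shows what the two properties say.

```python
def derivative(p):
    """Differentiate a polynomial given as coefficients [a0, a1, ..., an]."""
    # The term a_k t^k becomes k * a_k t^(k-1); the constant term drops out.
    return [k * a for k, a in enumerate(p)][1:]


p = [4, 3, 2, 1]  # 4 + 3t + 2t^2 + t^3
q = [0, 1, 0, 5]  # t + 5t^3
c = 7

# Additivity: (p + q)' equals p' + q'.
sum_then_diff = derivative([a + b for a, b in zip(p, q)])
diff_then_sum = [a + b for a, b in zip(derivative(p), derivative(q))]

# Homogeneity: (c * p)' equals c * p'.
scale_then_diff = derivative([c * a for a in p])
diff_then_scale = [c * a for a in derivative(p)]

print(sum_then_diff == diff_then_sum, scale_then_diff == diff_then_scale)
```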
If we call the set of polynomials of degree at most n $P_n$, then the set of derivatives of this is $P_{n-1}$, since taking the first derivative will reduce the degree of each polynomial term by 1. Thus, the operation “take the derivative” is just a function that maps $P_n \to P_{n-1}$. A similar argument shows that “taking the integral” (with the constant of integration fixed at zero) is also a linear transformation in the opposite direction, from $P_{n-1}$ to $P_n$.
Once we realize that differentiation and integration from calculus are really just linear transformations, we can describe them using the tools of linear algebra.
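Here’s how we do that. To fully describe any linear transformation as a matrix multiplication in linear algebra, we follow three steps.
First, we find a basis for the domain of the transformation. That is, we find a set of elements from which every element of the domain can be built as a linear combination.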
Next, we feed each element of this basis through the linear transformation, and see what comes out the other side. That is, we apply the transformation to each element of the basis, which gives the “image” of each element under the transformation. Since every element of the domain is some combination of those basis elements, by running them through the transformation we can see the impact the transformation will have on any element in the domain.
Finally, we collect each of those resulting images into the columns of a matrix. That is, each time we run an element of the basis through the linear transformation, the output will be a vector (the “image” of the basis element). We then place these vectors into a matrix D, one in each column from left to right. That matrix D will fully represent our linear transformation.
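The three steps can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: vectors are plain Python lists, the domain uses the standard basis, and both `matrix_of` and the sample map `swap_and_double` are hypothetical names invented for this example.

```python
def matrix_of(transform, dim_in):
    """Build the matrix of a linear map from the images of the standard basis."""
    columns = []
    for j in range(dim_in):
        # Step 1: the j-th basis vector of the domain.
        e_j = [1 if i == j else 0 for i in range(dim_in)]
        # Step 2: its image under the transformation.
        columns.append(transform(e_j))
    # Step 3: place the images side by side as columns (stored here as rows).
    dim_out = len(columns[0])
    return [[col[i] for col in columns] for i in range(dim_out)]


def swap_and_double(v):
    """A sample linear map sending (x, y) to (2y, x)."""
    x, y = v
    return [2 * y, x]


M = matrix_of(swap_and_double, 2)
print(M)  # [[0, 2], [1, 0]]
```

The first column of `M` is the image of (1, 0) and the second is the image of (0, 1), which is the construction described above.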
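Let’s work through an example using polynomials of degree at most three. Every such polynomial has the form:
$$p(t) = a_0 + a_1 t + a_2 t^2 + a_3 t^3$$
where $a_0, \ldots, a_3$ are constants. When we apply our transformation, “take the derivative of this polynomial,” it will reduce the degree of each term in our polynomial by one. Thus, the transformation D will be a linear mapping from $P_3$ to $P_2$, which we write as $D: P_3 \to P_2$.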
To find the matrix representation for our transformation, we follow our three steps above: find a basis for the domain, apply the transformation to each basis element, and compile the resulting images into columns of a matrix.
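First we find a basis for $P_3$. The simplest basis is the following: 1, t, t^2, and t^3. All polynomials of degree at most three will be some linear combination of these four elements. In vector notation, each basis element is written as the list of its coefficients on 1, t, t^2, and t^3, so a basis for $P_3$ is given by:
$$1 \leftrightarrow (1, 0, 0, 0), \quad t \leftrightarrow (0, 1, 0, 0), \quad t^2 \leftrightarrow (0, 0, 1, 0), \quad t^3 \leftrightarrow (0, 0, 0, 1)$$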
Now that we have a basis for our domain $P_3$, the next step is to feed the elements of it into the linear transformation to see what it does to them. Our linear transformation is, “take the first derivative of the element.” So to find the “image” of each element, we just take the first derivative.
The first element of the basis is 1. The derivative of this is just zero. That is, the transformation D maps the vector (1, 0, 0, 0) to (0, 0, 0), where each output vector lists the coefficients on 1, t, and t^2. Our second element is t. The derivative of this is just one. So the transformation D maps our second basis vector (0, 1, 0, 0) to (1, 0, 0). Similarly for our third and fourth basis vectors: the derivative of t^2 is 2t, so the transformation maps (0, 0, 1, 0) to (0, 2, 0), and the derivative of t^3 is 3t^2, so it maps (0, 0, 0, 1) to (0, 0, 3).
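Applying our transformation to the four basis vectors, we get the following four images under D:
$$D(1) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \quad D(t) = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad D(t^2) = \begin{bmatrix} 0 \\ 2 \\ 0 \end{bmatrix}, \quad D(t^3) = \begin{bmatrix} 0 \\ 0 \\ 3 \end{bmatrix}$$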
Now that we’ve applied our linear transformation to each of our four basis vectors, we next collect the resulting images into the columns of a matrix. This is the matrix we’re looking for — it fully describes the action of differentiation for any third-degree polynomial in one simple matrix.
Collecting our four image vectors into a matrix, we have:
$$D = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}$$
This matrix gives the linear algebra view of differentiation from calculus. Using it, we can find the derivative of any polynomial of degree three by expressing it as a vector and multiplying by this matrix.
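For example, take the general polynomial $p(t) = a_0 + a_1 t + a_2 t^2 + a_3 t^3$, which we write as the vector $(a_0, a_1, a_2, a_3)$.
To find its derivative, we simply multiply this vector by our D matrix from above:
$$\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} a_1 \\ 2a_2 \\ 3a_3 \end{bmatrix}$$
This is the coefficient vector of $a_1 + 2a_2 t + 3a_3 t^2$, which is exactly the first derivative of our polynomial function!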
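The same multiplication can be tried on a concrete polynomial. The sketch below assumes coefficients are listed from the constant term upward, and the sample polynomial and the helper `mat_vec` are chosen here purely for illustration.

```python
# Matrix of d/dt from polynomials of degree <= 3 to polynomials of degree <= 2.
D = [
    [0, 1, 0, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 3],
]


def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


p = [5, -1, 4, 2]   # 5 - t + 4t^2 + 2t^3
dp = mat_vec(D, p)  # coefficients of the derivative
print(dp)           # [-1, 8, 6], i.e. -1 + 8t + 6t^2
```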
This is a powerful tool. By recognizing that differentiation is just a linear transformation (as is integration, which follows a similar argument that I’ll leave as an exercise), we can see it’s really just a rule that linearly maps functions in $P_n$ to functions in $P_{n-1}$.
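For readers who want to check their answer to the integration exercise, here is one possible sketch for the degree-three case. It assumes the constant of integration is fixed at zero, which is what makes integration linear. The names `S` and `mat_mul` are illustrative, and `Fraction` keeps the entries exact.

```python
from fractions import Fraction

# Integration from degree <= 2 to degree <= 3, with constant of integration 0:
# 1 -> t, t -> t^2 / 2, t^2 -> t^3 / 3. Each image is one column of S.
S = [
    [0, 0, 0],
    [1, 0, 0],
    [0, Fraction(1, 2), 0],
    [0, 0, Fraction(1, 3)],
]

# The differentiation matrix from degree <= 3 to degree <= 2.
D = [
    [0, 1, 0, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 3],
]


def mat_mul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


DS = mat_mul(D, S)  # integrate, then differentiate: the 3 x 3 identity
SD = mat_mul(S, D)  # differentiate, then integrate: the constant term is lost
print(DS)
print(SD)
```

Differentiating after integrating returns the original polynomial, but integrating after differentiating cannot recover the constant term, which is why the first row of `SD` is all zeros.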
In fact, all m x n matrices can be understood in this way. That is, an m x n matrix is just a linear mapping that sends vectors in $\mathbb{R}^n$ into $\mathbb{R}^m$. In the case of the example above, we have a 3 x 4 matrix that sends polynomials in $P_3$ (such as ax^3 + bx^2 + cx + d, which has four coefficients) into the space of first derivatives in $P_2$ (in this case, 3ax^2 + 2bx + c, which has three coefficients).
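The 3 x 4 example generalizes directly: differentiation on polynomials of degree at most n is an n x (n + 1) matrix. The sketch below builds it for any n, assuming coefficients are ordered from the constant term upward. The function name `derivative_matrix` is made up for this illustration.

```python
def derivative_matrix(n):
    """Return the n x (n + 1) matrix of d/dt on polynomials of degree <= n."""
    # Column k holds the image of t^k, namely k * t^(k-1),
    # so the only nonzero entry in column k is k, in row k - 1.
    return [
        [col if col == row + 1 else 0 for col in range(n + 1)]
        for row in range(n)
    ]


D3 = derivative_matrix(3)
print(D3)  # [[0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3]]
```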
For more on linear transformations, here’s a useful lecture from MIT’s Gilbert Strang.