Poor chensh, who told you to wander into this minefield?! No wonder you're dizzy!
Linear algebra courses, whether they begin with determinants or with matrices, are bewildering from the very first page. Take the Tongji University Linear Algebra textbook (now in its fourth edition), the most widely used text in general engineering programs in China. It opens with the inversion number of a permutation, a concept that seems to come from nowhere and lead nowhere, then uses it to give a thoroughly unintuitive definition of the determinant, followed by a string of fussy determinant properties and exercises: multiply this row by a coefficient, add it to that row, then subtract it from some column, and so on. Students of average aptitude like me get dizzy right about here: we don't even know what this thing is, and already we are made to jump through flaming hoops with it. What pointless acrobatics! So some people start skipping class, and more start copying homework. And that is only the setup, because what follows is a sharp turn: hot on the heels of the baffling determinant comes an equally baffling but genuinely great character. The matrix arrives! Many years later I realized what a tragic scene opened my mathematical career when the teacher blandly wrote a pile of numbers inside brackets and slowly announced, "This thing is called a matrix." From then on, this character has never been absent from anything even remotely connected with the word "learning". For a slow student like me who couldn't crack linear algebra on the first pass, the matrix kept showing up uninvited, and it regularly left me embarrassed and defeated. For a long time, whenever I met a matrix in my reading, I did what Ah Q did on seeing the Fake Foreign Devil: rubbed my forehead and took a detour.
Actually, I am not a special case. Engineering students generally find linear algebra hard the first time through, and this is common both in China and abroad. The Swedish mathematician Lars Gårding says in his well-known book Encounter with Mathematics: without familiarity with the concepts of linear algebra, anyone who wants to study the natural sciences today would be practically illiterate. However, "by the current international standard, linear algebra is presented axiomatically; it is a second-generation mathematical model, ..., and this creates difficulties for teaching." In other words, when we begin linear algebra we step, without noticing it, into the territory of the "second-generation mathematical model", where the modes of expression and abstraction of mathematics have undergone a wholesale evolution. Having grown up inside the "first-generation mathematical model", the concrete, application-oriented kind, we are asked to make a drastic paradigm shift without any explicit warning. It would be strange if we did not find it difficult.
Most engineering students do come to understand linear algebra and use it fluently after follow-up courses such as numerical analysis, mathematical programming, and matrix theory. Even so, many people who wield linear algebra skillfully as a tool for research and application remain unclear about the seemingly elementary questions that beginners of this course keep raising. For example:
* What is a matrix? A vector can be seen as a representation of an object with n independent attributes (dimensions); but what, then, is a matrix? If we regard a matrix as a new composite object built by stacking a set of column (or row) vectors, why does this particular expansion turn out to be so widely useful? Why, in particular, is the two-dimensional expansion so useful? If every element of a matrix were itself a vector, would expanding once more, into a three-dimensional cube, be even more useful?
* Why is matrix multiplication defined the way it is? Why can such a strange multiplication rule play such an enormous role in practice? Isn't it remarkable that so many seemingly unrelated problems all reduce, in the end, to matrix multiplication? Does the apparently inexplicable rule of matrix multiplication encode some essential law of the world? If so, what is that law?
* What exactly is a determinant? Why does it have such odd computation rules? What is the essential relationship between a determinant and its corresponding square matrix? Why does only a square matrix have a determinant, while a general matrix does not? (Don't dismiss the question as silly: if we wanted to, we could define a determinant for an m×n matrix; we don't, because it isn't needed, but why isn't it?) Moreover, the rules for computing a determinant seem to have no intuitive connection to any operation on the matrix, and yet the determinant governs the matrix's properties in so many ways. Is all of this mere coincidence?
* Why can matrices be computed in blocks? Block computation looks almost arbitrary; why does it work?
* For the transpose operation there is (AB)^T = B^T A^T, and for the inverse operation there is (AB)^-1 = B^-1 A^-1. Why do two seemingly unrelated operations share such similar properties? Is this just a coincidence?
* Why is the matrix P^-1 A P "similar" to the matrix A? What does "similar" mean here?
* What is the essence of eigenvalues and eigenvectors? Their definition is startling: in Ax = λx, the effect of a whole big matrix turns out to equal that of a single small number λ. Truly remarkable. But why call them "characteristic", or even "intrinsic"? What exactly do they capture?
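The reversal identities for transpose and inverse mentioned above can at least be checked numerically. Below is a minimal Python sketch (the matrices and helper functions are my own illustrative choices, not from the article) verifying both rules on a pair of 2×2 matrices:

```python
# Check (AB)^T = B^T A^T and (AB)^-1 = B^-1 A^-1 on small examples.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def inverse2(A):
    # inverse of a 2x2 matrix via the adjugate formula
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 2], [3, 4]]   # arbitrary invertible examples
B = [[0, 1], [5, 2]]

# transpose reverses the order of the factors
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))

# inversion reverses the order of the factors too
lhs = inverse2(matmul(A, B))
rhs = matmul(inverse2(B), inverse2(A))
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-12 for i in range(2) for j in range(2))
```

A check like this is not an explanation, of course; it only confirms that the coincidence is real and worth explaining.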
Questions like these often embarrass people who have used linear algebra for years. Like adults brushing off a child's question with "that's just how it is", many veterans can only answer "this is the rule; accept it and memorize it." But if such questions go unanswered, linear algebra remains for us a crude, unmotivated, inexplicable rulebook, and we feel we are not studying a science but have been dropped into some arbitrary compulsory world, driven forward under the whip of exams, unable to appreciate its beauty, harmony, and unity. Only years later do we discover how useful the stuff is, and still we are puzzled: how can everything be such a coincidence?
I believe this is the result of a lack of intuition in our linear algebra teaching. Questions of the form "how can it be" and "why should it be" cannot be settled for the questioner by pure mathematical proof. If, for example, you prove by the standard general argument that block computation of matrices is valid, you still have not resolved the questioner's doubt. Their real puzzlement is: why is block computation feasible at all? Is it a fluke, or is it forced by some property of the matrix as an object? If the latter, what are those properties? A moment's reflection on the questions above shows that none of them can be dispatched by proof alone. When, as in our textbooks, everything is settled by proof, students end up able to use the tool skillfully while lacking real understanding.
Since the rise of the Bourbaki school in France in the 1930s, the axiomatic, systematized presentation of mathematics has been enormously successful, and it greatly raised the rigor of the mathematics education we receive. But one controversial side effect of axiomatization has been the loss of intuition in mainstream mathematics teaching. Mathematicians seem to have decided that intuition and abstraction are at odds, and sacrificed the former. Many people, myself included, are skeptical of this. We believe intuition and abstraction need not conflict, least of all in mathematics education and in textbooks. Helping students build intuition helps them grasp the abstract concepts and, through them, the essence of mathematics. If instead we fixate on formal rigor alone, students become slaves to dreary rules, like lab mice forced to jump through flaming hoops.
I have mulled over these intuition problems of linear algebra, on and off, four or five times over the past two years and more, and along the way read several books on linear algebra, numerical analysis, algebra, and mathematics in general, Chinese and foreign, among them the famous Soviet work Mathematics: Its Content, Methods and Meaning, Professor Gong Sheng's Five Lectures on Linear Algebra, and the aforementioned Encounter with Mathematics. Even so, my understanding of the subject has gone through several rounds of self-repudiation. Some conclusions I once wrote up on my blog, for instance, now look basically wrong to me. So I have decided to record my current understanding in full. For one thing, I think it is now mature enough to put forward, for discussion or for correction. For another, if further insight later overturns it, the snapshot written now will still have been worth taking.
Since I plan to write quite a lot, I will do it slowly, in several installments. I don't know whether I will have the time to finish; if it gets interrupted, so be it.
---
Today, let's talk about how to understand several core concepts of linear spaces and matrices. Most of this is written from my own understanding, essentially nothing copied. There may well be mistakes, and I hope readers will point them out. What I am after is intuition: saying what the substantive problem behind the mathematics really is.
First, let's talk about space. This concept is one of the lifeblood ideas of modern mathematics. Starting from a topological space and adding definitions step by step, one obtains a whole family of spaces. A linear space is in fact still fairly elementary: equip it with a norm and it becomes a normed linear space; a normed linear space that is complete is a Banach space; define an angle in a normed linear space and you arrive at an inner product space; an inner product space that is complete is a Hilbert space.
In short, there are many kinds of space. Look at the mathematical definition of any of them and it runs roughly: "there is a set; define such-and-such a notion on the set; if certain properties hold, call it a space." That is a little odd. Why use the word "space" for collections like these? As you will see, it is actually quite reasonable.
The space most of us know best is, of course, the three-dimensional space we live in (in Newton's sense of absolute space and time). Mathematically it is a three-dimensional Euclidean space. What are the most basic features of this familiar space? Think it over and you will find that this three-dimensional space: 1. consists of many (in fact infinitely many) points; 2. has relative relations between those points; 3. allows length and angle to be defined in it; 4. accommodates motion. By motion we mean here the movement (transformation) from one point to another, not "continuous" motion in the sense of calculus.
Of these properties, the crucial one is the fourth. Properties 1 and 2 are merely the substrate of a space, not what makes it a space: any mathematical discussion needs a set, and usually some structure (relations) defined on it, and having those alone does not make a space. Property 3 is too special: other spaces need not have it, so it is certainly not the key property. Only the fourth is the essence of a space. That is, accommodating motion is the essential characteristic of space.
Once we recognize this, we can extend our picture of three-dimensional space to other spaces. Whatever the space, it must accommodate and support the motions (transformations) appropriate to it. You will notice that each kind of space comes with a corresponding kind of transformation: topological transformations in topological spaces, linear transformations in linear spaces, affine transformations in affine spaces. These transformations are simply the forms of motion the corresponding space permits.
So, for our purposes it is enough to know this: a "space" is a set of objects that accommodates motion, and a transformation specifies the motion of the corresponding space.
Now let's look at linear spaces. The definition of a linear space is in every textbook, but since a linear space is acknowledged to be a space, two basic questions must be settled first:
1. A space is a set of objects, and a linear space is a space, hence also a set of objects. So what kind of objects make up a linear space? Put differently, what do the objects in a linear space have in common?
2. How is motion in a linear space expressed? That is, how is a linear transformation represented?
Take the first question first. There is no need to beat around the bush; the answer can be given straight away: by choosing a basis and coordinates, any object in a linear space can be represented as a vector. I will skip the ordinary vector spaces and give two less common examples:
Example 1. All polynomials of degree at most n form a linear space; every object in this space is a polynomial. If we take 1, x, x^2, ..., x^n as a basis, then every such polynomial can be expressed as an (n+1)-dimensional vector, in which each component a_i is the coefficient of the term x^(i-1). It is worth noting that there are many ways to choose a basis; any linearly independent set of the right size will do. That uses a concept I will come to later, so I'll just mention it here and move on.
Example 2. All n-times continuously differentiable functions on the closed interval [a, b] form a linear space; every object in this space is a continuous function. By the Weierstrass approximation theorem, any such continuous function can be approximated uniformly, to any desired accuracy, by a polynomial (though possibly one of high degree), so this example can be brought back, in the limit, to Example 1. No need to repeat the details.
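Example 1 can be made concrete in a few lines of code. The sketch below (function names and sample values are my own illustrative choices) represents a polynomial of degree at most n as its coordinate vector with respect to the basis 1, x, ..., x^n, and shows that adding coordinate vectors matches adding polynomials:

```python
# A polynomial of degree <= n as a vector of coefficients in the
# basis 1, x, x^2, ..., x^n.

def eval_poly(coeffs, x):
    """Evaluate the polynomial whose coordinate vector is `coeffs`
    (coeffs[i] is the coefficient of x**i) at the point x."""
    return sum(a * x**i for i, a in enumerate(coeffs))

# p(x) = 3 + 2x + x^2 is the vector (3, 2, 1) in this basis
p = [3, 2, 1]
assert eval_poly(p, 2) == 3 + 2*2 + 2**2

# addition of coordinate vectors corresponds to addition of polynomials
q = [1, 0, 4]   # q(x) = 1 + 4x^2
s = [a + b for a, b in zip(p, q)]
assert eval_poly(s, 5) == eval_poly(p, 5) + eval_poly(q, 5)
```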
Vectors, then, are remarkably powerful: once a suitable basis is found, any object in a linear space can be represented by a vector. And there is a deep point here: a vector looks like a mere string of numbers, but because the string is ordered, each number carries not only its own value but also the information of its position. Why are arrays at once the simplest and the most powerful structure in programming? This is the root cause. But that is another story, so I'll leave it here.
Now for the second question. Its answer touches one of the most fundamental issues in linear algebra.
Motion in a linear space is called a linear transformation. That is, you can move from one point of a linear space to any other point by a linear transformation. So how are linear transformations represented? Very interestingly, once you choose a basis in a linear space, you can not only describe any object in the space with a vector, you can also describe any motion (transformation) in the space with a matrix. And the way to make an object undergo the corresponding motion is to multiply the matrix that represents the motion by the vector that represents the object.
In short, once a basis is chosen in a linear space, vectors describe objects, matrices describe motions of objects, and a motion is applied by multiplying the matrix with the vector.
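This "matrix times vector applies the motion" picture can be sketched directly. The rotation matrix and the sample point below are my own illustrative choices:

```python
# Applying a motion = multiplying the motion's matrix by the object's
# coordinate vector. Here the motion is a 90-degree rotation of the plane.

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

R = [[0, -1],
     [1,  0]]      # rotate counterclockwise by 90 degrees

point = [1, 0]     # a point on the x-axis
moved = apply(R, point)
assert moved == [0, 1]   # the point has moved to the y-axis
```

The matrix R is the motion itself, in coordinate form; the vector is the object; the multiplication is the act of moving it.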
Yes, the essence of a matrix is that it is a description of motion. If anyone ever asks you what a matrix is, you can tell them, loud and clear: the essence of a matrix is a description of motion. (chensh, I'm talking about you!)
But how interesting: can't a vector itself be regarded as an n×1 matrix? It really is remarkable that the objects of a space and the motions in it can be expressed in essentially the same way. Is this a coincidence? If it is, it is an extraordinarily lucky one! Most of the wonderful properties in linear algebra are directly related to this coincidence.
In the previous installment I said that "a matrix is a description of motion", and so far nobody has objected. But I expect that sooner or later some reader from a mathematics department will call me on it, because in mathematics and physics the concept of motion is tied to calculus. When we study calculus, someone always recites the standard line: elementary mathematics studies constants, the mathematics of the static; higher mathematics studies variables, the mathematics of motion. The sentence has been passed along until nearly everyone knows it, though few seem to know what it really means. In brief: in human experience, motion is a continuous process. To go from point A to point B, even light, the fastest thing there is, needs time to traverse the path between them point by point, and this brings in the notion of continuity. And continuous things cannot be explained at all without the concept of a limit. The ancient Greeks were strong mathematicians, but they lacked the limit concept and so could not account for motion; they were famously tied in knots by Zeno's paradoxes (the arrow that does not move; swift Achilles never catching the tortoise; and so on). Since this article is not about calculus, I'll say no more; interested readers can look at Professor Qi's Revisiting Calculus. Only after reading the opening of that book did I understand the real content of the sentence "higher mathematics is the mathematics of motion."
In my Understanding the Matrix articles, however, the concept of "motion" is not the continuous motion of calculus but an instantaneous change of state: at one moment the object is at point A, and after a "motion" it appears suddenly at point B without passing through any point in between. Such "motion", or rather "jumping", runs against our daily experience. But anyone who knows a little quantum physics will point out at once that quantum transitions between energy levels, of an electron, say, happen exactly like this, instantaneously. So such phenomena do exist in nature; we simply cannot observe them macroscopically. In any case, the word "motion" invites misunderstanding here; "transition" is more accurate. So the sentence can be restated as:
A matrix is a description of a transition in a linear space.
But this sounds too physical, which is to say too concrete; it is not mathematical, not abstract enough. So in the end we use the genuine mathematical term, transformation, for this business. With that said, it should be clear that a so-called transformation is just the transition, in a space, from one point (element, object) to another point (element, object). A topological transformation, for example, is a transition from one point to another in a topological space; an affine transformation, likewise, is a transition from one point to another in an affine space. Incidentally, affine spaces are brothers of vector spaces. Anyone who works in computer graphics knows that although a three-dimensional object needs only three-dimensional vectors to describe it, all graphics transformation matrices are 4×4. As to why, many books say "for convenience", which in my view is simply dodging the question. The real reason is that the transformations applied in computer graphics actually take place in an affine space, not a vector space. Think about it: in a vector space, a vector remains the same vector after a parallel translation, but in the real world two parallel segments of equal length cannot be regarded as the same thing. So the space computer graphics lives in is actually an affine space, and the matrix representation of an affine transformation is, fundamentally, 4×4. Once again, interested readers can consult Geometric Tools for Computer Graphics.
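The 4×4 point can be illustrated concretely: a translation of 3D space is not a linear map on 3-vectors, but it becomes one on homogeneous 4-vectors (x, y, z, 1). A minimal sketch, with an arbitrarily chosen translation:

```python
# Why graphics uses 4x4 matrices: translation cannot be expressed as a
# 3x3 matrix acting on (x, y, z), but it can as a 4x4 matrix acting on
# homogeneous coordinates (x, y, z, 1).

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

# translate by (5, 0, -2)
T = [[1, 0, 0,  5],
     [0, 1, 0,  0],
     [0, 0, 1, -2],
     [0, 0, 0,  1]]

p = [1, 2, 3, 1]          # the 3D point (1, 2, 3) in homogeneous form
assert apply(T, p) == [6, 2, 1, 1]
```

A pure 3×3 matrix always fixes the origin, so it can rotate, scale, or shear but never translate; the extra row and column are what let affine motions ride along.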
Once we understand the concept of "transformation", the definition of a matrix becomes:
"A matrix is a description of a transformation in a linear space."
So far we have finally arrived at a definition that looks properly mathematical. But a few more words are needed. Textbooks generally say that, once a basis is chosen, a linear transformation T on a linear space V can be represented by a matrix. So we need to get clear on what a linear transformation is, what a basis is, and what it means to choose a basis. The definition of a linear transformation is simple: given a transformation T, if for any two objects x and y in the linear space V and any real numbers a and b we have:
T(ax + by) = aT(x) + bT(y),
then T is called a linear transformation.
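The linearity condition can be checked mechanically for any map of the form T(v) = Mv. A small sketch with an arbitrary sample matrix and sample inputs (all values are my own illustrative choices):

```python
# Checking T(ax + by) = aT(x) + bT(y) for the map T(v) = Mv.

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

M = [[2, 1],
     [0, 3]]                      # an arbitrary example matrix

def T(v):
    return apply(M, v)

x, y = [1, 2], [3, -1]
a, b = 4, -2

lhs = T([a * xi + b * yi for xi, yi in zip(x, y)])          # T(ax + by)
rhs = [a * ti + b * si for ti, si in zip(T(x), T(y))]       # aT(x) + bT(y)
assert lhs == rhs
```

Two sample inputs are of course no proof; the identity holds for every matrix map because matrix multiplication distributes over vector addition and commutes with scalar multiplication.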
The definition is written like that, but the definition alone gives no intuitive sense of what kind of transformation a linear transformation is. We just said that a transformation is a transition from one point of a space to another; a linear transformation, then, is the movement from a point of one linear space V to a point of another linear space W. Implicit here is that a point can be transformed not only to a point of the same linear space but also to a point of a different linear space. And however you transform, as long as the objects before and after the transformation are objects of linear spaces, the transformation must be linear and can be described by a nonsingular matrix; conversely, a transformation described by a nonsingular matrix must be a linear transformation. Some readers may ask: why insist on nonsingular matrices here? Nonsingularity only makes sense for square matrices, so what about matrices that are not square? Answering that properly would take a while: one would have to treat a linear transformation as a mapping, discuss its mapping properties, and bring in the kernel and image of a linear transformation to settle it completely. I don't think that is the main thread here; if I find the time I will write it up later. Below we discuss only the most commonly used and most useful case: linear transformations within one and the same linear space. That is, unless otherwise stated, every matrix below is a square matrix, indeed a nonsingular square matrix. In learning a subject, the important thing is to grasp the main body and quickly build an overall picture; there is no need to entangle oneself in every detail and special case at the outset.
Next, then: what is a basis? This will be discussed in a later installment; here it is enough to regard a basis as a coordinate system in the linear space. Note: a coordinate system, not coordinate values; the two are a "unity of opposites". With that, "choosing a basis" simply means choosing a coordinate system in the linear space. That is all.
OK, finally, we refine the definition of a matrix as follows:
"A matrix is a description of a linear transformation in a linear space. In a linear space, as soon as we choose a basis, any linear transformation can be described by a definite matrix."
The key to understanding this sentence is to distinguish between "a linear transformation" and "a description of a linear transformation". One is the object; the other is an expression of that object. It is just like object-oriented programming, where an object can have several references, each known by a different name, all pointing to the same object. If that image doesn't work for you, here is a cruder analogy.
Suppose there is a pig and you want to photograph it. Once you choose a camera position, you can take a picture of the pig. The photo can be regarded as a description of the pig, but only a partial one, because photographing the pig from a different position yields a different photo, which is another partial description of the pig. All the photos so taken describe the same pig, yet none of them is the pig itself.
In the same way, for a given linear transformation, choosing a basis yields a matrix that describes it; change the basis and you get a different matrix. All these matrices are descriptions of the same linear transformation, but none of them is the linear transformation itself.
But then the question arises: if you hand me two photos of pigs, how do I know whether they show the same pig? Likewise, if you hand me two matrices, how do I know whether they describe the same linear transformation? Two matrices that are different descriptions of the same linear transformation are, so to speak, brothers; it would be a joke if they failed to recognize each other on meeting.
Fortunately, we can find a property shared by all the matrix brothers of one linear transformation, namely:
If matrices A and B are two different descriptions of the same linear transformation (different because different bases, that is, different coordinate systems, were chosen), then there exists a nonsingular matrix P such that A and B satisfy:
A = P^-1 B P
Readers familiar with linear algebra will recognize this at a glance: it is the definition of similar matrices. Yes, so-called similar matrices are just different description matrices of the same linear transformation. On this definition, photos of the same pig from different angles are "similar photos" too. A bit vulgar, but it gets the idea across.
In the formula above, the matrix P is precisely the transformation between the basis underlying matrix A and the basis underlying matrix B. This conclusion can be proved in a very intuitive way (rather than in the formal style of the usual textbooks); if I have time, I will add the proof to my blog later.
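The claim can be illustrated numerically: transforming a point directly with A = P^-1 B P gives the same result as converting its coordinates with P, transforming with B in the other basis, and converting back. The matrices below are my own illustrative choices:

```python
# If B describes a transformation in basis 1 and P converts basis-2
# coordinates into basis-1 coordinates, then A = P^-1 B P describes the
# same transformation in basis 2.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def inverse2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

B = [[2, 0],
     [0, 3]]          # the transformation, described in basis 1
P = [[1, 1],
     [0, 1]]          # change of coordinates: basis 2 -> basis 1

A = matmul(inverse2(P), matmul(B, P))   # the same transformation in basis 2

v = [1.0, 2.0]        # a point, written in basis-2 coordinates
direct = apply(A, v)                                     # transform in basis 2
roundabout = apply(inverse2(P), apply(B, apply(P, v)))   # detour via basis 1
assert all(abs(d - r) < 1e-12 for d, r in zip(direct, roundabout))
```

The two routes agree by construction; that agreement is exactly what "A and B describe the same linear transformation" means in coordinates.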
This discovery matters enormously. So a whole family of similar matrices are all descriptions of one and the same linear transformation! No wonder similarity is so important! Engineering graduate courses such as matrix theory and matrix analysis are full of similarity transformations, reduction to canonical forms, diagonalization, and so on, and the matrix obtained after each transformation is required to be similar to the one before. Why insist on that? Because only then do the two matrices, before and after the transformation, describe the same linear transformation. Of course, different description matrices of one linear transformation are not equally good in practical terms: some are far better behaved than others. That's easy to accept: photos of the same pig can be flattering or ugly. Similarity transformations let us turn an ugly matrix into a pretty one while guaranteeing that both describe the same linear transformation.
With this, the matrix as a description of a linear transformation is basically sorted out. But things are not that simple; rather, linear algebra holds a property still more wonderful: a matrix can serve not only as a description of a linear transformation but also as a description of a basis. As a transformation matrix it can not only carry a point of a linear space to another point, but also carry one coordinate system (basis) of the space to another coordinate system (basis). Moreover, transforming points and transforming coordinate systems come to the same thing. This is the most interesting mystery of linear algebra; understand it, and a great deal of linear algebra becomes clear.
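This dual reading, the matrix as a record of where the basis goes, can be seen in one small computation: applying M to the i-th standard basis vector yields the i-th column of M, so the columns of M are exactly the image of the old coordinate system. A sketch (matrix chosen for illustration):

```python
# The same matrix M that moves points also records where it sends the
# coordinate system: M applied to the i-th standard basis vector is the
# i-th column of M.

def apply(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

M = [[2, 1],
     [0, 3]]

e1, e2 = [1, 0], [0, 1]                 # the standard basis
assert apply(M, e1) == [2, 0]           # first column of M
assert apply(M, e2) == [1, 3]           # second column of M
```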