In this section we begin the material from Chapters 3 and 4 of Artin.
We have defined the notion of a binary operation on a set X; this is a function from $X \times X$ to $X$. There is another kind of operation that occurs frequently, where we combine elements from sets X and Y and obtain an element of Y. The classic example, which is the model for our definition below, is the multiplication of a vector by a scalar. In this case, we take a number $c$ and a vector $v$, and we obtain a new vector $cv$. An operation of this sort is a function $X \times Y \to Y$. Artin calls this an external law of composition. We will call it an operation of X on Y or an action of X on Y, and we will adopt the convention of writing the result of applying the function to (x,y) as either x.y or simply xy.
Let R be a ring and let (M,+) be an Abelian group. We say M is a (left) module over R (or an R-module) if there is an operation of R on M satisfying the following conditions for all $r, s \in R$ and $m, n \in M$: (1) $(rs)m = r(sm)$; (2) $(r+s)m = rm + sm$; (3) $r(m+n) = rm + rn$. The first condition is an associative law, while the final two conditions are distributive laws.
We will give some general examples of modules for rings which are not necessarily fields, but after that we will only consider vector spaces. When R is not a field, the theory of modules is quite a bit more complicated.
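The most basic example of this kind: every Abelian group is a module over the ring $\mathbb{Z}$, with $n \cdot m$ the $n$-fold sum of $m$ with itself. A minimal sketch of this action (the function names and the way we pass in the group operations are our own choices, not Artin's):

```python
def zmod_scale(n, m, add, zero, neg):
    """The action of the ring Z on an abelian group: n . m is the n-fold
    sum m + m + ... + m, with negation handling n < 0.  This single
    operation makes any abelian group a Z-module."""
    if n < 0:
        return zmod_scale(-n, neg(m), add, zero, neg)
    result = zero
    for _ in range(n):
        result = add(result, m)
    return result

# The group (Z/6, +) as a Z-module: 4 . 5 = 5 + 5 + 5 + 5 = 20 = 2 (mod 6)
add6 = lambda a, b: (a + b) % 6
assert zmod_scale(4, 5, add6, 0, lambda a: (-a) % 6) == 2
```

The module axioms for this action reduce to ordinary laws of arithmetic in the group, which is why the action is unique.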
Let's record some expected, and easily proven, properties of modules.
Let M be an R-module. Then for all $r \in R$ and $m \in M$, we have the following.
Let F be a field and let V be a vector space over F.
(2) and (3) are corollaries of (1) -- Exercise.
We will be interested in a much more general kind of cancellation. We say $v_1, \dots, v_n \in V$ are linearly independent if whenever $a_1v_1 + \cdots + a_nv_n = 0_V$ for $a_1, \dots, a_n \in F$, we have $a_i = 0$ for each $i$. We say $v_1, \dots, v_n$ are linearly dependent if they are not linearly independent, that is, if there exist $a_1, \dots, a_n \in F$, not all of which are $0$, such that $a_1v_1 + \cdots + a_nv_n = 0_V$.
Lemma 7.3 says that any single nonzero $v \in V$ is linearly independent. If we consider the ordinary plane $\mathbb{R}^2$, we see that two vectors v, w are linearly independent iff they do not lie on the same line through $0$. However, any three vectors in $\mathbb{R}^2$ are linearly dependent. The reason for this last claim is that if two vectors v, w in $\mathbb{R}^2$ are independent, then any vector $u \in \mathbb{R}^2$ can be written as u = av + bw for some $a, b \in \mathbb{R}$, whence av + bw + (-1)u = 0. Thus the condition of linear independence is related to another condition, that of expressing other vectors as a linear combination of the given vectors. Our goal in this section is to explore this link. This will lead us to the notion of a basis and the dimension of a vector space.
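In $\mathbb{R}^2$ the two-vector case can be checked mechanically: v and w are independent exactly when the determinant of the matrix with columns v and w is nonzero. A quick sketch (the function name is ours):

```python
def independent_2d(v, w):
    """Vectors v = (a, c) and w = (b, d) in R^2 are linearly independent
    exactly when the determinant ad - bc is nonzero, i.e. when they do
    not lie on a common line through the origin."""
    a, c = v
    b, d = w
    return a * d - b * c != 0

assert not independent_2d((1, 2), (2, 4))   # (2, 4) = 2 . (1, 2): dependent
assert independent_2d((1, 0), (1, 1))       # not collinear: independent
```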
Before we give the necessary formal definitions, let us consider the ideas discussed in the last paragraph. When we deal with ordinary n-space $\mathbb{R}^n$, it is usually crucial for us to know we have a set of co-ordinate axes. For example, in three dimensions, we express points, functions, and so on, in terms of the co-ordinates (x,y,z), which in turn are defined in terms of the x,y,z-axes. Every point has a unique set of co-ordinates relative to these axes. If $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are the usual unit vectors, then the point (x,y,z) corresponds to the vector $x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$.
A basis of a vector space V can be thought of as the same thing -- a set of vectors that define co-ordinate axes, such that every vector can be written as a unique combination of the basis vectors.
If $v_1, \dots, v_n$ are elements of the vector space V over the field F, a linear combination of $v_1, \dots, v_n$ is a sum of the form $a_1v_1 + \cdots + a_nv_n$ for some $a_1, \dots, a_n \in F$. We will be ambiguous here: the term linear combination can refer either to the sum or to the vector that is the result of that sum. Thus we will refer to $a_1, \dots, a_n$ as the coefficients in the linear combination, even though different coefficients could yield the same vector. We leave it to the reader to resolve our ambiguity in any given situation.
A trivial linear combination is one in which every coefficient is $0$. We say $v_1, \dots, v_n$ are linearly independent if the only linear combination yielding $0_V$ is the trivial one. (This is the same as the definition given earlier in this section.) We say $v_1, \dots, v_n$ span V if every element of V can be written as a linear combination of $v_1, \dots, v_n$. We say $v_1, \dots, v_n$ form a basis of V if every element of V can be written uniquely as a linear combination of $v_1, \dots, v_n$. (This means every $v \in V$ can be written $v = a_1v_1 + \cdots + a_nv_n$ for some $a_1, \dots, a_n \in F$, and the n-tuple $(a_1, \dots, a_n)$ is unique.)
Thus a basis is a set of elements of V that can serve as a set of co-ordinate axes.
Let $v_1, \dots, v_n \in V$. Then $v_1, \dots, v_n$ form a basis for V if and only if they are linearly independent and span V. Proof. Suppose $v_1, \dots, v_n$ form a basis. Clearly they span V. We always have $0v_1 + \cdots + 0v_n = 0_V$, so if $a_1v_1 + \cdots + a_nv_n = 0_V$, then $a_i = 0$ for all i by uniqueness. Thus $v_1, \dots, v_n$ are linearly independent.
Conversely, if $v_1, \dots, v_n$ span V, any $v \in V$ can be written $v = a_1v_1 + \cdots + a_nv_n$ for some scalars $a_i$. If also $v = b_1v_1 + \cdots + b_nv_n$, then $(a_1 - b_1)v_1 + \cdots + (a_n - b_n)v_n = 0_V$. Hence if $v_1, \dots, v_n$ are linearly independent, we conclude that $a_i = b_i$ for each i. This proves $v_1, \dots, v_n$ form a basis.
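To see the uniqueness of co-ordinates concretely, here is a sketch computing co-ordinates in $\mathbb{R}^2$ relative to the basis $(1,1), (1,-1)$ (a basis we picked for illustration; the formulas come from solving the two-by-two system $v = a(1,1) + c(1,-1)$ by hand):

```python
from fractions import Fraction

def coords(v):
    """Coordinates of v in R^2 relative to the basis b1 = (1, 1),
    b2 = (1, -1): solving v = a*b1 + c*b2 gives a = (x + y)/2 and
    c = (x - y)/2."""
    x, y = Fraction(v[0]), Fraction(v[1])
    return ((x + y) / 2, (x - y) / 2)

a, c = coords((3, 1))
# Reconstructing a*b1 + c*b2 recovers v; by the lemma, (a, c) is the
# only pair of coefficients that does so.
assert (a + c, a - c) == (3, 1)
```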
Remark 3287 (2)
Above we have applied the terms linearly independent, span, and basis to a group of objects, and so we have used plural language (``are'', ``span'', ``form''). We frequently think of $v_1, \dots, v_n$ as the set $X = \{v_1, \dots, v_n\}$, in which case we use the singular: X is linearly independent, X spans V, X is a basis. In what follows we will mix these modes of usage.
There is still another way in which we regard a basis. If our goal is to put a co-ordinate system on V, then we will presumably associate the n-tuple $(a_1, \dots, a_n)$ to the vector $a_1v_1 + \cdots + a_nv_n$. This implies an ordering. Thus when we wish to use explicit co-ordinates, we need to speak of an ordered basis, which is an n-tuple $(v_1, \dots, v_n)$ whose entries form a basis. As usual, we generally leave it to the reader to figure out what we are talking about at any given moment!
Once we decide to apply the terms ``linearly independent'', ``span'', and ``basis'' to sets, it becomes natural to allow infinite sets, and hence to modify the definitions slightly. Thus if X is a set, a linear combination of elements of X is a sum $a_1x_1 + \cdots + a_nx_n$ (or its result) for some finite collection $x_1, \dots, x_n$ of distinct elements of X and some $a_1, \dots, a_n \in F$. Linear independence, spanning, and basis are defined solely in terms of such finite linear combinations.
Our goal for the rest of this section is straightforward. We wish to show that every vector space has a basis, and that any two bases have the same number of elements. This common number will be called the dimension of the vector space.
In pursuit of this goal, it is convenient to introduce other notions, which fortuitously are natural and useful in their own right.
If $W \subseteq V$, we say W is a subspace of V if it is a subgroup under + (i.e., W is closed under + and - and contains $0_V$) and it is closed under scalar multiplication, i.e., $a \in F$ and $w \in W$ implies $aw \in W$.
Let $W \subseteq V$. Then W is a subspace of V if and only if (a) $0_V \in W$; (b) if $w_1, w_2 \in W$, then $w_1 + w_2 \in W$; and (c) if $a \in F$ and $w \in W$, then $aw \in W$. Proof. All that must be shown is that if $w \in W$, then $-w \in W$. We leave this as an exercise. If V is the plane $\mathbb{R}^2$, then the subspaces of V are precisely $\{0\}$, V, and all of the lines through the origin. It is obvious that these are subspaces. It is also obvious that if a subspace contains a vector v, then it contains the line through v and the origin. Thus all that remains is to convince yourself that if a subspace contains two vectors that do not lie on the same line through the origin, it contains the whole plane.
In general, a subspace of the n-dimensional vector space $\mathbb{R}^n$ is a flat or linear space, containing the origin, of smaller dimension. In fact, let us define $X \subseteq \mathbb{R}^n$ to be a linear subset if X contains the entire line through any two points of X. The subspaces of $\mathbb{R}^n$ are precisely the linear subsets containing the origin. We will say more about this later.
Let $X \subseteq V$. The span of X is defined to be the set of all linear combinations of finitely many elements from X. Thus
$$\operatorname{span}(X) = \{\, a_1x_1 + \cdots + a_nx_n : n \ge 0,\ x_1, \dots, x_n \in X,\ a_1, \dots, a_n \in F \,\}.$$
In this last equation, we allowed the possibility that n = 0. We will make the convention that a sum of no terms is $0_V$. This convention is only necessary to deal with the case $X = \emptyset$, and so our convention amounts to defining $\operatorname{span}(\emptyset) = \{0_V\}$.
Let V be a vector space over F and let $X \subseteq V$. Then $\operatorname{span}(X)$ is the smallest subspace of V containing X. Proof. We need to show two things: first, that $\operatorname{span}(X)$ is a subspace of V (plainly $X \subseteq \operatorname{span}(X)$), and second, that if W is a subspace of V containing X, then $\operatorname{span}(X) \subseteq W$.
We leave both as exercises. If W is a subspace of V and $X \subseteq W$ and $\operatorname{span}(X) = W$, then we say X spans W.
In terms of our vector analogy, the subspace spanned by a set of vectors is the smallest linear space through the origin containing all the vectors.
The following lemma links linear independence and spanning.
Let V be a vector space over F, let $X \subseteq V$ be linearly independent, and let $y \in V$ with $y \notin X$. Then the following statements are true. (1) If $y \in \operatorname{span}(X)$, then $X \cup \{y\}$ is linearly dependent. (2) If $X \cup \{y\}$ is linearly dependent, then $y \in \operatorname{span}(X)$.
First, suppose $y \in \operatorname{span}(X)$. Then we have $y = a_1x_1 + \cdots + a_nx_n$ for some $x_1, \dots, x_n \in X$, $a_1, \dots, a_n \in F$. Thus $a_1x_1 + \cdots + a_nx_n + (-1)y = 0_V$ is a nontrivial linear combination from $X \cup \{y\}$ that yields $0_V$, so $X \cup \{y\}$ is linearly dependent.
Next suppose $X \cup \{y\}$ is linearly dependent, i.e., suppose that there is a non-trivial linear combination of elements from $X \cup \{y\}$ that equals $0_V$. Since X is linearly independent, this combination must involve y in a non-trivial way. That is, there must exist $x_1, \dots, x_n \in X$ and $a_1, \dots, a_n, b \in F$ with $b \ne 0$ such that $a_1x_1 + \cdots + a_nx_n + by = 0_V$. We can solve this equation for y and we find $y = -b^{-1}(a_1x_1 + \cdots + a_nx_n) \in \operatorname{span}(X)$.
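Lemma 7.9 gives a computable membership test: over $\mathbb{Q}$, $y$ lies in $\operatorname{span}(X)$ exactly when adjoining $y$ to $X$ does not raise the rank. A sketch using exact Fraction arithmetic (the helper names are ours):

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a finite list of vectors over Q, computed by Gaussian
    elimination with exact Fraction arithmetic."""
    m = [[Fraction(a) for a in v] for v in vectors]
    r = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            if m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def in_span(y, X):
    """y lies in span(X) iff adjoining y does not increase the rank,
    i.e. iff X together with y is linearly dependent (Lemma 7.9)."""
    return rank(list(X) + [y]) == rank(list(X))

X = [(1, 0, 1), (0, 1, 1)]
assert in_span((2, 3, 5), X)       # (2,3,5) = 2(1,0,1) + 3(0,1,1)
assert not in_span((0, 0, 1), X)
```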
We are now ready to pursue our goal of showing bases exist and the cardinality of a basis of V is uniquely determined by V. We begin with one of the key results.
Let V be a vector space over a field F and let $X, Y \subseteq V$. If X is linearly independent and Y spans V, then there is a subset Y' of Y such that $X \cup Y'$ is a basis of V. Proof. Assume Y is finite: we will discuss the case where Y is infinite in an appendix to this section.
Choose a subset Y' of Y with the following two properties: (1) $X \cup Y'$ is linearly independent, and (2) Y' is the largest subset of Y satisfying condition (1). Note that $\emptyset$ satisfies condition (1), so there are subsets of Y satisfying (1). Since Y is finite, there is a largest such subset. (It is here where we have to use more advanced techniques if Y is infinite.)
We claim $X' = X \cup Y'$ is a basis of V. It is linearly independent by definition, so we must show X' spans V. We will first show that $Y \subseteq \operatorname{span}(X')$.
Let $y \in Y$ and suppose $y \notin \operatorname{span}(X')$. Then $y \notin X'$, and by Lemma 7.9, $X' \cup \{y\}$ is linearly independent. But if we set $Y'' = Y' \cup \{y\}$, we have $Y'' \subseteq Y$ and $X \cup Y''$ linearly independent. This contradicts our choice of Y'. It follows that $Y \subseteq \operatorname{span}(X')$.
Now $\operatorname{span}(X')$ is a subspace containing Y, so by Lemma 7.8, $\operatorname{span}(X')$ contains the subspace $\operatorname{span}(Y) = V$. Thus X' spans V, and we have proven that X' is a basis of V.
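When Y is a finite spanning set and $X = \emptyset$, the proof of Theorem 7.10 amounts to a greedy algorithm: walk through Y and keep each vector that is independent of those kept so far. A sketch over $\mathbb{Q}$ (helper names ours; rank is computed by exact Gaussian elimination):

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a finite list of vectors over Q, by exact Gaussian elimination."""
    m = [[Fraction(a) for a in v] for v in vectors]
    r = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(r + 1, len(m)):
            if m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def extract_basis(Y):
    """Keep each vector of Y that is independent of the vectors kept so
    far; the result is a maximal independent subset of Y, hence a basis
    of span(Y)."""
    basis = []
    for v in Y:
        if rank(basis + [v]) == len(basis) + 1:
            basis.append(v)
    return basis

# Y spans the plane z = 0 inside Q^3; the greedy pass keeps two vectors.
Y = [(1, 0, 0), (2, 0, 0), (0, 1, 0), (1, 1, 0)]
assert extract_basis(Y) == [(1, 0, 0), (0, 1, 0)]
```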
Let V be a vector space. Then any linearly independent subset of V can be expanded to a basis, and any subset that spans V can be contracted to a basis. Proof. If X is linearly independent, we can take Y = V and apply Theorem 7.10.
If Y spans V, we can take $X = \emptyset$ and apply Theorem 7.10.
Remark 3437 (2)
There is one problem with the proof of the preceding corollary, and hence with the proof of the next corollary. In the proof, we used the set V as a spanning set, but V is likely to be infinite. Thus we are forced to confront the ``infinite case'' we tried to avoid. One way to avoid this problem is to prove all results only for finitely spanned vector spaces, that is, vector spaces which have a finite spanning set. (Such vector spaces are precisely the finite-dimensional vector spaces.) The reader may either make this restriction or read the appendix to this section, where the ``infinite'' problem is discussed.
Every vector space has a basis. Proof. Apply Corollary 7.11 either to the linearly independent set $\emptyset$ or to the spanning set V.
Let X be a subset of a vector space V. Show that the following statements are equivalent.
Our next goal is to compare the sizes of bases. This requires a lemma, which is related to Theorem 7.10, but is not quite the same.
Lemma 3480 (Exchange Lemma)
Let V be a vector space, let $X \subseteq V$ be a linearly independent set, let $Y \subseteq V$ span V, and suppose $X \not\subseteq Y$. Then there are $x \in X \setminus Y$ and $y \in Y \setminus X$ such that: (1) $X' = (X \setminus \{x\}) \cup \{y\}$ is linearly independent; (2) y may be chosen so that, in addition, $Y' = (Y \setminus \{y\}) \cup \{x\}$ spans V.
The linear independence of X' is all that we will actually need below. However, we claimed we could choose y so that Y' spans V; to do this, we must be a little more careful.
We can still take any $x \in X \setminus Y$. Since Y spans V, we can write $x = a_1y_1 + \cdots + a_ky_k$ for some distinct $y_1, \dots, y_k \in Y$ and some nonzero $a_1, \dots, a_k \in F$. Since X is linearly independent, there must be at least one $y_j$ such that $y_j \notin \operatorname{span}(X \setminus \{x\})$ (otherwise x itself would lie in $\operatorname{span}(X \setminus \{x\})$). Set $y = y_j$ for this j, and put $X' = (X \setminus \{x\}) \cup \{y\}$ and $Y' = (Y \setminus \{y\}) \cup \{x\}$.
Then X' is linearly independent as above. We can write $x = a_jy_j + \sum_{i \ne j} a_iy_i$, so $y = y_j = a_j^{-1}\bigl(x - \sum_{i \ne j} a_iy_i\bigr) \in \operatorname{span}(Y')$. Obviously for any other $z \in Y \setminus \{y\}$, we have $z \in Y' \subseteq \operatorname{span}(Y')$. It follows (as in the proof of Theorem 7.10) that Y' spans V.
Let X, Y be subsets of a vector space V and suppose that X is linearly independent and Y spans V. Then $|X| \le |Y|$. Proof. As usual, we will assume Y is finite, say $|Y| = n$; we will discuss the infinite case in the appendix to this section.
Suppose the theorem is false and that V, X, Y give us a counterexample. Keeping Y, V fixed and changing X if necessary, we can assume that $|X \cap Y|$ is as large as possible, that is, if $X'$ is such that V, X', Y give us a counterexample, then $|X' \cap Y| \le |X \cap Y|$. (This is possible because all the numbers involved are no greater than n = |Y|.)
If $X \not\subseteq Y$, then by the Exchange Lemma, there are $x \in X \setminus Y$, $y \in Y \setminus X$ such that $X' = (X \setminus \{x\}) \cup \{y\}$ is linearly independent. Moreover, $|X'| = |X|$, so V, X', Y is still a counterexample, and $|X' \cap Y|$ is strictly larger than $|X \cap Y|$. This contradicts our choice of X.
Thus we must have $X \subseteq Y$, and so we can conclude that $|X| \le |Y|$, contradicting the assumption that V, X, Y is a counterexample.
Any two bases of a vector space have the same number of elements. That is, if V is a vector space over a field F and X, Y are bases of V, then $|X| = |Y|$. Proof. By Theorem 7.15, we have $|X| \le |Y|$ and $|Y| \le |X|$. Thus $|X| = |Y|$.
This result can also be proved directly using part (2) of the Exchange Lemma.
We define the dimension of a vector space V to be the size of any (and hence every) basis of V, and we denote it $\dim V$.
Thus for example, $\dim F^n = n$ and $\dim \{0_V\} = 0$. (If there are different fields in use, we sometimes write $\dim_F V$ to make clear that V is a vector space over F.)
Let V be a vector space over a field F and let $X \subseteq V$ be finite.
(2) The proof is similar to the proof of (1).
Note that this proof would fail if $X$ were infinite, and in that case, the corollary is not true. This last result is very useful in deciding whether a given set is a basis. For example, we know $F^2$ has dimension 2, since it has the standard basis $\{(1,0), (0,1)\}$. Thus by Corollary 7.17, two vectors $v, w \in F^2$ form a basis iff they are linearly independent. It is easy to tell when two vectors are linearly independent. We conclude that $v, w$ form a basis iff neither v nor w is a multiple of the other.
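Over a finite field the test is just as concrete. In $F_p^2$, which has dimension 2, two vectors form a basis iff they are independent, i.e. iff the $2 \times 2$ determinant is nonzero mod p. A sketch (the function name is ours):

```python
def is_basis_fp2(v, w, p):
    """In F_p^2 (dimension 2), two vectors form a basis exactly when they
    are linearly independent, i.e. when det(v, w) is nonzero mod p."""
    return (v[0] * w[1] - v[1] * w[0]) % p != 0

# Over F_5: (3, 1) = 3 . (1, 2), so the pair is dependent, not a basis.
assert not is_basis_fp2((1, 2), (3, 1), 5)
# Changing one entry makes the determinant nonzero mod 5.
assert is_basis_fp2((1, 2), (3, 2), 5)
```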
The following result is another very useful application.
Let F be a field and let A be an $n \times n$ matrix over F. Then the following conditions are equivalent.
Recall from Section 1 that (1) holds if and only if the equation $Ax = b$ can be solved for any $b \in F^n$. If $x = (x_1, \dots, x_n)^t$ and if $A_i$ is column i of A, then $Ax = x_1A_1 + \cdots + x_nA_n$. Thus $Ax$ is a linear combination of the columns of A, and so the statement that $Ax = b$ can always be solved is equivalent to the statement that the columns of A span $F^n$. This proves (1) is equivalent to (4).
A similar proof shows (1) is equivalent to (7), and this completes the proof.
Here is another nice application of bases. In Artin, this result is used instead of the Exchange Lemma to prove Theorem 7.15. We get it as a corollary.
A homogeneous system of m linear equations in n unknowns always has a nonzero solution if n>m.
Put in matrix terms, if A is an $m \times n$ matrix with $n > m$, then there is a $v \in F^n$ with $v \ne 0$ but $Av = 0$. Proof. As in the proof of Corollary 7.18, the product $Av$ is a linear combination of the columns of A. There are n of these columns, and they are elements of $F^m$, a vector space of dimension $m < n$. Thus the set of columns must be linearly dependent, that is, some non-trivial linear combination of them must be $0$. This says $Av = 0$ for some nonzero $v \in F^n$.
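The proof is non-constructive, but over $\mathbb{Q}$ a nonzero solution can actually be produced by row reduction: reduce A, set one free variable to 1, and read off the pivot variables. A sketch with exact arithmetic (the function name is ours):

```python
from fractions import Fraction

def nonzero_null_vector(A):
    """Given an m x n matrix A over Q with n > m, return a nonzero x with
    A x = 0: row-reduce A to reduced echelon form, set one free variable
    to 1, and read off the pivot variables."""
    rows = [[Fraction(a) for a in row] for row in A]
    n = len(rows[0])
    pivots = {}                       # column -> row of its pivot
    r = 0
    for col in range(n):
        piv = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        rows[r] = [a / rows[r][col] for a in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots[col] = r
        r += 1
    free = next(c for c in range(n) if c not in pivots)   # exists: n > m >= rank
    x = [Fraction(0)] * n
    x[free] = Fraction(1)
    for col, row in pivots.items():
        x[col] = -rows[row][free]
    return x

A = [[1, 2, 3], [4, 5, 6]]            # two equations, three unknowns
x = nonzero_null_vector(A)
assert any(xi != 0 for xi in x)
assert all(sum(a * xi for a, xi in zip(row, x)) == 0 for row in A)
```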
APPENDIX: Infinite-dimensional Vector Spaces
At two points in this section we made the assumption that spanning sets or bases were finite. In this appendix we will briefly discuss the general case.
The first place where the finiteness assumption was used was in the proof of Theorem 7.10. We had a linearly independent set $X$ and a spanning set $Y$, and we needed the existence of a largest subset Y' of Y such that $X \cup Y'$ remained linearly independent. In the proof, what we needed for ``largest'' was that if $y \in Y \setminus Y'$, then $X \cup Y' \cup \{y\}$ is linearly dependent. We usually express this by saying that Y' is maximal with respect to the property that $X \cup Y'$ is linearly independent. When Y is finite, we know such maximal sets exist because we can take a subset Y' satisfying this property that has as many elements as possible. When Y is infinite, however, there may be larger and larger subsets in a never-ending chain.
Instead, we have to appeal to a fundamental principle of ``infinite'' mathematics, Zorn's Lemma. This lemma asserts the existence of objects without giving any means of constructing them, and so it is viewed with disfavor by some. If one is willing to use it, however, it is extremely powerful. (Indeed, many results cannot be proven without Zorn's Lemma.) We will state it below but not prove it. The proof involves the Axiom of Choice and some form of transfinite induction -- in fact, Zorn's Lemma is equivalent to the Axiom of Choice.
We first state a special version for sets, and then discuss the more general form.
Let $\mathcal{S}$ be a non-empty collection of sets. A non-empty subset $\mathcal{C}$ of $\mathcal{S}$ is said to be a chain if for any $A, B \in \mathcal{C}$, either $A \subseteq B$ or $B \subseteq A$. Suppose that whenever $\mathcal{C}$ is a chain in $\mathcal{S}$, the union $\bigcup_{A \in \mathcal{C}} A$ is an element of $\mathcal{S}$. Then $\mathcal{S}$ contains a maximal element. Proof. Sorry, you'll have to look this one up. Let us show that this applies to our situation. We are given $X, Y \subseteq V$, where X is linearly independent and Y spans V. We let $\mathcal{S}$ be the collection of all subsets Y' of Y such that $X \cup Y'$ is linearly independent. We need to show the hypothesis of Zorn's Lemma applies, and then we will be able to conclude that $\mathcal{S}$ contains a maximal element Y', which is exactly what we want.
Let $\mathcal{C}$ be a chain in $\mathcal{S}$, and put $U = \bigcup_{Y' \in \mathcal{C}} Y'$. Clearly $U \subseteq Y$; we must show $X \cup U$ is linearly independent. Suppose not: then there are $x_1, \dots, x_m \in X$ and $y_1, \dots, y_k \in U$ such that some non-trivial linear combination of all these elements is $0_V$. Each $y_i$ lies in $Y_i$ for some $Y_i \in \mathcal{C}$. Since $\mathcal{C}$ is a chain, there is an index j, $1 \le j \le k$, such that $Y_i \subseteq Y_j$ for all i. The elements $x_1, \dots, x_m, y_1, \dots, y_k$ all lie in $X \cup Y_j$, and since $Y_j \in \mathcal{S}$, these elements must be linearly independent. This contradicts the choice of these elements and shows that $X \cup U$ must be linearly independent after all. Thus $U \in \mathcal{S}$, as required.
The only properties of sets that occur in Zorn's Lemma above are their properties relative to the partial order $\subseteq$. Thus it is reasonable to try to formulate Zorn's Lemma in a more general setting, and it turns out that the general formulation is both true and useful.
Let $\le$ be a partial order on a set S. We say $C \subseteq S$ is a chain if C is linearly ordered under $\le$, that is, if for any $x, y \in C$, either $x \le y$ or $y \le x$. We say $x \in S$ is an upper bound for C if $c \le x$ for every $c \in C$. Finally, we say $x \in S$ is a maximal element (or simply maximal) if $x \le s$ for $s \in S$ implies s = x. (Note that we do not require $s \le x$ for all $s \in S$. We are not assuming S is totally ordered. In particular, S may have many maximal elements -- or it may have none at all.)
If we take for S our collection $\mathcal{S}$ of sets and we take for $\le$ the inclusion relation $\subseteq$, then a chain in $\mathcal{S}$ relative to $\subseteq$ is precisely a chain in $\mathcal{S}$ in the sense we defined above. Moreover, $\bigcup_{A \in \mathcal{C}} A$ is an upper bound for any chain $\mathcal{C}$. Our hypothesis in the special Zorn's Lemma for sets was that this union is in $\mathcal{S}$ for any chain $\mathcal{C}$, and hence every chain in $\mathcal{S}$ has an upper bound in $\mathcal{S}$. This is a special case of the general form of Zorn's Lemma.
Lemma (General Zorn's Lemma)
Let $\le$ be a partial order on a non-empty set S and suppose that every chain in S has an upper bound. Then S has a maximal element. Proof. This version is potentially far more powerful than the version we gave for sets, but we can prove the general version using the special one. Let $\mathcal{S}$ be the set of all chains in S. If $\mathcal{C}$ is a chain in $\mathcal{S}$ (under $\subseteq$), then $\mathcal{C}$ is a collection of chains in S that are linearly ordered under $\subseteq$. It follows that $\bigcup_{C \in \mathcal{C}} C$ is a chain in S. (Good exercise.) Thus $\bigcup_{C \in \mathcal{C}} C \in \mathcal{S}$.
By the special version of Zorn's Lemma for sets, it follows that contains a maximal element C, that is, a chain C that cannot be added onto. By our hypothesis for the general Zorn's Lemma, this chain C has an upper bound x. We claim x is a maximal element in S.
If this is false, there is an element $y \in S$ with $x < y$. But then $c < y$ for every $c \in C$, so $C \cup \{y\}$ is a chain that properly contains C. This is impossible, and so x must be maximal.
The other place we appealed to finiteness was in our proof of Theorem 7.15. We were given $X, Y \subseteq V$ where X is linearly independent and Y spans V, and we wished to show $|X| \le |Y|$. We showed this under the hypothesis that Y is finite, using the Exchange Lemma, by fixing V, Y and taking a counterexample V, X, Y with $|X \cap Y|$ maximal.
By Zorn's Lemma (verify!) we can find an X maximal under inclusion such that V, X, Y is a counterexample, but can we find one with $|X \cap Y|$ maximal? It is not clear how to order our counterexamples X so that $|X \cap Y|$ is maximized. For example, it seems quite possible to have counterexamples X, X' with $X \subsetneq X'$ but $|X \cap Y| = |X' \cap Y|$ (if there are any counterexamples at all!).
We can instead solve this particular problem by a counting argument. We need two facts about infinite sets. If S is a set, let $\mathcal{F}(S)$ denote the set of all finite subsets of S. The first fact we need is that if S is infinite, then $|\mathcal{F}(S)| = |S|$.
Suppose $f : S \to T$ is a function. The second fact we need is that if S is infinite and $|S| > |T|$, then there exists a $t \in T$ such that $f^{-1}(t)$ is infinite.
Both of these facts are consequences of the fact that for non-empty sets A, B, if at least one of them is infinite, then $|A \times B| = \max(|A|, |B|)$. (If $\kappa, \lambda$ are infinite cardinals, then $\kappa\lambda = \max(\kappa, \lambda)$.) This implies, for example, that if $\kappa$ is an infinite cardinal, a countable union of sets of cardinality $\kappa$ has cardinality $\kappa$.
Assume these two facts and assume Y is infinite. Since Y spans V, for every $x \in X$ there is a finite subset $Z \subseteq Y$ such that $x = a_1z_1 + \cdots + a_kz_k$ for some scalars $a_i$ and some $z_i \in Z$. For each x, pick a particular set Z -- or shrink Y to a basis, in which case Z is unique if we assume each $a_i \ne 0$ -- and define a function $f : X \to \mathcal{F}(Y)$ by letting f(x) be the chosen set Z.
Since Y is infinite, our first fact above tells us $|\mathcal{F}(Y)| = |Y|$. If $|X| > |Y|$, our second fact tells us there is a finite $Z \subseteq Y$ and an infinite subset $X' \subseteq X$ with f(x) = Z for all $x \in X'$. Thus each element of X' lies in the finite-dimensional subspace of V spanned by Z. Since $X' \subseteq X$, we know X' is linearly independent. Thus we have an infinite linearly independent set contained in a vector space of finite dimension. We proved this is impossible in the finite case of Theorem 7.15. Hence $|X| \le |Y|$ after all.