In this section we begin the material from Chapters 3 and 4
of `Artin`.

We have defined the notion of a binary operation on a set
*X*; this is a function from *X* × *X* to *X*. There is
another kind of operation that occurs frequently, where we
combine elements from sets *X* and *Y* and obtain an
element of *Y*. The classic example, which is the model
for our definition below, is the multiplication of a
vector by a scalar. In this case, we take a number *a*
and a vector *v*, and we obtain a new
vector *av*. An operation of this sort
is a function from *X* × *Y* to *Y*. `Artin` calls this an
*external law of composition*. We will call it an
*operation of X on Y*.

Let *R* be a ring and let (*M*,+) be an *Abelian
group*. We say *M* is a (left) *module* over *R*
(or an *R*-*module*) if there is an operation of *R*
on *M* satisfying the following conditions for all *a*,*b* ∈ *R*
and *m*,*n* ∈ *M*. The first condition is an associative law,
while the final two conditions are distributive laws.

- 1. *a*.(*b*.*m*) = (*ab*).*m*;
- 2. 1_{R}.*m* = *m*;
- 3. (*a*+*b*).*m* = *a*.*m* + *b*.*m*;
- 4. *a*.(*m*+*n*) = *a*.*m* + *a*.*n*.

We will give some general examples of modules for rings
which are not necessarily fields, but after that we will
only consider vector spaces. (When *R* is a field *F*, an
*R*-module is called a *vector space* over *F*.) When *R* is not a field, the
theory of modules is quite a bit more complicated.

**Example 3140**

- 1. Let *R* = ℤ and let *M* be an Abelian group. Then there is a unique way to make *M* into a ℤ-module. To see this, note that the condition that 1*m* = *m* for all *m* ∈ *M* together with distributivity implies that *nm* equals the sum of *m* with itself *n* times for any positive integer *n*. We will shortly see that (-*r*)*m* = -(*rm*) always holds, and so it follows that there is only one way to define *nm* that might make *M* into a ℤ-module. This definition does make *M* into a ℤ-module: in fact, the axioms (1)-(4) above are just the laws of exponents for *M* written in additive form.
- 2. Let *R* be any ring. Then *R*^{n}, whether regarded as the set of column vectors, the set of row vectors, or the set of *n*-tuples from *R*, is an *R*-module with componentwise operations. Thus *r*.*v* is just the product of the scalar *r* and the matrix *v* that we used in Section 1 on matrices. The properties required for *M* to be an *R*-module all follow from the corresponding properties in *R*.
- 3. The preceding example is a special case of the following. Given any positive integers *m*,*n*, the set of all *m* × *n* matrices over *R* is an *R*-module with the usual definition of scalar multiplication: *r*.*A* = *rA*.
- 4. Let *R* ⊆ *S* be rings of numbers. Then *S* is an *R*-module when we use the multiplication in *S*, that is, *r*.*s* = *rs*.
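As a concrete illustration (not from `Artin`; a minimal Python sketch using only the standard library), we can spot-check the axioms (1)-(4) for the module ℤ^{3} of the second example, with componentwise operations:

```python
# Illustrative sketch: spot-checking the module axioms (1)-(4) for the
# R-module R^n with R = Z and n = 3. Vectors are tuples of ints, and
# the operation r.m is componentwise scaling.

def scale(r, m):
    return tuple(r * x for x in m)

def add(m, n):
    return tuple(x + y for x, y in zip(m, n))

a, b = 4, -7                   # sample ring elements
m, n = (1, 2, 3), (5, 0, -2)   # sample elements of Z^3

assert scale(a, scale(b, m)) == scale(a * b, m)              # a.(b.m) = (ab).m
assert scale(1, m) == m                                      # 1.m = m
assert scale(a + b, m) == add(scale(a, m), scale(b, m))      # (a+b).m = a.m + b.m
assert scale(a, add(m, n)) == add(scale(a, m), scale(a, n))  # a.(m+n) = a.m + a.n
```

Of course, a finite check of sample values proves nothing; here the axioms hold for all inputs because they hold componentwise in ℤ.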

Let's record some expected, and easily proven, properties of modules.

**Lemma 3147**

Let *M* be an *R*-module. Then for all *r* ∈ *R* and *m* ∈ *M*, we have the following.

- 1. 0_{R}*m* = 0_{M}.
- 2. *r*0_{M} = 0_{M}.
- 3. (-1)*m* = -*m*.
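These are quick consequences of the axioms; a proof sketch in the notation above (using cancellation in the Abelian group (*M*,+)):

```latex
% 1.  0_R m = 0_M:
0_R\,m = (0_R + 0_R)\,m = 0_R\,m + 0_R\,m
  \quad\Longrightarrow\quad 0_M = 0_R\,m \quad\text{(cancel $0_R\,m$ in $(M,+)$)}.
% 2.  r 0_M = 0_M:
r\,0_M = r\,(0_M + 0_M) = r\,0_M + r\,0_M
  \quad\Longrightarrow\quad r\,0_M = 0_M.
% 3.  (-1)m = -m:
m + (-1)\,m = 1_R\,m + (-1)\,m = \bigl(1_R + (-1_R)\bigr)\,m = 0_R\,m = 0_M
  \quad\Longrightarrow\quad (-1)\,m = -m.
```

The same computation with -1 replaced by -*r* gives (-*r*)*m* = -(*rm*), the fact promised in the first example above.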

**Lemma 3169**

Let *F* be a field and let *V* be a vector space over *F*.

- 1. If *a* ∈ *F*, *v* ∈ *V*, and *av* = 0, then *a* = 0 or *v* = 0.
- 2. If *a*,*b* ∈ *F*, *v* ∈ *V*, and *v* ≠ 0, then *av* = *bv* implies *a* = *b*.
- 3. If *a* ∈ *F*, *a* ≠ 0, and *v*,*w* ∈ *V*, then *av* = *aw* implies *v* = *w*.

(2) and (3) are corollaries of (1) -- Exercise.

We will be interested in a much more general kind of
cancellation. We say *v*_{1},...,*v*_{n} ∈ *V* are
*linearly independent* if whenever *a*_{1}*v*_{1}+...+*a*_{n}*v*_{n} = 0 for *a*_{1},...,*a*_{n} ∈ *F*, we have *a*_{i} = 0 for
each *i*. We say *v*_{1},...,*v*_{n} are
*linearly dependent* if they are not linearly
independent, that is, if there exist *a*_{1},...,*a*_{n} ∈ *F*, not all of which are 0, such that
*a*_{1}*v*_{1}+...+*a*_{n}*v*_{n} = 0.

Lemma 7.3 says that any single
nonzero *v* ∈ *V* is linearly independent. If we consider
the ordinary plane ℝ^{2}, we see that two vectors *v*,*w*
are linearly independent iff they do not lie on the same
line through the origin. However, any three vectors
in ℝ^{2} are linearly dependent. The reason for this
last claim is that if two vectors *v*,*w* in ℝ^{2} are
independent, then *any* vector *u* ∈ ℝ^{2} can be
written as *u* = *av*+*bw* for some *a*,*b* ∈ ℝ, whence
*av*+*bw*+(-1)*u* = 0. Thus the condition of linear
independence is related to another condition, that of
expressing other vectors as a linear combination of the
given vectors. Our goal in this section is to explore
this link. This will lead us to the notion of a basis and
the dimension of a vector space.
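This geometric picture can be checked numerically. A minimal sketch (assuming `numpy` is available; the helper `independent` is ours, not from the text) tests independence via the rank of the matrix whose columns are the given vectors:

```python
# Illustrative sketch: testing linear independence in the plane R^2.
# Vectors are independent exactly when the matrix having them as
# columns has rank equal to the number of vectors.
import numpy as np

def independent(*vectors):
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

v, w = np.array([1.0, 2.0]), np.array([3.0, 1.0])
assert independent(v)                # a single nonzero vector
assert independent(v, w)             # v, w not on one line through 0
assert not independent(v, 2 * v)     # collinear vectors are dependent
assert not independent(v, w, v + w)  # any three vectors in R^2
```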

Before we give the necessary formal definitions, let us
consider the ideas discussed in the last paragraph. When
we deal with ordinary *n*-space ℝ^{n}, it is usually
crucial for us to know we have a set of co-ordinate axes.
For example, in three dimensions, we express points,
functions, and so on, in terms of the co-ordinates
(*x*,*y*,*z*), which in turn are defined in terms of the
*x*,*y*,*z*-axes. Every point has a unique set of co-ordinates
relative to these axes. If **i**, **j**, **k** are the normal
unit vectors, then the point (*x*,*y*,*z*) corresponds to the
vector *x***i** + *y***j** + *z***k**.

A basis of a vector space *V* can be thought of as the
same thing -- a set of vectors that define co-ordinate
axes, such that every vector can be written as a unique
combination of the basis vectors.

If *v*_{1},...,*v*_{n} are elements of the vector space *V*
over the field *F*, a *linear combination* of
*v*_{1},...,*v*_{n} is a sum of the form *a*_{1}*v*_{1}+...+*a*_{n}*v*_{n} for some *a*_{1},...,*a*_{n} ∈ *F*. We will be ambiguous here:
the term linear combination can refer either to the sum or
to the vector that is the result of that sum. Thus we
will refer to *a*_{1},...,*a*_{n} as the *coefficients*
in the linear combination, even though different
coefficients could yield the same vector. We leave it to
the reader to resolve our ambiguity in any given
situation.

A *trivial linear combination* is one in which
every coefficient is 0. We say *v*_{1},...,*v*_{n} are
*linearly independent* if the only linear
combination yielding 0 is the trivial one. (This is the
same as the definition given earlier in this section.) We
say *v*_{1},...,*v*_{n} *span V* if every element of
*V* is a linear combination of *v*_{1},...,*v*_{n}, and we say
*v*_{1},...,*v*_{n} form a *basis* of *V* if every element
of *V* can be written uniquely as a linear combination of
*v*_{1},...,*v*_{n}.

Thus a basis is a set of elements of *V* that can serve as
a set of co-ordinate axes.

**Example 3244**

- 1. The quintessential example of a basis is the set of standard unit vectors *e*_{1},...,*e*_{n} in *F*^{n}.
- 2. More generally, the matrix units *E*_{ij}, 1 ≤ *i* ≤ *m*, 1 ≤ *j* ≤ *n*, form a basis for the vector space of *m* × *n* matrices over *F*.
- 3. Let *F* = ℝ and *V* = ℂ. Then *V* is a vector space over *F*, and every element of *V* can be written uniquely in the form *a*+*bi* = *a*1+*bi* for some *a*,*b* ∈ ℝ. This says precisely that 1, *i* is a basis for *V* over *F*. It is not the only basis: there are uncountably many bases. Another example of a basis is 2+3*i*, 5-7*i*.
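We can verify the last claim computationally. A sketch (assuming `numpy`; identifying *a*+*bi* with the column (*a*,*b*)): the numbers 2+3*i*, 5-7*i* form a basis exactly when the matrix having them as columns is invertible:

```python
# Illustrative sketch: verifying that 2+3i, 5-7i is a basis of C over R.
# Writing a+bi as the real column (a, b), the candidate basis vectors
# become the columns of a 2x2 real matrix; they form a basis iff that
# matrix is invertible (nonzero determinant).
import numpy as np

B = np.array([[2.0, 5.0],
              [3.0, -7.0]])    # columns: 2+3i and 5-7i
assert np.linalg.det(B) != 0   # det = 2*(-7) - 5*3 = -29, so a basis

# Coordinates of 1 = 1+0i in this basis: solve B @ (x, y) = (1, 0).
x, y = np.linalg.solve(B, np.array([1.0, 0.0]))
# Check the expansion: x*(2+3i) + y*(5-7i) = 1
assert abs(x * (2 + 3j) + y * (5 - 7j) - 1) < 1e-12
```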

**Lemma 3249**

Let *v*_{1},...,*v*_{n} ∈ *V*. Then *v*_{1},...,*v*_{n} form a
basis for *V* if and only if they are linearly independent
and span *V*.

If *v*_{1},...,*v*_{n} form a basis, then every element of *V*
is a linear combination of *v*_{1},...,*v*_{n}, so they span *V*.
Moreover, 0 = 0*v*_{1}+...+0*v*_{n}, and by uniqueness this is the
only linear combination yielding 0, so *v*_{1},...,*v*_{n} are
linearly independent.

Conversely, if *v*_{1},...,*v*_{n} span *V*, any *v* ∈ *V* can
be written *v* = *a*_{1}*v*_{1}+...+*a*_{n}*v*_{n} for some scalars
*a*_{1},...,*a*_{n} ∈ *F*. If also *v* = *b*_{1}*v*_{1}+...+*b*_{n}*v*_{n}, then
(*a*_{1}-*b*_{1})*v*_{1}+...+(*a*_{n}-*b*_{n})*v*_{n} = 0. Hence if *v*_{1},...,*v*_{n} are linearly independent, we conclude that *a*_{i} = *b*_{i} for
each *i*. This proves *v*_{1},...,*v*_{n} form a basis.

**Remark 3287** **(2) **

Above we have applied the terms *linearly
independent, span, basis* to a group of objects
*v*_{1},...,*v*_{n}, and so we have used plural language
(``are'', ``span'', ``form''). We frequently think of
*v*_{1},...,*v*_{n} as the set *X* = {*v*_{1},...,*v*_{n}}, in which
case we use the singular: *X* is linearly independent, *X*
spans *V*, *X* is a basis. In what follows we will mix
these modes of usage.

There is still another way in which we regard a basis. If
our goal is to put a co-ordinate system on *V*, then we
will presumably associate the *n*-tuple (*a*_{1},...,*a*_{n}) to the vector *a*_{1}*v*_{1}+...+*a*_{n}*v*_{n}. This implies an
ordering. Thus when we wish to use explicit co-ordinates,
we need to speak of an *ordered basis*, which is an
*n*-tuple (*v*_{1},...,*v*_{n}) whose entries form a basis. As usual, we generally leave
it to the reader to figure out what we are talking about
at any given moment!

Once we decide to apply the terms ``linearly independent'',
``span'', and ``basis'' to sets, it becomes natural to
allow infinite sets, and hence to modify the definitions
slightly. Thus if *X* is a set, a *linear
combination* of elements of *X* is a sum (or its result)
*a*_{1}*x*_{1}+...+*a*_{n}*x*_{n} for some finite collection
*x*_{1},...,*x*_{n} of distinct elements of *X* and some
*a*_{1},...,*a*_{n} ∈ *F*. Linear independence, spanning, and
basis are defined solely in terms of such finite linear
combinations.

Our goal for the rest of this section is straightforward.
We wish to show that every vector space has a basis, and
that any two bases have the same number of elements. This
common number will be called the *dimension* of the
vector space.

In pursuit of this goal, it is convenient to introduce other notions, which happily are natural and useful in their own right.

If *W* ⊆ *V*, we say *W* is a *subspace* of *V* if
it is a subgroup under + (i.e., *W* is closed under + and
- and contains 0) and it is closed under scalar
multiplication, i.e., *a* ∈ *F* and *w* ∈ *W* imply *aw* ∈ *W*.

**Lemma 3307**

Let *W* ⊆ *V*. Then *W* is a subspace of *V* if and only
if (a) 0 ∈ *W*; (b) if *v*,*w* ∈ *W*, then *v*+*w* ∈ *W*; and (c) if *a* ∈ *F* and *w* ∈ *W*, then *aw* ∈ *W*.

In general, a subspace of the *n*-dimensional vector space
ℝ^{n} is a flat or linear space, containing the origin,
of smaller dimension. In fact, let us define *X* ⊆ ℝ^{n} to be a *linear subset* if *X* contains the entire
line through any two points of *X*. The subspaces of
ℝ^{n} are precisely the linear subsets containing the
origin. We will say more about this later.

Let *X* ⊆ *V*. The *span of X*, denoted span(*X*), is defined to be
the set of all linear combinations of finitely many
elements from *X*.

**Lemma 3339**

Let *V* be a vector space over *F* and let *X* ⊆ *V*. Then
span(*X*) is the smallest subspace of *V* containing *X*.

We leave the proofs of this lemma and the preceding one as exercises.

In terms of our vector analogy, the subspace spanned by a set of vectors is the smallest linear space through the origin containing all the vectors.
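Computationally, membership in a span can be tested by a rank comparison. A sketch (assuming `numpy`; the helper `in_span` is ours, not from the text): *u* lies in span(*X*) exactly when adjoining *u* to *X* does not increase the rank:

```python
# Illustrative sketch: deciding whether a vector u lies in the span of
# a finite set X of vectors in R^n. Adjoining u to X leaves the rank
# unchanged exactly when u is a linear combination of X.
import numpy as np

def in_span(u, X):
    r = np.linalg.matrix_rank(np.column_stack(X))
    return np.linalg.matrix_rank(np.column_stack(X + [u])) == r

X = [np.array([1.0, 0.0, 2.0]), np.array([0.0, 1.0, 1.0])]
assert in_span(np.array([2.0, 3.0, 7.0]), X)      # 2*x1 + 3*x2
assert not in_span(np.array([0.0, 0.0, 1.0]), X)  # not a combination
```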

The following lemma links linear independence and spanning.

**Lemma 3361**

Let *V* be a vector space over *F*, let *X* ⊆ *V* be
linearly independent, and let *y* ∈ *V*. Then the
following statements are true.

- 1. *X* ∪ {*y*} is linearly independent iff *y* ∉ span(*X*).
- 2. If *y* ∉ *X*, then *X* ∪ {*y*} is linearly dependent iff *y* ∈ span(*X*).

First, suppose *y* ∈ span(*X*). Then we have *y* = *a*_{1}*x*_{1}+...+*a*_{n}*x*_{n} for some *x*_{1},...,*x*_{n} ∈ *X* and *a*_{1},...,*a*_{n} ∈ *F*. Thus *a*_{1}*x*_{1}+...+*a*_{n}*x*_{n}+(-1)*y* = 0 is a nontrivial linear combination from *X* ∪ {*y*} that yields 0, so *X* ∪ {*y*} is linearly dependent.

Next suppose *X* ∪ {*y*} is linearly dependent, i.e.,
suppose that there is a non-trivial linear combination of
elements from *X* ∪ {*y*} that equals 0. Since *X* is
linearly independent, this combination must involve *y* in
a non-trivial way. That is, there must exist *x*_{1},...,*x*_{n} ∈ *X* and *a*_{1},...,*a*_{n},*b* ∈ *F*
with *b* ≠ 0 such that
*a*_{1}*x*_{1}+...+*a*_{n}*x*_{n}+*by* = 0. We can solve this equation
for *y* and we find *y* = -*b*^{-1}*a*_{1}*x*_{1}-...- *b*^{-1}*a*_{n}*x*_{n} ∈ span(*X*).

We are now ready to pursue our goal of showing bases exist
and the cardinality of a basis of *V* is uniquely
determined by *V*. We begin with one of the key results.

**Theorem 3392**

Let *V* be a vector space over a field *F* and let
*X* ⊆ *Y* ⊆ *V*. If *X* is linearly independent and *Y* spans
*V*, then there is a subset *Y*' of *Y* such that *X* ∪ *Y*' is a basis of *V*.

Choose a subset *Y*' of *Y* with the following two
properties: (1) *X* ∪ *Y*' is linearly independent, and
(2) *Y*' is the largest subset of *Y* satisfying condition
(1). Note that *Y*' = ∅ satisfies condition (1), so
there are subsets of *Y* satisfying (1). Since *Y* is
finite, there is a largest such subset. (It is here where
we have to use more advanced techniques if *Y* is
infinite.)

We claim *X*' = *X* ∪ *Y*' is a basis of *V*. It is linearly
independent by definition, so we must show *X*' spans *V*.
We will first show that *Y* ⊆ span(*X*').

Let *y* ∈ *Y* and suppose *y* ∉ span(*X*'). Then *y* ∉ *X*', and by Lemma 7.9,
*X*' ∪ {*y*} is linearly independent. But if we set
*Y*'' = *Y*' ∪ {*y*}, we have *Y*'' ⊆ *Y* and *X* ∪ *Y*'' linearly independent. This contradicts our choice of
*Y*'. It follows that *Y* ⊆ span(*X*').

Now span(*X*') is a subspace containing *Y*, so by
Lemma 7.8, span(*X*') contains the
subspace span(*Y*) = *V*. Thus *X*' spans *V*, and we have
proven that *X*' is a basis of *V*.

**Corollary 3429**

Let *V* be a vector space. Then any linearly independent
subset of *V* can be expanded to a basis, and any subset
that spans *V* can be contracted to a basis.

If *X* is linearly independent, we can take *Y* = *V*; if
*Y* spans *V*, we can take *X* = ∅. In either case we
apply Theorem 7.10.

**Remark 3437** **(2) **

There is one problem with the proof of the preceding
corollary, and hence with the proof of the next corollary.
In the proof, we used the set *V* as a spanning set, but
*V* is likely to be infinite. Thus we are forced to
confront the ``infinite case'' we tried to avoid. One way
to avoid this problem is to prove all results only for
*finitely spanned * vector spaces, that is, vector
spaces which have a finite spanning set. (Such vector
spaces are precisely the *finite dimensional vector
spaces * .) The reader may either make this restriction or
read the appendix to this section, where the ``infinite''
problem is discussed.

**Corollary 3440**

Every vector space has a basis.

**Problem 3459**

Let *X* be a subset of a vector space *V*. Show that the
following statements are equivalent.

- 1. *X* is a basis of *V*.
- 2. *X* is a maximal linearly independent subset of *V*. (That is, *X* is linearly independent, and if *X* ⊊ *Y* ⊆ *V*, then *Y* is linearly dependent.)
- 3. *X* is a minimal spanning set in *V*. (That is, *X* spans *V*, and if *Y* ⊊ *X*, then *Y* does not span *V*.)

Our next goal is to compare the sizes of bases. This requires a lemma, which is related to Theorem 7.10, but is not quite the same.

**Lemma 3480** **(Exchange Lemma) **

Let *V* be a vector space, let *X* ⊆ *V* be a linearly
independent set, and let *Y* ⊆ *V* span *V*.

- 1. Either *X* ⊆ *Y* or there are *x* ∈ *X* \ *Y* and *y* ∈ *Y* \ *X* such that if we create sets *X*' and *Y*' by exchanging *x* and *y*, that is, *X*' = (*X* \ {*x*}) ∪ {*y*} and *Y*' = (*Y* \ {*y*}) ∪ {*x*}, then *X*' is linearly independent and *Y*' spans *V*.
- 2. If *X*,*Y* are bases, then either *X* = *Y* or we can choose *x*,*y* as in (1) such that *X*',*Y*' are bases.

The linear independence of *X*' is all that we will
actually need below. However, we claimed we could choose
*y* so that *Y*' spans *V*; to do this, we must be a
little more careful.

We can still take any *x* ∈ *X* \ *Y*. Since *Y*
spans *V*, we can write *x* = *a*_{1}*y*_{1}+...+*a*_{k}*y*_{k} for some
*y*_{1},...,*y*_{k} ∈ *Y* and some nonzero *a*_{1},...,*a*_{k} ∈ *F*. Since *X* is linearly independent, there must be at
least one *y*_{j} such that *y*_{j} ∉ span(*X* \ {*x*}). Set
*y* = *y*_{j} for this *j*, and form *X*' and *Y*' by
exchanging *x* and *y* as in the statement of the lemma.

Then *X*' is linearly independent by Lemma 7.9, since
*y* ∉ span(*X* \ {*x*}). We can write
*y*_{j} = *a*_{j}^{-1}(*x* - Σ_{i ≠ j}*a*_{i}*y*_{i}), so *y*_{j} ∈ span(*Y*'). Obviously for any other *y*' ∈ *Y*, we have *y*' ∈ span(*Y*'). It follows (as in the proof of
Theorem 7.10) that *Y*' spans
*V*.

(2) Exercise.
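The exchange step can be watched in a small example. A sketch (assuming `numpy`; this demonstrates the lemma on concrete vectors rather than implementing the full proof — in this example the chosen *y* also appears with a nonzero coefficient in the expansion of *x*, as the proof requires):

```python
# Illustrative sketch: one exchange step in R^3. X is linearly
# independent, Y spans; take x in X \ Y, find y in Y outside
# span(X \ {x}), and swap. Then X' is independent and Y' still spans.
import numpy as np

def rank(vectors):
    return np.linalg.matrix_rank(np.column_stack(vectors)) if vectors else 0

e1, e2, e3 = np.eye(3)
X = [e1 + e2, e3]     # linearly independent
Y = [e1, e2, e3]      # spans R^3

x = X[0]              # x = e1 + e2 lies in X but not in Y
rest = [X[1]]         # X \ {x} = {e3}
# choose y in Y outside span(X \ {x}): adjoining it raises the rank
y = next(v for v in Y if rank(rest + [v]) > rank(rest))

Xp = rest + [y]                              # X' = (X \ {x}) ∪ {y}
Yp = [v for v in Y if v is not y] + [x]      # Y' = (Y \ {y}) ∪ {x}
assert rank(Xp) == len(Xp)   # X' is linearly independent
assert rank(Yp) == 3         # Y' spans R^3
```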

**Theorem 3518**

Let *X*,*Y* be subsets of a vector space *V* and suppose
that *X* is linearly independent and *Y* spans *V*. Then
|*X*| ≤ |*Y*|.

Suppose the theorem is false and that *V*,*X*,*Y* give us a
counterexample. Keeping *Y*,*V* fixed and *changing
X if necessary*, we can assume that |*X* ∩ *Y*| is as
large as possible, that is, if *V*,*X*'',*Y* is also a
counterexample, then |*X*'' ∩ *Y*| ≤ |*X* ∩ *Y*|. (This is
possible because each such intersection is a subset of the
finite set *Y*.)
If *X* ⊄ *Y*, then by the Exchange Lemma, there are
*x* ∈ *X* \ *Y* and *y* ∈ *Y* \ *X* such that
*X*' = (*X* \ {*x*}) ∪ {*y*} is linearly independent.
Since |*X*'| = |*X*|, the sets *V*,*X*',*Y* again form a
counterexample. Moreover, |*X*' ∩ *Y*| is strictly larger
than |*X* ∩ *Y*|. This contradicts our choice of *X*.

Thus we must have *X* ⊆ *Y*, and so |*X*| ≤ |*Y*|, contradicting our assumption that *V*,*X*,*Y* is a counterexample.

**Corollary 3543**

Any two bases of a vector space have the same number of
elements. That is, if *V* is a vector space over a field
*F* and *X*,*Y* are bases of *V*, then |*X*|=|*Y*|.

Since each basis is linearly independent and spans *V*, Theorem 7.15 gives |*X*| ≤ |*Y*|, and by symmetry |*Y*| ≤ |*X*|. This result can also be proved directly using part (2) of the Exchange Lemma.

We define the *dimension* of a vector space *V* to
be the size of any (and hence every) basis of *V*, and we
denote it dim *V*.

Thus, for example, dim *F*^{n} = *n*, and the space of
*m* × *n* matrices over *F* has dimension *mn*. We have
dim_{ℝ}ℂ = 2. (If there are
different fields in use, we sometimes write dim_{F} *V* to
make clear that *V* is a vector space over *F*.)

**Corollary 3563**

Let *V* be a vector space over a field *F* and let *n* = dim *V* be finite.

- 1. Any linearly independent subset of *V* with *n* elements is a basis.
- 2. Any subset of *n* elements that spans *V* is a basis.

(1) Let *X* be a linearly independent subset with *n* elements. By the corollary above, *X* can be expanded to a basis *X*' ⊇ *X*. Every basis of *V* has *n* elements, so *X*' = *X*, and *X* is a basis.

(2) This proof is similar to the proof of (1).

Note that this proof would fail if dim *V* were infinite, and in that case, the corollary is not true.

This last result is very useful in deciding whether a given set is a basis. The following result is another very useful application.

**Corollary 3590**

Let *F* be a field and let *A* be an *n* × *n* matrix over
*F*. Then the following conditions are equivalent.

- 1. *A* is invertible.
- 2. The columns of *A* form a basis for *F*^{n}.
- 3. The columns of *A* are linearly independent.
- 4. The columns of *A* span *F*^{n}.
- 5. The rows of *A* form a basis for *F*^{n}.
- 6. The rows of *A* are linearly independent.
- 7. The rows of *A* span *F*^{n}.

Recall from Section 1 that (1) holds if and only if the
equation *AX* = *B* can be solved for any *B* ∈ *F*^{n}. If
*X* = (*x*_{1},...,*x*_{n})^{t} and if
*A*_{i} is column *i* of *A*, then *AX* = *x*_{1}*A*_{1}+...+*x*_{n}*A*_{n}. Thus *AX* is a linear combination of the
columns of *A*, and so the statement that *AX* = *B* can
always be solved is equivalent to the statement that the
columns of *A* span *F*^{n}. This proves (1) is equivalent
to (4).

Since the columns are *n* vectors in the *n*-dimensional space *F*^{n}, the preceding corollary shows (2), (3), and (4) are equivalent, and likewise (5), (6), and (7). A similar proof shows (1) is equivalent to (7), and this completes the proof.
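Over ℝ, these equivalences can be observed with `numpy` (a sketch, not part of the text): full rank detects invertibility as well as the basis conditions for rows and columns:

```python
# Illustrative sketch: for a square matrix, invertibility, independence
# of the columns, and the columns spanning F^n all come down to full
# rank (equivalently, nonzero determinant).
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])    # det = -2, invertible
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])    # second column = 2 * first column

assert np.linalg.matrix_rank(A) == 2    # columns form a basis of R^2
assert np.linalg.matrix_rank(A.T) == 2  # rows form a basis too
assert np.linalg.matrix_rank(B) < 2     # dependent columns: not invertible

np.linalg.inv(A)                        # succeeds for A ...
try:
    np.linalg.inv(B)                    # ... but fails for singular B
    raised = False
except np.linalg.LinAlgError:
    raised = True
assert raised
```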

Here is another nice application of bases. In `Artin`, this
result is used instead of the Exchange Lemma to
*prove* Theorem 7.15. We get it as a corollary.

**Corollary 3615**

A homogeneous system of *m* linear equations in *n*
unknowns always has a nonzero solution if *n*>*m*.

Put in matrix terms, if *A* is an *m* × *n* matrix with
*n* > *m*, then there is an *X* ∈ *F*^{n} with *X* ≠ 0 but *AX* = 0. Indeed, the *n* columns of *A* lie in *F*^{m}, which is spanned by *m* vectors, so by Theorem 7.15 the columns are linearly dependent; a nontrivial dependence relation among the columns is exactly a nonzero solution of *AX* = 0.
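A sketch of this corollary in action (assuming `numpy`): for an underdetermined homogeneous system, a nonzero solution can be extracted from the singular value decomposition:

```python
# Illustrative sketch: with more unknowns than equations (n > m),
# A X = 0 has a nonzero solution. When m < n, the last right-singular
# vector from the SVD lies in the null space of A.
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # m = 2 equations, n = 3 unknowns

_, _, Vt = np.linalg.svd(A)
X = Vt[-1]                        # null-space vector, unit norm
assert np.linalg.norm(X) > 0      # nonzero ...
assert np.allclose(A @ X, 0)      # ... and A X = 0
```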

**APPENDIX: Infinite-dimensional Vector Spaces**

At two points in this section we made the assumption that spanning sets or bases were finite. In this appendix we will briefly discuss the general case.

The first place where the finiteness assumption was used
was in the proof of Theorem 7.10. We had a linearly independent set *X* and a
spanning set *Y*, and we needed the existence of a
largest subset *Y*' of *Y* such that *X* ∪ *Y*' remained
linearly independent. In the proof, what we needed for
``largest'' was that if *Y*' ⊊ *Y*'' ⊆ *Y*, then *X* ∪ *Y*'' is linearly dependent. We usually express this by
saying that *Y*' is *maximal with respect to
the property* that *X* ∪ *Y*' is linearly independent. When
*Y* is finite, we know such maximal sets exist because we
can take a subset *Y*' satisfying this property that has
as many elements as possible. When *Y* is infinite,
however, there will be larger and larger subsets in a
never-ending chain.

Instead, we have to appeal to a fundamental principle of ``infinite'' mathematics, Zorn's Lemma. This lemma asserts the existence of objects without giving any means of constructing them, and so it is viewed with disfavor by some. If one is willing to use it, however, it is extremely powerful. (Indeed, many results cannot be proven without Zorn's Lemma.) We will state it below but not prove it. The proof involves the Axiom of Choice and some form of transfinite induction -- in fact, Zorn's Lemma is equivalent to the Axiom of Choice.

We first state a special version for sets, and then discuss the more general form.

**Lemma ** **(Zorn's Lemma for Sets) **

Let 𝒮 be a non-empty collection of sets. A non-empty
subcollection 𝒞 of 𝒮 is said to be a *chain* if
for any *A*,*B* ∈ 𝒞, either *A* ⊆ *B* or *B* ⊆ *A*. Suppose that whenever 𝒞 is a chain in 𝒮, the union
of the sets in 𝒞 is an element of 𝒮. Then 𝒮 contains a maximal element.

To apply this to Theorem 7.10, let 𝒮 be the collection of
all subsets *Y*' of *Y* such that *X* ∪ *Y*' is linearly
independent. Let 𝒞 be a chain in 𝒮, and put *U* = ∪_{*Y*' ∈ 𝒞}*Y*'. Clearly *U* ⊆ *Y*; we must show *X* ∪ *U* is linearly
independent. Suppose not: then there are
*u*_{1},...,*u*_{n} ∈ *X* ∪ *U* such that some
non-trivial linear combination of all these elements is
0. Each *u*_{i} lies in *X* ∪ *Y*_{i}' for some *Y*_{i}' ∈ 𝒞. Since
𝒞 is a chain, there is an index *j*, 1 ≤ *j* ≤ *n*, such that *Y*_{i}' ⊆ *Y*_{j}' for all *i*. The
elements *u*_{1},...,*u*_{n} all lie in *X* ∪ *Y*_{j}', and since *Y*_{j}' ∈ 𝒮, these elements must be
linearly independent. This contradicts the choice of these
elements and shows that *X* ∪ *U* must be linearly
independent after all. Thus *U* ∈ 𝒮, as required.

The only properties of sets that occur in Zorn's Lemma above are their properties relative to the partial order ⊆. Thus it is reasonable to try to formulate Zorn's Lemma in a more general setting, and it turns out that the general form is both true and useful.

Let ≤ be a partial order on a set *S*. We say *C* ⊆ *S* is a *chain* if *C* is linearly ordered under
≤, that is, if for any *x*,*y* ∈ *C*, either *x* ≤ *y* or
*y* ≤ *x*. We say *x* ∈ *S* is an *upper bound* for
*C* if *c* ≤ *x* for every *c* ∈ *C*. Finally, we say
*x* ∈ *S* is a *maximal element* (or simply
*maximal*) if *x* ≤ *s* for *s* ∈ *S* implies *s* = *x*.
(Note that we do not require *s* ≤ *x* for all *s* ∈ *S*. We
are not assuming *S* is totally ordered. In particular,
*S* may have many maximal elements -- or it may have none
at all.)

If we take for *S* our collection 𝒮 of sets and we
take for ≤ the inclusion relation ⊆, then a chain
in *S* relative to ≤ is precisely a chain in 𝒮 in the sense we defined above. Moreover, the union of the
sets in a chain 𝒞 is an upper bound for 𝒞. Our hypothesis
in the special Zorn's Lemma for sets was that this union is
in 𝒮 for any chain 𝒞, and hence every chain in
𝒮 has an upper bound in 𝒮. This is a special case
of the general form of Zorn's Lemma.

**Lemma ** **(General Zorn's Lemma) **

Let be a partial order on a non-empty set *S* and
suppose that every chain in *S* has an upper bound. Then
*S* has a maximal element.

To deduce the general form, let 𝒮 be the collection of all
chains in *S*, ordered by inclusion; the union of a chain of
chains is again a chain, so by the special version of Zorn's
Lemma for sets, it follows
that 𝒮 contains a maximal element *C*, that is, a
chain *C* in *S* that cannot be added onto. By our hypothesis for
the general Zorn's Lemma, this chain *C* has an upper bound
*x*. We claim *x* is a maximal element in *S*.

If this is false, there is an element *y* ∈ *S* with
*x* < *y*. But then *c* < *y* for every *c* ∈ *C*, so *C* ∪ {*y*} is a chain that properly contains *C*. This is
impossible, and so *x* must be maximal.

The other place we appealed to finiteness was in our proof
of Theorem 7.15. We
were given subsets *X*,*Y* of *V*, where *X* is linearly independent
and *Y* spans *V*, and we wished to show |*X*| ≤ |*Y*|. We
showed this under the hypothesis that *Y* is finite, using
the Exchange Lemma, by fixing *V*,*Y* and taking a
counterexample *V*,*X*,*Y* with |*X* ∩ *Y*| maximal.

By Zorn's Lemma (verify!) we can find a maximal *X* such
that *V*,*X*,*Y* is a counterexample, but can we find one with
|*X* ∩ *Y*| maximal? It is not clear how to order our
counterexamples *X* so that |*X* ∩ *Y*| is maximized. For
example, it seems quite possible to have counterexamples
*X*,*X*' with |*X*| < |*X*'| but |*X* ∩ *Y*| > |*X*' ∩ *Y*| (if
there are any counterexamples at all!).

We can instead solve this particular problem by a counting
argument. We need two facts about infinite sets. If *S*
is a set, let *P*_{f}(*S*) denote the set of all *finite*
subsets of *S*. The first fact we need is that if *S* is
infinite, then |*P*_{f}(*S*)| = |*S*|.

Suppose *f* : *S* → *T* is a function. The second fact we need is that if
*S* is infinite and |*S*| > |*T*|, then there exists a *t* ∈ *T* such that *f*^{-1}(*t*) is
infinite.

Both of these facts are consequences of the fact that for
non-empty sets *A*,*B*, if at least one of them is infinite,
then |*A* × *B*| = max(|*A*|, |*B*|). (If α ≤ β are
infinite cardinals, then αβ = β.) This
implies, for example, that if β is an infinite
cardinal, a countable union of sets of cardinality β has cardinality β.

Assume these two facts and assume *Y* is infinite. Since
*Y* spans *V*, for every *x* ∈ *X*, there is a finite subset
*Z* ⊆ *Y* such that *x* = Σ_{*z* ∈ *Z*}*a*_{z}*z* for some scalars
*a*_{z}. For each *x*, pick a particular set *Z* -- or
shrink *Y* to a basis, in which case *Z* is unique if we
assume each *a*_{z} ≠ 0 -- and define a function *f* : *X* → *P*_{f}(*Y*) by letting *f*(*x*) be the chosen set *Z*.

Since *Y* is infinite, our first fact above tells us
|*P*_{f}(*Y*)| = |*Y*|. If |*X*| > |*Y*|, our second fact tells us
there is a finite *Z* ⊆ *Y* and an infinite *X*' ⊆ *X* with *f*(*x*) = *Z* for all *x* ∈ *X*'. Thus each element of
*X*' lies in the finite dimensional subspace of *V*
spanned by *Z*. Since *X*' ⊆ *X*, we know *X*' is
linearly independent. Thus we have an infinite linearly
independent set contained in a vector space of finite
dimension. We proved this is impossible in the finite
case of Theorem 7.15.