Learning Calculus I: Definitions of limits

I am learning calculus now I want to type up everything I now about this subject. All solutions written below, if no source is mentioned, are my solutions so if someone find an error in the solutions, please tell me. By the way, thanks to Luca Trevisan’s LaTeX to WordPress, I now don’t have to write LaTeX in WordPress format anymore, which is very convenient.

1.1. Definition of limits of function

For me, the precise definition of limits is a bit hard to understand so I decided to reconstruct the whole thing by writing some of my guesses on why did mathematicians write the definition in this way. By the way, French mathematician, Augustin-Louis Cauchy (1789-1857) is the one who introduced ${\delta, \varepsilon}$ into definition of limits.

Limit, the notation ${\lim_{x \rightarrow a}f(x)=\ell}$ is understood vaguely as: as ${x}$ gets closer to ${a}$, ${f(x)}$ gets closer to a number ${\ell}$. What does it mean by ‘get closer’? It is easier to understand if we imagine ${a}$ lies on a number line and suppose we pick an arbitrary point ${x}$ on the line. If we apply the term ‘${x}$ gets closer to ${a}$‘, we can understand it as distance from ${x}$ to ${a}$ on the number line decreases. Its distance is ${|x-a|}$ so saying ${x}$ gets closer to ${a}$ is the same as saying ${|x-a|}$ gets closer to ${0}$. Note that for any distance ${|x-a|}$, there are two possible ${x}$ on the number line that can satisfy that distance. What we care about is the distance from ${x}$ to ${a}$ not the position of ${x}$, so using ‘${|x-a|}$ gets closer to ${0}$‘ is better than ‘${x}$ gets closer to ${a}$‘. Similarly to ${|f(x)-\ell|}$ and we get a new definition:

Definition 1 The symbol ${\lim_{x \rightarrow a}f(x)=\ell}$ means that as ${|x-a|}$ gets closer to ${0}$ then ${|f(x)-\ell|}$ gets closer to ${0}$.

Note that we can get ${|x-a|}$ as close to ${0}$ as we want by choosing appropriate ${x}$, but we can’t guarantee the same thing for ${|f(x)-\ell|}$ since ${f(x)}$ depends on ${x}$. Thus, firsly, in order for ${|f(x)-\ell|}$ to be able to approach ${0}$ we need ${|f(x)-\ell|}$ to be as small as we want. It is equivalent to saying:

For any ${\varepsilon>0}$ there exists some ${x}$ so ${\varepsilon>|f(x)-\ell|}$.

Secondly, we need the definition to say ${|f(x)-\ell|}$ approaches ${0}$ as ${|x-a|}$ approaches ${0}$. This means for small enough ${\delta}$, ${|f(x)-\ell|}$ must be bounded by some ${\varepsilon}$ for all ${|x-a|<\delta}$. For better understanding more about this argument, let’s assume the contrary, then as ${\delta}$ gets closer to ${0}$, ${|f(x)-\ell|}$ still does not have a bound for all ${|x-a|<\delta}$. Say we find at ${|x-a|=\delta_1<\delta}$ has ${|f(x)-\ell |=M_1}$. We consider all ${x}$ in ${|x-a|<\delta_1}$ and follow that there must be an ${x}$ so ${|x-a|=\delta_2<\delta_1}$ and ${|f(x)-\ell|=M_2>M_1}$ according to the assumption. Next consider all ${x}$ in ${|x-a| < \delta_2}$ and from our assumption, there exists ${x}$ so ${|f(x)-\ell |=M_3>M_2}$ for ${|x-a|=\delta_3<\delta_2}$. This keeps going and we can see that as ${|x-a|}$ gets closer to ${0}$, ${|f(x)-\ell |}$ gets larger, which contradicts to our aim that ${f(x)}$ must approaches ${\ell}$. Thus, we got the second part of the definition:

For small enough ${\delta}$, ${|f(x)-\ell|}$ must be bounded by a number ${\varepsilon>0}$ for all ${|x-a|<\delta}$.

According to first argument, the second argument has to be true for any ${\varepsilon>0}$ and note that in second argument, ${\delta}$ depends on choice of ${\varepsilon}$. Thus, by combining the two we have:

For any ${\varepsilon>0}$, there must exists a ${\delta>0}$ so ${\varepsilon>|f(x)-\ell|}$ for all ${|x-a|<\delta}$.

Thus, we obtain the following definition:

Definition 2 We say the limit of ${f(x)}$ as ${x}$ approaches ${a}$ is ${\ell}$, and we write

$\displaystyle \lim_{x \rightarrow a}f(x)=\ell$

if for every number ${ \varepsilon>0}$ there is a number ${\delta>0}$ such that:

if for all ${|x-a|< \delta}$ then ${|f(x)-\ell| < \varepsilon}$.

To be more clearly, in order for ${|x-a|}$ to be able to approach ${0}$ while ${f(x)}$ is still defined then ${f}$ must be defined on some open interval ${D}$ that contains ${a}$:

Definition 3 Let ${f}$ be a function defined on an open interval ${D}$ that contains ${a}$. We say the limit of ${f(x)}$ as ${x}$ approaches ${a}$ is ${\ell}$, and we write

$\displaystyle \lim_{x \rightarrow a}f(x)=\ell$

if for every number ${ \varepsilon>0}$ there is a number ${\delta>0}$ such that:

For all ${|x-a|< \delta}$ and ${x \in D}$ then ${|f(x)-\ell| < \varepsilon}$.

According to Definition 3 then the limit of function:

$\displaystyle f(x)= \begin{cases} x, \; x \ne 1, \\ 2, \; x=1 \end{cases} \ \ \ \ \ (1)$

at ${x=1}$ will not be defined. Why? Because we can’t find a limit of ${f(x)}$ as ${x}$ goes to ${1}$. From the graph, we find that except at ${x=1}$, all points approach ${(1,1)}$ (or ${f(x)}$ approaches ${1}$) as ${x}$ approaches ${1}$, i.e. for any ${\varepsilon>0}$ there exists ${\delta>0}$ so if ${|x-1|<\delta,x \ne 1}$ then ${|f(x)-1|<\varepsilon}$. But since definition 3 includes point at ${x=1}$ and from the graph, we can clearly see that at ${x=1}$, ${f(x)}$ does not approach ${1}$. In other words, at ${x=1}$ then ${|f(1)-1|=1}$ so we can’t choose ${\varepsilon<1}$ to be true for all ${|x-a|<\delta}$. So the limit is definitely not ${1}$. It also can’t be ${f(1)=2}$ because other points (${x\ne1}$) won’t agree with it. So all we can conclude from definition 3 is that function ${f}$ doesn’t have limit at ${x=1}$. Mathematicians wanted functions such as ${f}$ to have limit at ${x=1}$ so they decided to take out the value of function at ${x=a}$ from definition of limit:

Definition 4 Let ${f}$ be a function defined on an open interval ${D}$ that contains ${a}$, except possibly at ${a}$. We say the limit of ${f(x)}$ as ${x}$ approaches ${a}$ is ${\ell}$, and we write

$\displaystyle \lim_{x \rightarrow a}f(x)=\ell$

if for every number ${ \varepsilon>0}$ there is a number ${\delta>0}$ such that:

if for all ${0<|x-a|< \delta}$ and ${x \in D}$ then ${|f(x)-\ell| < \varepsilon}$.

So with this, limit of ${f}$ at ${x=1}$ is ${1}$. That means we don’t actually care about what happen at ${x=1}$, we only care about what happen near ${x=1}$.

From this definition, we can see that the smaller we choose ${\varepsilon}$, the smaller ${\delta}$. This is equivalent to saying that distance between ${f(x)}$ and ${\ell}$ can be made arbitrarily small if we take distance from ${x}$ to ${a}$ sufficiently small. This agrees with our intuition about limits.

1.2. More definitions of limits

In above definition, ${x}$ can approach ${a}$ from two sides: left of ${a}$ and right of ${a}$. We can define limit analogous if ${x}$ only approach from only one side, simply by changing ${0<|x-a|<\delta}$

Definition 5 (Left-hand limit) Let ${f}$ be a function defined on an open interval ${D}$ that contains ${a}$ except possibly at ${a}$. We say the limit of ${f(x)}$ as ${x}$ approaches ${a}$ from the left is ${\ell}$, and we write

$\displaystyle \lim_{x \rightarrow a^{-} }f(x)=\ell$

if for every number ${\varepsilon>0}$ there is a number ${\delta>0}$ such that:

For all ${a-\delta and ${x \in D}$ then ${|f(x)-\ell | <\varepsilon}$.

Similarly for right-hand limit ${\lim_{x \rightarrow a^+}f(x)=\ell}$. Next, what if ${f(x)}$ apprroaches infinity as ${x}$ approaches ${a}$?

Definition 6 (Infinite limits) Let ${f}$ be a function defined on an open interval ${D}$ that contains ${a}$ except possibly at ${a}$. We say ${f(x)}$ goes to infinity as ${x}$ approaches ${a}$ and write ${\lim_{x \rightarrow a}f(x)=\infty}$ if for any ${M>0}$ there is a number ${\delta>0}$ such that

for all ${0<|x-a|<\delta}$ then ${f(x) > M}$.

We can define similarly for ${\lim_{x \rightarrow a}f(x)=-\infty}$. What about limit as ${x}$ goes to infinity.

Definition 7 (Limits at infinity) Let ${f}$ be a function defined on some open interval ${D}$. We say limit of ${f(x)}$ as ${x}$ approaches infinity is ${\ell}$ and write ${\lim_{x \rightarrow \infty}f(x)=\ell}$ if for any ${\varepsilon}$ there is a ${M>0}$ so that

for all ${x>M}$ and ${x \in D}$ then ${|f(x)-\ell | <\varepsilon}$.

We can also define convergence of sequences in a similar way:

Definition 8 (Convergence of sequences) A sequences ${(x_n)_{n=0}^{\infty}}$ converges to ${\ell}$ as ${n}$ goes to infinity if for every ${\varepsilon>0}$ there exists positive integer ${M}$ so for all ${n \ge M}$ then ${|x_n-\ell|<\varepsilon}$.

1.3. Another way to define limits of function

We can rewrite the definition 3 in terms of limits of sequence.

Definition 9 Function ${f}$ has limit ${\ell}$ at ${a}$ in ${D}$ if for every sequence ${(x_n)_{n=0}^{\infty}}$ which consists entirely of elements of ${D}$ and converges to ${a}$ then we have the sequence ${(f(x_n))_{n=0}^{\infty}}$ converges to ${\ell}$.

We will prove that definitions 3 and 9 are equivalent.

Proof: (3) ${\implies}$ (9): Consider sequence ${(x_n)_{n=0}^{\infty} \subset D}$ which converges to ${a}$. We want to prove that sequence ${(f(x_n))_{n=0}^{\infty}}$ converges to ${\ell}$ i.e. for any ${\varepsilon>0}$, there exists ${M>0}$ so that for all ${n >M}$ then ${|f(x_n)-\ell|<\varepsilon}$. Indeed, from definition 3, for any ${\varepsilon>0}$, there exists ${\delta>0}$ so for all ${|x-a|<\delta}$ then ${|f(x)-\ell | <\varepsilon}$. But since ${(x_n)_{n=0}^{\infty}}$ converges so for such ${\delta>0}$, there exists ${M>0}$ so that for all ${n>M}$ then ${|x_n-a| < \delta}$. Thus, for each ${\varepsilon>0}$, there exists ${\delta>0}$ so ${|f(x)-\ell|<\varepsilon}$ for all ${|x-a|<\delta}$, and for that ${\delta>0}$ there exists ${M>0}$ so ${|x_n-a|< \delta}$ for all ${n>M}$. This follows for each ${\varepsilon>0}$, there exists ${M>0}$ so ${|f(x_n)-\ell |<\varepsilon}$ for all ${n>M}$.

(9) ${\implies}$ (3): It suffices to prove that for any ${\varepsilon>0}$ there is ${\delta>0}$ so ${|f(x)-\ell|<\varepsilon}$ for all ${|x-a| <\delta}$. Indeed, consider all sequences ${(x_n)_{n=0}^{\infty}}$ which consists entirely elements of ${D}$ and converges to ${a}$ then from definition 9, sequences ${(f(x_n))_{n=0}^{\infty}}$ converges to ${\ell}$. Hence, for each sequence ${(f(x_n))_{n=0}^{\infty}}$ there exists a ${M}$ so for all ${n > M}$ then ${|f(x_n)-\ell|< \varepsilon}$. Now we need to use prove existence of ${\delta>0}$ so that this is true for all ${|x-a|<\delta}$, i.e. for each ${(x_n)_{n=0}^{\infty}}$ there exists a ${T}$ so

1. ${|x_n-a|<\delta}$ for each ${n>T}$.
2. ${T> \max \{M | (x_n)_{n=0}^{\infty} \}}$ to ensure that ${|f(x_n)-\ell|<\varepsilon}$ at the same time as ${|x_n-a|<\delta}$ for all sequences ${(x_n)_{n=0}^{\infty}}$.
3. Each ${x}$ in ${(a-\delta,a+\delta)}$ belongs to at least one of the sequence ${(x_n)_{n=0}^{\infty}}$ and equals to ${x_i}$ with ${i>T}$. In other words, ${(a-\delta,a+\delta)}$ is a subset of union of all terms in all sequences ${(x_n)_{n>T}^{\infty}}$ which converge to ${a}$.

Condition 1 can be done according to definition 8 (convergence of sequences). From definition 8, we also follow that if we pick ${\delta}$ smaller and smaller then ${T}$ must get bigger and bigger. Hence, condition 2 can be satisfied. Condition 3, assume a contrary that now matter how small ${\delta>0}$ we choose, we still can’t cover ${(a-\delta,a+\delta)}$ with ${(x_n)_{n>T}^{\infty}}$. This means for each ${\delta>0}$, there still exists a sequence ${(x_n)_{n=0}^{\infty}}$ which has ${x_n}$ with ${|x_n-a|<\delta}$ but ${|f(x_n)-\ell | >\varepsilon}$, which means ${n \le M}$. For each ${\delta}$, we take out such ${x_n=a_n}$. By letting ${\delta}$ smaller and smaller, we’ve create a sequence ${(a_n)_{n=0}^{\infty}}$ which converges to ${a}$ (true according to definition 8) so that ${|f(a_n)-\ell|> \varepsilon}$. This contradicts to our condition of definition 9. Thus, we follow that condition 3 will eventually be true if we take ${\delta>0}$ sufficiently small. Hence, all three conditions can be obtain if we take ${\delta>0}$ sufficiently small. And for such ${\delta>0}$ then ${|f(x)-\ell|< \varepsilon}$ is true for all ${|x-a|<\delta}$. We are done. $\Box$

Are definitions 4 and 9 equivalent to each other? Logically, there is a possibility no matter how large ${M}$ is, there always exists ${n>M}$ so ${x_n=a}$. And if we pick ${f}$ so ${f(a)}$ is really far away from ${\ell}$, then ${(f(x_n))_{n=0}^{\infty}}$ cannot converge to ${\ell}$. Definition 4 ignores value of ${f}$ at ${x=a}$ but definition 8 (limits of sequences) does include value of ${f}$ at ${x=a}$, which follows two things:

1. ${x_n}$ can be equal to ${a}$ for infinitely many ${n}$.
2. If we want to prove ${(f(x_n))_{n=0}^{\infty}}$ converges to ${\ell}$ as ${x_n}$ converges to ${a}$, then we also need to show ${|f(a)-\ell|<\varepsilon}$ for every ${\varepsilon>0}$ (this we assume that ${x_n=a}$ for infinitely many ${n}$).

The second claim can’t be true if we pick such ${f}$ so ${f(a)}$ is far away from ${\ell}$. Hence, we can’t claim ${(f(x_n))_{n=0}^{\infty}}$ converges or diverges if we don’t know any further information of ${f(a)}$. In conclusion, the answer is no, definitions 4 and 9 are not equivalent to each other.

We can take function ${f}$ in (1) as an example to support above argument. According to definition 4, ${\lim_{x \rightarrow 1}f(x)=1}$. Consider arbitrary sequence of ${(x_n)_{n=0}^{\infty}}$ that converges to ${1}$, and then among them, say add more ${x_n=1}$ for infinitely many ${n}$. The new sequence still converges to ${1}$, but we see that ${(f(x_n))_{n=0}^{\infty}}$ does not converge to ${1}$ since there are infinitely many ${n}$ so ${f(x_n)=f(1)=2}$.

1.4. Uniqueness of limits of functions

Theorem 10 (Uniqueness of limits) Let ${f}$ be a function defined on ${D}$ containing ${a}$. If ${M,L \in \text{range}(f)}$ are both limit of ${f}$ at ${a}$, i.e. ${\lim_{x \rightarrow a}f(x)=M}$ and ${\lim_{x \rightarrow a}g(x)=L}$ then ${M=L}$.

Proof: If ${\lim_{x \rightarrow a}f(x)=M}$ and ${\lim_{x \rightarrow b}g(x)=L}$ then that means for all ${\varepsilon>0}$ there exists ${\delta_1,\delta_2>0}$ so ${|f(x)-M|<\varepsilon}$ for all ${0<|x-a|<\delta_1}$ and ${|f(x)-L|<\varepsilon}$ for all ${0<|x-a|<\delta_2}$. Pick ${\delta= \min \{ \delta_1, \delta_2}$ then for all ${|x-a|<\delta_1}$ we have ${|f(x)-M|<\varepsilon}$ and ${|f(x)-L|<\varepsilon}$. Therefore,

$\displaystyle |M-L| \le |f(x)-M|+|L-f(x)|<2\varepsilon.$

If ${M \ne L}$ we can pick ${\varepsilon}$ small enough so ${|M-L|>2\varepsilon}$, which leads to a contradiction. Thus, ${M=L}$. $\Box$

Uniqueness of other limits (left-hand, infinite limits, sequences) can be proved similarly to above.

Posted in Calculus, Mathematics | | Leave a comment

LA I: Identify a list of vectors linearly dependent or linearly independent, a vector space infinite-dimensional or finite-dimensional

Inspired from some good exercises from Chapter 2A in the book Linear Algebra Done Right by Sheldon Axler.

First, let’s write out definition of linearly (in)dependent list:

Definition 1 (Linearly independence) A list of ${v_1, \ldots, v_m}$ of vectors in ${V}$ is called linearly independent if the only choice of ${a_1, \ldots, a_m \in \mathbf{F}}$ so ${a_1v_1+ \ldots+ a_mv_m=0}$ is ${a_1=a_2= \ldots =a_m=0}$.

The definition itself can be applied for linearly (in)dependent test. Also from this, we find

Proposition 2 Every vector in the span of linearly independet list ${v_1, \ldots, v_m}$ can be represented uniquely as ${a_1v_1+ \ldots + a_mv_m}$.

Proof: Indeed, assume the contrary that ${v=a_1v_1+ \ldots +a_mv_m=b_1v_1+ \ldots+b_mv_m}$ with ${a_i}$ not all equal to ${b_i}$. Hence,

$\displaystyle (a_1-b_1)v_1+(a_2-b_2)v_2+ \ldots + (a_m-b_m)v_m=0,$

with ${a_i-b_i}$ not all equal to ${0}$. This contradicts to our definition. $\Box$

This proposition doesn’t help us much in testing linearly (in)dependence. But the below proposition, which is also deduced from the definition, has better application in testing, simply by looking at the list of vectors:

Proposition 3 If some vector in a list of vectors in ${V}$ is a linear combination of the other vectors, then the list is linearly dependent, i.e. list that is not linearly independent.

Proof: List ${v_1, \ldots, v_m}$ has ${v_1= \sum_{i}a_iv_i}$ then ${v_1-\sum_ia_iv_i=0}$, or there exists ${x_1, \ldots, x_m}$ not all ${0}$ so ${\sum_{i=1}^mx_iv_i=0}$. Hence, ${v_1, \ldots, v_m}$ is linearly dependent. $\Box$

From this, we develop (in)finite-dimensional test for vector space:

Proposition 4 (Ex 14,2A) ${V}$ is infinite-dimensional if and only if there is a sequence ${v_1,v_2, \ldots}$ of vectors in ${V}$ such that ${v_1,v_2, \ldots, v_m}$ is linearly independent for every positive integer ${m}$.

Proof: If ${V}$ is infinite-dimensional: Pick ${v_1 \in V}$, pick ${v_i}$ so ${v_i \in V, v_i \not\in \text{span}(v_1, \ldots, v_{i-1})}$. Such ${v_i}$ exists, otherwise any ${v \in V}$ must also in ${\text{span}(v_1, \ldots, v_{i-1})}$ so ${V \subseteq \text{span}(v_1, \ldots, v_{i-1})}$ and we also have ${\text{span}(v_1, \ldots, v_{i-1}) \subseteq V}$ so a finite list of vectors span ${V}$, a contradiction. Note that since ${v_1, \ldots, v_{i-1}}$ is linearly independent and ${v_i \not\in \text{span}(v_1, \ldots, v_{i-1})}$ so by Proposition 3 then ${v_1, \ldots , v_i}$ is linearly independent. By continuing the process, we can construct such sequence of vectors. $\Box$

Two examples taken from the book use Proposition 4 to solve:

Problem 1 Prove that ${\mathbf{F}^{\infty}}$ is infinite-dimensional.

Proof: With ${\mathbf{F}^{\infty}}$, consider the sequence ${(1,0, \ldots),(0,1,0, \ldots),(0,0,1,0,\ldots), \ldots}$ then from Exercise 4, we follow ${\mathbf{F}^{\infty}}$ is infinite-dimensional. $\Box$

Problem 2 Prove that the real vector space of all continuous real-valued functions on the interval ${[0,1]}$ is infinite-dimensional.

Proof: Consider sequence of continuous real-valued functions defined on ${[0,1]}$: ${1,x,x^2,x^3, \ldots}$ and we are done. $\Box$

1.1. Steinitz exchange lemma

From Proposition 3, we can also deduce to following lemma.

Theorem 5 (Linear Dependence Lemma) Suppose ${v_1, \ldots, v_m}$ is linearly dependent list in ${V}$. There exists ${j \in \{1,2, \ldots, m \}}$ such that the following holds:

1. (a) ${v_j \in \text{span}(v_1, \ldots, v_{j-1})}$.
2. (b) if the ${j}$th term is removed from ${v_1, \ldots, v_m}$, the span of the remaining list equals ${\text{span}(v_1, \ldots, v_m)}$.

Condition ${(b)}$ can be followed from ${(a)}$. And from this lemma, we can prove Steinitz exchange lemma, which gives a bound for length of linearly independent list in finite-dimensional vector space.

Theorem 6 (Steinitz exchange lemma) In a finite-dimensional vector space, the length of every linearly independent list of vector is less than or equal to the length of every spanning list of vectors.

This theorem is also called as Steinitz exchange lemma or Replacement’s theorem. Here is my proof for this theorem.
Proof: Let ${v_1,v_2, \ldots, v_m}$ is linearly independent list and ${u_1,u_2, \ldots , u_n}$ is spanning list of vectors. It suffices to prove for the case that ${u_1, u_2, \ldots , u_n}$ is also linearly independent, otherwise if ${u_1,u_2, \ldots , u_n}$ is linearly dependent, according to Linear Dependence Lemma (2.21), we can take out some vectors from the list without changing the span and bring back to linearly independent list. Assume the contrary that ${n and from this assumption, we will prove that list ${v_1, \ldots , v_m}$ is linearly dependent.

Since ${u_1, u_2, \ldots , u_n}$ spans ${V}$ and is also linearly independent so for each ${v_i}$, there exists unique ${c_{i1},c_{i2}, \ldots , c_{in} \in \mathbf{F}}$ so ${v_i=c_{i1}u_1+c_{i2}u_2+ \ldots + c_{in}u_n}$. Hence, for ${a_1, \ldots , a_m \in \mathbf{F}}$, we have

$\displaystyle a_1v_1+a_2v_2+ \ldots + a_mv_m= u_1\sum_{i=1}^ma_ic_{i1}+u_2\sum_{i=1}^ma_ic_{i2}+ \ldots +u_n\sum_{i=1}^ma_ic_{in}.$

We will show that there exists ${a_1, \ldots , a_m}$ not all ${0}$ so ${a_1v_1+ \ldots + a_mv_m=0}$. Since ${u_1, \ldots , u_n}$ is linearly independent so form the above, we follow

$\displaystyle a_1v_1+ \ldots +a_mv_m=0 \iff \begin{cases} a_1c_{11}+a_2c_{21}+ \ldots + a_mc_{m1}=0, \\ a_2c_{12}+a_2c_{22}+ \ldots + a_mc_{m2}=0, \\ \qquad \qquad \qquad \ldots \\ a_mc_{1n}+a_2c_{2n}+ \ldots +a_mc_{mn}=0.\ \end{cases} \ \ \ \ \ (1)$

This system of equation has ${m}$ variable and ${n}$ equation with ${n. Here is an algorithm in finding non-zero solution for ${a_1, \ldots, a_m}$. Number the equations in (1) from top to bottom as ${(11),(12), \ldots, (1n)}$.

1. For ${1 \le i \le m-1}$, get rid of ${a_m}$ in ${(1i)}$ by subtracting to ${(1(i+1))}$ times some constant. By doing this, we obtain a system of equations where the first ${n-1}$ equations only contain ${m-1}$ variables, the last equation contains ${m}$ variables.
2. For ${1 \le i \le m-2}$, get rid of ${a_m}$ in ${(1i)}$ by subtracting to ${(1(i+1))}$ times some constant. We obtain an equivalent system of equations where the first ${n-2}$ equations contain ${m-2}$ variables, equation ${(1(n-1))}$ contains ${m-1}$ variables and and equation ${(1n)}$ contains ${m}$ variables.
3. Keep doing this until we get an system of equation so that the ${(1i)}$ equation contains ${m-n+i}$ variables.
4. Starting from equation ${(11)}$ with ${m-n+1}$ variables ${a_1, \ldots , a_{m-n+1}}$, we can pick ${a_1, \ldots , a_{m+n-1}}$ so that it satisfies equation ${(11)}$ and ${a_1, \ldots , a_{m-n+1}}$ not all ${0}$, which is possible since ${m \ge n+1}$.
5. Come to ${(12)}$ and pick ${a_{m-n+2}}$, to ${(13)}$ and pick ${a_{m-n+3}}$ until to ${(1n)}$ and pick ${a_m}$.

Thus, there exists ${a_1, \ldots a_m}$ not all ${0}$ so that ${a_1v_1+ \ldots + a_mv_m=0}$, a contradiction since ${v_1, \ldots , v_m}$ is linearly independent. Hence, we must have ${n \ge m}$. We are done. $\Box$

From this proof, we can see that this theorem relates to system of linear equation. The textbook has mush better solution.

1.2. Applications of Steinitz exchange lemma

From this Steinitz exchange lemma, we can prove a theorem about finite dimentional subspaces

Theorem 7 (2.26 Finite dimensional subspaces) Every subspace of a finite-dimensional vector space is finite-dimensional.

Proof: Let ${V=\text{span}(v_1,v_2, \ldots , v_m)}$ be a finite-dimensional vector space and ${U}$ be its subspace. Among vectors in the spanning list ${v_1,v_2, \ldots, v_m}$, let ${u_1,u_2, \ldots u_n}$ be vectors that are in subspace ${U}$, ${k_1,k_2, \ldots , k_{m-n}}$ be vectors that are not in ${U}$. If ${U=\text{span}(u_1,u_2, \ldots, u_n)}$ then we are done. If not, then there exists ${\ell \in U}$ so ${\ell \not\in \text{span}(u_1,u_2, \ldots , u_n)}$. Hence, since ${\ell \in \text{span}(v_1, \ldots , v_m)}$ so

$\displaystyle \displaystyle \ell =\sum_{i=1}^nu_ia_i+\sum_{j=1}^{m-n}k_ib_i$

for ${a_i,b_i \in \mathbf{F}}$ and there exists ${b_i \; (1 \le i \le m-n)}$ so that ${b_i \ne 0}$. Since ${u_i,\ell \in U}$ so that means ${\displaystyle h_1= \sum_{i=1}^{m-n}k_ib_i \in U}$ and also in ${\text{span}(k_1, \ldots , k_{m-n})}$ and ${h_1 \ne 0}$. We add ${h_1}$ to the list ${h_1,u_1,u_2, \ldots , u_n}$ in ${U}$. Similarly, we compare ${U}$ and ${\text{span}(h_1,u_1, \ldots ,u_n)}$ and may add another ${h_2 \in \text{span}(k_1,k_2, \ldots , k_{m-n})}$ onto the list. This process will keep going until we obtain a spanning list of ${U}$. We will prove that after adding at most ${m-n}$ such ${h_i}$, we can get ${U=\text{span}(h_1,h_2, \ldots , h_{j},u_1, \ldots , u_n)}$ with ${j \le m-n}$.
Indeed, assume the contrary that after adding at least ${m-n}$ such ${h_i \in U}$ so ${h_i \ne 0}$, we still can’t obtain a list of vectors that spans ${U}$. That means there exists ${\ell \in U}$ and ${\ell \notin \text{span}(h_1, \ldots , h_{m-n},u_1, \ldots , u_n)}$. Similarly, we can construct another ${,h_{m+n-1} \ne 0, h_{m-n+1} \in U}$ and ${h_{m-n+1} \in \text{span}(k_1, \ldots , k_{m-n})}$ based on ${\ell}$. Let ${h_i=c_{i1}k_1+c_{i2}k_2+ \ldots + c_{i,m-n}k_{m-n}}$ then we have

$\displaystyle \begin{array}{rcl} & a_1h_1+ \ldots + a_{m-n+1}h_{m-n+1}=k_1 \\ \iff & \begin{cases} a_1c_{1,1}+a_2c_{2,1}+ \ldots + a_{m-n+1}c_{m-n+1,1}=1, \\ a_1c_{1,2}+a_2c_{2,2}+ \ldots + a_{m-n+1}c_{m-n+1,2}=0, \\ \qquad \qquad \qquad \ldots \\ a_1c_{1,m-n}+ \ldots +a_{m-n+1}c_{m-n+1,m-n}=0. \end{cases} \end{array}$

This is a system of ${m-n}$ equations with ${m-n+1}$ variables ${a_1, \ldots , a_{m-n+1} \in \mathbf{F}}$ so similarly to Theorem 6‘s proof, there are infinite number of ${a_i}$ so ${a_1h_1+ \ldots +a_{m-n+1}h_{m-n+1}=k_1}$. Since ${h_i \in U}$ so ${k_1 \in U}$, a contradiction. Thus, there are at most ${m-n}$ such ${h_i \in U \; (1 \le i \le j)}$ so that there does not exist ${\ell \in U}$ but ${\ell \notin \text{span}(h_1, \ldots , h_j,u_1,\ldots ,u_n)}$. Note that ${\text{span}(h_1, \ldots , h_j,u_1,\ldots ,u_n) \subseteq U}$ so this can only mean ${U= \text{span}(h_1, \ldots , h_j,u_1,\ldots ,u_n)}$. Thus, ${U}$ is a finite-dimensional vector space. $\Box$

Remark 1 Both my proofs for the two theorems use the result that a system of ${m}$ equations with ${n}$ variables in ${\mathbf{R}}$ with ${m has infinite solution in ${\mathbf{R}}$.

Back to the focus of this post, let me show how Steinitz exchange lemma can identify a particular list of vectors linearly (in)dependent. Problems below are taken from exercises in Sheldon Axler’s book.

Problem 3 Explain why there does not exist a list of six polynomials that is linearly independent in ${\mathcal{P}_4(\mathbf{F})}$.

Proof: Because ${\mathcal{P}_4(\mathbf{F})=\text{span}(1,x,x^2,x^3,x^4)}$ and from Theorem 6 (2.23) length of any linearly independent list is less than length of a spanning list ${1,x,x^2, x^3,x^4}$, which is ${5}$. That explains why we can’t have ${6}$ polynomials that is linearly independent in ${\mathcal{P}_4(\mathbf{F})}$. $\Box$

Problem 4 Explain why no list of four polynomials spans ${\mathcal{P}_4(\mathbf{F})}$.

Proof: Because ${1,x,x^2,x^3,x^4}$ is a linearly independent list of length ${5}$ in ${\mathcal{P}_4(\mathbf{F})}$ and according to Theorem 6 (2.23) length of any spanning list is greater than or equal to length of a linearly independent list, which is ${5}$. Thus, we can’t have four polynomials that spans ${\mathcal{P}_4(\mathbf{F})}$. $\Box$

Problem 5 Suppose ${p_0,p_1, \ldots, p_m}$ are polynomials in ${\mathcal{P}_m(\mathbf{F})}$ such that ${p_j(2)=0}$ for each ${j}$. Prove that ${p_0,p_1, \ldots, p_m}$ is not linearly independent in ${\mathcal{P}_m(\mathbf{F})}$.

Proof: Nice problem. Since ${p_j(2)=0}$ for each ${j}$ so that means ${x-2}$ divides each ${p_j}$. Let ${q_i=\frac{p_i}{x-2}}$ then ${q_0,q_1, \ldots , q_m}$ are polynomials in ${\mathcal{P}_{m-1}(\mathbf{F})}$. If ${q_0,q_1, \ldots, q_m}$ is linear independent but it has length ${m+1}$ larger than length ${m}$ of spanning list in ${\mathcal{P}_{m-1}(\mathbf{F})}$, a contradiction according to Theorem 6. Thus, ${q_0,q_1, \ldots, q_m}$ is linear dependent. That means there exists ${a_i \in \mathbf{F}}$ not all ${0}$ so ${\sum_{i=0}^ma_iq_i=0}$. This follows ${(x-2)\sum_{i=0}^ma_iq_i=0}$ or ${\sum_{i=0}^m a_ip_i=0}$. Hence, ${p_0, \ldots, p_m}$ is linearly dependent. $\Box$

Reading a solution using algebraic integers of a HMMT number theory problem

I tried to attack the following problem but eventually gave up.

Problem 1 (HMMT Ferb 2015 Team P9) Let ${z=e^{\tfrac{2\pi i}{201}}}$ and ${\omega= e^{\tfrac{2 \pi i}{10}}}$. Prove that

$\displaystyle A=\displaystyle \prod_{a=0}^{9}\prod_{b=0}^{100}\prod_{c=0}^{100}(\omega^a+z^b+z^c)$

is an integer and find its remainder upon division by ${101}$

The only idea I have is to expand this expression, but it was too complex to do. So I decided to read the solution using algebraic integers. I’m new at algebraic number theory and I don’t know much about algebraic integers. My only reference is Problems from the book (PFTB), Chapter 9.

Posted in Number Theory, Olympiad Math | | Leave a comment

Yellowstone Permutation – St. Petersburg MO 2015, grade 9, 2nd round

This problem was given to the St. Petersburg Olympiad 2015, 2nd round, grade 9 and is known as Yellowstone Permutation. You can see here for the article about this sequence and here for the OEIS entry.

Problem. (Yellowstone Permutation) A sequence of integers is defined as follows: $a_1=1,a_2=2,a_3=3$ and for $n>3$, $a_n$ is the smallest integer not occurring earlier, which is relatively prime to $a_{n-1}$ but not relatively prime to $a_{n-2}.$ Prove that every natural number occurs exactly once in this sequence.

M. Ivanov

Here is my proof for this interesting problem. Another proof can be found in the article given link above.

Posted in Number Theory, Olympiad Math | Tagged , , | Leave a comment

The normal approximation to the binomial distribution

A coin is flipped $n$ times with probability of getting heads is $p$. This is a binomial approximation $X \sim B(n,p)$. The Year 12 MATH B textbook gives an normal approximation to this: $B(n,p) \sim N(np,np(1-p))=N(\mu, \sigma^2)$ where $\mu=E[X]$ is the mean (or expected value) and $\sigma$ is the standard deviation. Since the textbook doesn’t give a proof for this so I will go and prove.

IMO 2016

IMO 2016 was held from 6/7/2016 to 16/07/2016. Although I didn’t attend this, I just want to post my solutions and how I think about the difficulty of the problems, which can be found here. I will post my solutions on this post. Many awesome solutions can be found on AoPS.

Posted in Contests, Olympiad Math | Tagged , , , | 1 Comment

Bài hình học Tuần 4 tháng 12 – 2015 của thầy Trần Quang Hùng

Thầy Trần Quang Hùng có một chuyên mục Mỗi tuần một bài toán trên blog của thầy. Dưới đây là bài toán của thầy Hùng đưa lên trong Tuần 4 tháng 12.

PROBLEM. Cho tam giác $ABC$ nhọn với đường cao $AH$ và tâm ngoại tiếp $O$. Đường thẳng qua $H$ vuông góc với $OH$ lần lượt cắt $CA,AB$ tại $E,F$. Gọi $M,N$ theo thứ tự là trực tâm của tam giác $OFB$$OHB$$P,Q$ theo thứ tự là trực tâm tam giác $OEC, OHC$. Chứng minh rằng $MN,PQ,EF$ đồng quy.

Sau đây là lời giải của tôi được đưa lên trong topic Mỗi tuần một bài toán trên DDTH.

Posted in Geometry, Olympiad Math | | Leave a comment