Open Questions: Elliptic Curves and Modular Forms


Prerequisites: Algebraic number theory -- The Riemann hypothesis

See also: Number theory -- Algebraic geometry -- The Langlands program

Introduction

Elliptic curves

Arithmetic and group structure on elliptic curves

L-functions and zeta functions of elliptic curves

The Birch and Swinnerton-Dyer conjecture

Complex function theory and elliptic curves

Lattices, complex tori, and the modular group

Modular functions and modular forms

Modular elliptic curves

Kronecker's Jugendtraum

Open questions


Recommended references: Web sites

Recommended references: Magazine/journal articles

Recommended references: Books

Introduction

The subject of elliptic curves is a lot like a major city, in which many highways and railroad lines converge, the airport serves as a hub of major airlines, and a seaport connects with important inland waterways. In addition, the city is home to large factories which take in a variety of raw materials and turn them into manufactured goods of many kinds for export. From this city commerce flows in many directions. In short, the city is a crossroads where people and goods from many diverse places of origin come together, interact, and eventually leave for other far-flung ports of call.

In the case of elliptic curves, the commerce is in such things as ideas, concepts, problems, and theorems rather than raw materials and finished products, but the pattern is the same. An impressive number of ideas and problems from other parts of mathematics turn up when considering even seemingly simple questions about elliptic curves. And many techniques, methods, and results arising out of the study of elliptic curves have been generalized, extended, and exported to diverse other areas of mathematics, sometimes with astonishing consequences.

The famous Last Theorem of Fermat, for example, was finally proven as a mere corollary to a deep, difficult, but beautiful theorem about elliptic curves.

Elliptic curves can be thought of in many different ways, but perhaps the simplest and most intuitive is in terms of plane curves. Everyone is familiar with plane curves such as ellipses and parabolas. These can be defined using a polynomial equation in two unknowns of the form F(x,y) = 0, where all terms in the equation have degree two or less. Such curves are nothing but the conic sections (including also circles and hyperbolas) that were extensively studied by the ancient Greeks.

If you take the same kind of equation and allow terms of degree three or less (x^3 or x^2y, for example), then the resulting curves are called elliptic curves. The name is confusing, as ellipses themselves are conic sections, not elliptic curves. The latter are only indirectly related to ellipses, because they occur implicitly in formulas for the arc length of an ellipse. Nevertheless, the nomenclature is firmly established, and we're stuck with it.

Elliptic curves are far more interesting than conic sections, as will be apparent from the discussion here. Historically, they have played an important role in such diverse mathematical subjects as number theory, complex analysis and Riemann surface theory, and algebraic geometry. Here are some of the active areas of mathematics to which the theory of elliptic curves has substantial connections:

And there are still a number of significant open questions specific to the theory of elliptic curves themselves, such as the conjecture of Birch and Swinnerton-Dyer which would give a much more precise description of the beautiful arithmetic that exists for points on elliptic curves.

Of course, all of this inevitably sounds rather vague and nebulous unless and until you know what an elliptic curve is in the first place. So, what is it?


Elliptic curves

To repeat, an elliptic curve is not an ellipse! The reason for the name is indirect. It has to do with "elliptic integrals", which arise in computing the arc length of an ellipse. But this happenstance of nomenclature isn't too significant, since an elliptic curve has different, and much more interesting, properties compared to an ellipse.

Nevertheless, in one sense, an elliptic curve is a type of curve that is just one step more complicated than an ellipse. The simplest of all "curves" are straight lines, each of which is the locus of points (i. e., x-y pairs) where x and y are related by an equation like this:

y = Ax + B
(For the immediate purposes here of talking about equations, capital letters will denote constants.)

The next step up in complexity would involve equations like this:

y = Ax^2 + Bx + C
where the right hand side has a quadratic or "second-degree" polynomial in x. The curve which is the locus of points satisfying such an equation is a parabola. We could continue to use polynomials of higher degree on the right to produce cubic curves, quartic curves, etc. But in some sense these are not really more complicated than a parabola.

To get something that might be more complicated, we can consider equations where y also occurs as a square, such as

y^2 = Ax^2 + Bx + C
This equation represents an ellipse when A is negative (when A is positive the curve is a hyperbola instead). It would still be an ellipse if there were also a first-degree term in y (because that could be eliminated by suitably shifting the coordinates with a change of variables). As you probably recall, both parabolas and ellipses are examples of curves known as "conic sections", because they can be obtained by cutting a cone in a suitable way. If you cut the cone parallel to a side, you get a parabola. If you cut it so as to obtain a closed curve for the cut, you get an ellipse (a special case of which is a circle, if the cut is perpendicular to the axis of the cone). Any other sort of cut yields another conic section called a hyperbola. Conic sections were studied by the Greeks -- which was a respectable achievement, since they had none of our algebraic techniques for representing curves.

Once y occurs to degree 2 in the equation of a curve, very interesting things start to happen as you raise the degree of the polynomial in x on the right side of the equation. Now something definitely new appears when the polynomial in x has degree 3 or more. In particular...

An elliptic curve (in one, narrow, sense) is the locus of points in the x-y plane that satisfy an algebraic equation of the form

y^2 = Ax^3 + Bx^2 + Cx + D
(with an additional technical condition to avoid "degenerate" cases -- specifically, the roots of the polynomial on the right should be distinct). Such equations would seem to be somewhat less general in form than a polynomial equation such as F(x,y) = 0 where mixed terms like x^2y occur. However, in an important sense, any curve defined by such an equation is equivalent to one defined by the special form given above, in fact to a curve with an even more restricted form (as explained later).

We'll see soon enough that an elliptic curve is much more interesting and complicated than any of the conic sections, but it's not easy to explain in an elementary way just why this is the case. It has to do, of course, with the fact that y appears in the equation to the second degree. The variable y is still in some sense a function of x, where y can be called the "dependent" variable and x the "independent" variable. And yet obviously the relationship is more complicated, because in order to get a value of y from any given value of x one has to take a square root. This introduces more complexity because taking square roots isn't a single-valued operation -- there are two square roots (one positive, one negative) of any positive number. This fact results in far more complexity than you might suppose. Dealing with such multiple-valued functional relationships leads directly to rather deep mathematics -- the theory of "Riemann surfaces".

Mathematicians say that in such a circumstance where variables x and y are related by a polynomial equation, in which both appear to degrees higher than 1, that y is an "algebraic" function of x (and x is an algebraic function of y). Algebraic functions began to be studied rigorously in the 19th century, and quickly yielded some very beautiful mathematics, including the theory of Riemann surfaces -- and elliptic curves.

Finding "all" solutions of polynomial equations of this kind is not in the least a trivial matter when higher degree polynomials are involved. The theory of how to find, or at least describe, such solutions, where one or more polynomial equations are involved, has grown into the large and extremely complicated area of modern mathematics known as "algebraic geometry". The theory is so-named because it generalizes and applies sophisticated algebraic techniques to the study of geometric curves and surfaces such as conic sections. One reason the theory of elliptic curves is so appealing is that such curves represent the simplest case which is still difficult enough (in spades!) to be really interesting.

So let's get back to the story at hand.

There is another level of subtlety here. So far we have been deliberately vague as to what sort of values x and y represent. In the simplest case, they are elements of the field of real numbers, denoted by R. This is what one learns to deal with in high school when graphs of curves (such as conic sections) in the Cartesian x-y plane are discussed. In this case, the coefficients of the equation (A, B, C, D) are assumed to be real numbers as well.

However, the field R is only one of a number of fields we could consider in which the equation of an elliptic curve makes sense. Abstractly, a field is defined as a set of objects which has two related group structures. Intuitively, it is fine to think of these structures as the operations of addition and multiplication which are familiar in the field R.

In a little more detail, to say that an operation corresponds to a group structure is to say that it satisfies a few axioms. Using "+" to denote the operation of "addition" and "×" to represent the operation of "multiplication", the axioms require that the operations be associative, i. e. A+(B+C) = (A+B)+C, and A×(B×C) = (A×B)×C. There must be an identity element. For the + operation in a field, the identity is usually symbolized by 0: A+0 = 0+A = A. The identity for the × operation is symbolized by 1: A×1 = 1×A = A. Except for the additive identity with respect to multiplication, every element has an inverse for both operations. "-A" is the additive inverse of A: A + (-A) = A - A = 0. "1/A" or "A^(-1)" is the multiplicative inverse: A×A^(-1) = A/A = 1. (If A = 0, its multiplicative inverse is not defined, but 0 × A = 0 for any A.) In a field, both operations should be "commutative": A + B = B + A and A × B = B × A. (I. e., the groups are "Abelian".) And finally, it is required that × be "distributive" with respect to +: A×(B+C) = A×B+A×C. A field, then, is just a system of objects (that may be thought of as "numbers") which satisfy the familiar laws of arithmetic.
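As a small illustration of these axioms, the following sketch checks them by brute force for the integers modulo n. The function name is_field is hypothetical, and arithmetic modulo n is chosen here only as a convenient assumption for a finite example. The check should succeed when n is prime and fail otherwise, because some nonzero element then has no multiplicative inverse.

```python
from itertools import product


def is_field(n):
    """Brute-force check of the field axioms for the integers modulo n."""
    if n < 2:
        return False  # a field needs distinct elements 0 and 1
    elems = range(n)

    def add(a, b):
        return (a + b) % n

    def mul(a, b):
        return (a * b) % n

    # Associativity of both operations, and distributivity of x over +.
    for a, b, c in product(elems, repeat=3):
        if add(a, add(b, c)) != add(add(a, b), c):
            return False
        if mul(a, mul(b, c)) != mul(mul(a, b), c):
            return False
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
            return False

    # Commutativity of both operations.
    for a, b in product(elems, repeat=2):
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):
            return False

    # Identities and inverses (0 is exempt from multiplicative inverses).
    for a in elems:
        if add(a, 0) != a or mul(a, 1) != a:
            return False
        if not any(add(a, b) == 0 for b in elems):
            return False
        if a != 0 and not any(mul(a, b) == 1 for b in elems):
            return False
    return True


print(is_field(7), is_field(6))
```

Modulo 6 the check fails at the last step: 2 × B is always even modulo 6, so 2 has no multiplicative inverse.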

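Besides the field R of real numbers, there are several other fields which are important in general, and for the theory of elliptic curves in particular: the field Q of rational numbers, the field C of complex numbers, and, for each prime number p, the finite field F_p of integers modulo p.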

It is ambiguous to speak of an elliptic curve as defined by the equation
y^2 = f(x) = Ax^3 + Bx^2 + Cx + D
without specifying a field K which is to contain the solutions of that equation. This field K is called the "field of definition of the curve". "Points" on the curve are then pairs of numbers (x,y) with both x and y members of K. In addition, it is assumed that the coefficients A, B, C, D are also members of K. Symbolically, f(x) ∈ K[x], where K[x] is the set of polynomials having coefficients in K.

Nevertheless, it is often convenient to leave the field of definition K unspecified. This is because when the defining polynomial is in K[x], it is also in K′[x] for fields K′ with K ⊆ K′. And if there is a solution (x,y) with x and y ∈ K (or symbolically, (x,y) ∈ K×K), then (x,y) ∈ K′×K′ also. So a point (x,y) on a curve defined over K is on the curve defined over K′ ⊇ K as well. Many things that can be said about a specific curve really depend only on the equation and not on the field of definition. Furthermore, to study the properties of "the" curve defined by a particular equation, it is often useful to study the solutions of the equation in each of the fields mentioned above.

As a matter of notation, an elliptic curve corresponding to a particular equation is often denoted simply as "E". When we are concerned with points on the curve that have coordinates in a specific field K, we write E(K) for that set of points.

In some sense, C is the most "natural" field over which to define an elliptic curve because it has the property of being "algebraically closed". This means that every polynomial f(x) ∈ C[x] has the maximum number of possible roots (which is the same as the "degree" of the polynomial) lying in C rather than in some extension of C. Therefore, the resulting curve, E(C), which is a set of pairs (x,y) of numbers of the field, is as complete as possible.

However, some of the most interesting questions about an elliptic curve arise from considering the curve as defined over a smaller field, especially R or Q. In particular, the points E(R) can easily be graphed on a Cartesian coordinate system when the curve is considered to be defined over R. And some of the most interesting mathematics results from trying to determine the set of "rational points on the curve", i. e. E(Q), viewing the curve as being defined over Q.

So an elliptic curve is an object that is easily definable with simple high school algebra. Its amazing fruitfulness as an object of investigation may well depend on this simplicity, which makes possible the study of a number of much more sophisticated mathematical objects that can be defined in terms of elliptic curves.

We should note that there are other ways to define the notion of "elliptic curve". They can be proven to be (more or less) equivalent to the one we have used. In order to even give such definitions, however, it's necessary to use concepts and terminology from algebraic geometry. There one first defines the notion of "curve" in general (as a "variety" of "dimension" one). Next, one appeals to notions of topology to define the concept of "genus" -- roughly, the number of "holes" in the curve when considered as a Riemann surface. Finally, one defines an elliptic curve as a curve with genus = 1. Given this, it can be shown that curves of this sort correspond to a cubic equation such as we used, and conversely. This is interesting to know if/when you get into algebraic geometry, but requires dealing with more unfamiliar concepts than necessary to begin with.


Arithmetic and group structure on elliptic curves

How did mathematicians happen to get interested in elliptic curves in the first place? One reason, as we've seen, is that they are just the next step up in complexity from conic sections among algebraic plane curves. But the equations that define an elliptic curve are interesting for more than just that reason. They also represent an important case of "Diophantine" equations.

Saying that an equation is "Diophantine" is not actually saying something about the equation, but rather about the type of solution one is looking for. With a Diophantine equation (or system of equations), what is sought isn't the set of all solutions in real or complex numbers, but rather the set of solutions in rational numbers or integers. Mathematicians have been interested in this special case for over 2000 years. The term "Diophantine" itself goes back to Diophantus of Alexandria, who was the leading Greek mathematician of his time, about 250 CE, and who contributed much to the study of equations that came to be named after him. Diophantus' writings were lost for over 1000 years, but when they were rediscovered in 1570, it was found that he had originated the concept of negative numbers and techniques for solving algebraic equations in general.

However, the study of Diophantine equations didn't begin with Diophantus. For example, the sides of a right triangle satisfy the Pythagorean equation: A^2 + B^2 = C^2. This was known to the Egyptians, and to the Babylonians before that. It was useful to them to know of integer solutions to this equation -- triples of integers (A,B,C) -- because they could be used to mark off three sections of a long cord, and this in turn could be used to construct accurate right angles for a building. It was discovered that there are infinitely many such possible triples (which became known as Pythagorean triples), and new solutions could be constructed from known solutions.

There is another geometric problem which was considered in antiquity, and in this case it leads to a Diophantine equation which is cubic and defines an elliptic curve. The problem is to determine, for an integer n, whether there are any right triangles that have an area equal to n and sides which are rational numbers -- and if there are such triangles, to determine all of them. It turns out that this leads to this Diophantine equation:

y^2 = x^3 - n^2x
An integer n is said to be a "congruent" number if and only if a right triangle with rational sides and area n exists. It is known that certain integers are congruent (5, 6, 7, for example), while others are not (1, 2, 3, and 4). But a complete solution to the problem of giving necessary and sufficient conditions for n to be a congruent number is still an open question. It is known that n is a congruent number if and only if there are infinitely many rational solutions to the indicated equation, which amounts to the corresponding elliptic curve having infinitely many rational points. But deciding this last question is so difficult that it is still open, and it is intimately connected with a very sophisticated conjecture known as the Birch and Swinnerton-Dyer conjecture. We'll eventually explain that conjecture in more detail.
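To make the connection between triangles and the cubic equation concrete, here is a minimal sketch using exact rational arithmetic. It assumes the curve y^2 = x^3 - n^2x and one standard change of variables between a right triangle with legs a, b and hypotenuse c and a point (x,y) on that curve. The particular formulas and the function names are assumptions for illustration, not something stated in the text above.

```python
from fractions import Fraction


def on_curve(n, x, y):
    """True if (x, y) satisfies y^2 = x^3 - n^2 x."""
    return y * y == x**3 - n * n * x


def triangle_to_point(a, b, c):
    """Map a rational right triangle (legs a, b; hypotenuse c) to (n, (x, y))."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a * a + b * b != c * c:
        raise ValueError("not a right triangle")
    n = a * b / 2                      # area of the triangle
    x = n * (a + c) / b                # assumed change of variables
    y = 2 * n * n * (a + c) / (b * b)
    return n, (x, y)


def point_to_triangle(n, x, y):
    """Inverse map: a point with y != 0 gives the sides (a, b, c)."""
    n, x, y = Fraction(n), Fraction(x), Fraction(y)
    if y == 0:
        raise ValueError("points with y = 0 do not correspond to a triangle")
    return ((x * x - n * n) / y, 2 * n * x / y, (x * x + n * n) / y)


# The 3-4-5 triangle has area 6, so 6 is a congruent number.
n, (x, y) = triangle_to_point(3, 4, 5)
print(n, x, y, on_curve(n, x, y))
```

With these formulas the 3-4-5 triangle corresponds to the point (12, 36) on y^2 = x^3 - 36x, and the triangle with sides 3/2, 20/3, 41/6 (area 5) corresponds to the point (25/4, 75/8) on y^2 = x^3 - 25x.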

Just as with the Pythagorean equation, if our cubic equation has any rational solution with y ≠ 0, it will have infinitely many. This is because, given one solution, there is a procedure for generating another, and then a third, and so on. What this procedure boils down to is a way of "adding" two rational points lying on the curve to obtain another. For a general elliptic curve, this procedure will sometimes terminate after a finite number of steps, because it yields a solution which has already been found. Other times the procedure can go on to generate infinitely many different solutions.

Carl G. J. Jacobi (1804-51) was the first to recognize that this procedure for "adding" two points amounts to specifying a group structure on any elliptic curve. The existence of this group structure on the set E(K) of points of the curve is one of the most important facts about elliptic curves. It makes the theory amazingly rich. This group structure is just a way of "adding" two points on the curve to produce a third point that is also on the curve, and to do this in such a way that the standard group axioms are satisfied. The term "addition" is reasonable for this operation, because it is commutative. And all this is true more or less independently of what field K the curve is defined over. (There's a minor exception in the case of fields that have "characteristic 2", i. e. fields where x+x = 0 for all x ∈ K.)

There are several different ways to define this group structure. (The fact that there are almost always several ways to look at the same thing in the theory of elliptic curves is one of the most intriguing, or most confusing, things about it, depending on your point of view.)

The approach to defining the group structure which is most transparent is to regard the curve as a geometric plane curve in the Cartesian plane R^2. It is a simple fact that in this case a straight line that is not vertical intersects the curve in either one point or three, provided points of tangency are counted with multiplicity -- this is where the fact that the curve is cubic in x is crucial. (The proof is to substitute the equation of a line, y = mx + b, into the equation of the curve to yield a third degree polynomial in x alone, which has either one or three real roots.) So suppose you have any two distinct points P and Q on the curve. A line through those two points must then intersect the curve at a third point, say R = (x_R, y_R). R itself is not the sum of P and Q, but instead R′ = (x_R, -y_R) is. (If any point (x,y) ∈ E(R), then (x,-y) ∈ E(R) also.) This choice for the sum of points makes the set E(R) into a group under the + operation.

How does this work if you want to add a point to itself to get P + P? In that case, you take the tangent to the curve at P. This will also intersect the curve at another point. Here it is important that the curve have a well-defined tangent at every point. This is guaranteed if f(x) in the equation y^2 = f(x) has no repeated roots, in which case the curve is said to be "non-singular" -- and this is one of the requirements for a curve to be an elliptic curve.

Perhaps you can see one other difficulty with this definition if you draw a few elliptic curves. If P = (x,y) and Q = (x,-y) are on the curve, then the line joining them will be vertical. This will not appear to intersect the curve at any other point. In this case the point of intersection is defined as the "point at infinity", designated by O. This may seem like cheating, but it can be handled rigorously by using the notion of "projective coordinates" and working in the "projective plane" instead of R2. Indeed, working with a projective plane instead of the "ordinary" plane (sometimes called the "affine" plane) is standard operating procedure in algebraic geometry, because it automatically handles all the awkward special cases that have to be considered in finding where two curves intersect. The precise definition of projective coordinates isn't difficult, but we won't go into it, in order not to lengthen this discussion further.

The point at infinity O is not just some ugly kluge. It is a key ingredient in making E(R) into a group, because it turns out to be the identity element. In other words, P + O = P for all points P on the curve. (This equation is the reason O is used to stand for the point at infinity.) One can then define an inverse operation so that -P is the point Q such that P + Q = O. (Q is the unique point where a vertical line through P intersects the curve, i. e. the reflection of P across the x-axis.) One sets -O = O, of course.

All of these definitions would be meaningless, of course, unless the group axioms are satisfied by the + operation. Verification of the axioms can be done using straightforward but messy algebra. Such algebra yields an expression for the coordinates of P + Q in terms of the coordinates of P and Q. This expression could have been used from the beginning as an alternative definition of the + operation, but no one would have guessed it.

However, the explicit formula for the coordinates involves only rational operations (addition, subtraction, multiplication, division -- no extraction of roots). A very important consequence is that if the points P and Q have rational coordinates, then so does P + Q. Hence + also provides a group structure for the set E(Q) of rational points on the curve. An additional consequence is that the group operation in fact makes sense over almost any field, so that E(K) has a group structure when K is, for instance, a finite field. We'll see why the finite fields are important a little later.
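The chord-and-tangent construction can be sketched in a few lines. The sketch below assumes the curve has been put in the restricted form y^2 = x^3 + ax + b, represents the point at infinity O by None, and uses exact fractions so that rational inputs give rational outputs. The names ec_add and on_curve are hypothetical, and the formulas are the usual ones obtained by substituting the line into the cubic.

```python
from fractions import Fraction

O = None  # the point at infinity, which acts as the identity element


def on_curve(P, a, b):
    """True if P is O or satisfies y^2 = x^3 + a*x + b."""
    if P is O:
        return True
    x, y = (Fraction(c) for c in P)
    return y * y == x**3 + a * x + b


def ec_add(P, Q, a):
    """Add two points of y^2 = x^3 + a*x + b (b does not enter the formulas)."""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = (Fraction(c) for c in P)
    x2, y2 = (Fraction(c) for c in Q)
    if x1 == x2 and y1 == -y2:
        return O  # vertical line: Q = -P (this includes doubling a point with y = 0)
    if (x1, y1) == (x2, y2):
        m = (3 * x1 * x1 + a) / (2 * y1)  # slope of the tangent at P
    else:
        m = (y2 - y1) / (x2 - x1)         # slope of the chord through P and Q
    x3 = m * m - x1 - x2                  # x-coordinate of the third intersection
    y3 = m * (x1 - x3) - y1               # reflect the third point across the x-axis
    return (x3, y3)


# Two rational points on y^2 = x^3 - 36x and their sum.
P, Q = (-3, 9), (12, 36)
S = ec_add(P, Q, a=-36)
print(S, on_curve(S, -36, 0))
```

Only addition, subtraction, multiplication, and division appear, which is why the sum of two rational points is again rational. The same code would work over any field in which 2 can be inverted if Fraction were swapped for arithmetic in that field.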

Let's now return to the discussion of Diophantine equations. Suppose we have a polynomial equation in two variables with rational coefficients, such as F(x,y) = 0. There are four questions we might ask about it:

  1. Does it have any solutions in integers?
  2. If it has integer solutions, are there infinitely many?
  3. Does it have any solutions in rationals?
  4. If it has rational solutions, are there infinitely many?
Most of these questions are quite difficult in general, though the answers are known for various specific types of polynomials. Only the second question has a fairly good general answer. As with all four questions, the answer depends on the degree of the polynomial F(x,y), which is defined as the highest degree of any term in the polynomial (where a term like x^m y^n by definition has degree m+n). If the degree is 1 or 2, it is relatively easy to determine, based on the coefficients, whether there are infinitely many integer solutions. For example, an important theorem of Adrien Marie Legendre (1752-1833) gave necessary and sufficient conditions for the existence of rational solutions of quadratic equations in two variables. Further, if such an equation has any rational solutions, there is a constructive method for finding an infinite number of rational solutions.
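The constructive method for quadratic equations can be illustrated with the simplest example, the circle x^2 + y^2 = 1 (the choice of this curve is an assumption made for illustration). Starting from the known rational point (-1, 0), a line of rational slope t through that point meets the circle in exactly one other point, which is again rational. The function names below are hypothetical.

```python
from fractions import Fraction


def circle_point(t):
    """Second intersection of the line y = t*(x + 1) with x^2 + y^2 = 1."""
    t = Fraction(t)
    d = 1 + t * t
    return ((1 - t * t) / d, 2 * t / d)


def pythagorean_triple(p, q):
    """Clearing denominators for slope t = p/q gives an integer triple."""
    return (q * q - p * p, 2 * p * q, q * q + p * p)


for t in (Fraction(1, 2), Fraction(2, 3), Fraction(3, 7)):
    x, y = circle_point(t)
    print(t, (x, y), x * x + y * y == 1)
```

Each rational slope gives a different rational point, so one solution yields infinitely many. Slope 1/2, for instance, gives the point (3/5, 4/5), which is the 3-4-5 Pythagorean triple in disguise.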

For equations of degree 3 (elliptic curves, basically), the first results were found by Axel Thue in 1908, who showed that certain equations can have only a finite number of integer solutions. Thue used a technique called "Diophantine approximation", and his results were generalized in the 1920s by Carl L. Siegel to show that all third degree equations defining elliptic curves have at most a finite number of integer solutions. For equations of degree greater than 3, Louis J. Mordell made a famous conjecture in 1922 that (at least when the curve is nonsingular) there could be at most a finite number of rational solutions, and hence likewise at most a finite number of integer solutions. This case was much harder, though, and it wasn't proven until 1983, by Gerd Faltings.

Question 1, about the existence of any integer solutions, is extremely hard, and is still an open question in general.

What about rational solutions? This question is even more difficult in general. If the degree of the equation is higher than three, little is known beyond the fact, a consequence of Faltings' theorem, that a nonsingular curve has only finitely many rational points. If the degree is exactly three, we have essentially an elliptic curve, and trying to answer questions 3 and 4 is presently where most of the theoretical action is. Mordell gave a good partial answer in 1923 (based on a conjecture of Henri Poincaré in 1901), known as Mordell's Theorem. This result states that the group E(Q) of rational points on an elliptic curve is "finitely generated". This means that, if there are any rational solutions, then they can all be determined from a certain finite subset of them.

Unfortunately, there are two things that Mordell's result does not do. First, it provides no way to tell whether any rational points exist (other than the "point at infinity"). Second, it does not provide an "effective" means (i. e. an algorithm) for finding a set of generators for the group of rational points. In some cases Mordell's methods are able to do this. And it has been conjectured, but not yet proven, that the methods will work in all cases.

There is a general theorem about finitely generated abelian groups such as E(Q). It states that any finitely generated abelian group is the "direct sum" of the subgroup consisting of elements of finite order and zero or more copies of the additive group Z of integers. Symbolically, for any such group G:

G ≅ G_t ⊕ Z^r
In this formula, "≅" means "is isomorphic to" and "⊕" denotes the direct sum. (Isomorphic groups have identical structure and are indistinguishable as groups.) G_t is the "torsion subgroup" of G that consists of all elements of G that have finite order (i. e. for such an element g ∈ G, there is a nonzero integer n such that ng = 0). Z^r is the infinite group that is the direct sum of r copies of the integers. You can think of this group as consisting of r-tuples of integers with addition being defined in the obvious way (componentwise).

The number r is called the "rank" of the group. If G = E(Q) is the group of rational points of an elliptic curve, r is called the rank of the curve. Determining r theoretically and in practice is currently the main problem of arithmetic elliptic curve theory. No effective method is known for determining whether a particular elliptic curve has an infinite number of rational solutions (i. e. whether or not r > 0), and it isn't even known whether there are curves with arbitrarily large rank, though a bound isn't considered likely to exist. There isn't any good algorithm for calculating r in particular cases, either. The Birch and Swinnerton-Dyer conjecture is all about the number r.

As it happens, much more is known about the torsion part of the group E(Q), denoted by E(Q)_t. A theorem due to Elisabeth Lutz and Trygve Nagell in the 1930s showed how to compute E(Q)_t in any particular case. The theorem says that if E is the curve y^2 = x^3 + ax + b with a and b in Z, and if (x,y) ∈ E(Q)_t then x and y are integers and either y = 0 or y^2 divides 4a^3 + 27b^2. Hence there are only finitely many possible pairs (x,y) to test for membership in the torsion subgroup, and each test can be done in a finite number of steps.
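The finite search implied by this theorem can be sketched as follows. The code only enumerates the candidate points allowed by the divisibility condition. It does not carry out the second step of testing whether each candidate really has finite order. The function name and the simple brute-force search over x (bounded using the Cauchy bound on the roots of a monic polynomial) are assumptions chosen for brevity, not an efficient method.

```python
from math import isqrt


def nagell_lutz_candidates(a, b):
    """Integer points (x, y) on y^2 = x^3 + a*x + b with y = 0 or y^2 | 4a^3 + 27b^2."""
    d = 4 * a**3 + 27 * b**2
    if d == 0:
        raise ValueError("the cubic has a repeated root, so the curve is singular")
    # Allowed non-negative values of y: zero, or y^2 divides d.
    ys = [0] + [y for y in range(1, isqrt(abs(d)) + 1) if d % (y * y) == 0]
    points = []
    for y in ys:
        c = b - y * y                      # need an integer root of x^3 + a*x + c
        bound = 1 + max(abs(a), abs(c))    # Cauchy bound on the size of any root
        for x in range(-bound, bound + 1):
            if x**3 + a * x + c == 0:
                points.append((x, y))
                if y != 0:
                    points.append((x, -y))
    return sorted(points)


print(nagell_lutz_candidates(0, 1))   # the curve y^2 = x^3 + 1
```

Passing this test is necessary for a point to be a torsion point but not sufficient. For example, (-3, 9) on y^2 = x^3 - 36x appears among the candidates even though it has infinite order, so each candidate still has to be checked by repeatedly adding it to itself.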

In 1976 Barry Mazur proved that only 15 possible groups can occur as E(Q)_t for any elliptic curve. They are all very small groups, namely

Z/mZ for m = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12
or
Z/mZ ⊕ Z/2Z for m = 2, 4, 6, 8
Here Z/mZ denotes the cyclic group of order m, which is isomorphic to the set of integers modulo m under addition. (We explain this a little further in the next section.) Examples are known for which each of the possible groups actually occurs.


L-functions and zeta functions of elliptic curves

Given that the key question currently about elliptic curves is determining whether or not a given curve has infinitely many rational solutions, or more precisely, what the structure of the group of rational points is, let's look more closely at how this question might be approached.

Suppose the curve is defined by the equation F(x,y) = 0, where F(x,y) is a cubic polynomial with rational coefficients. If m is any nonzero integer, F(x,y) = 0 if and only if mF(x,y) = 0, so the roots of the equation are unchanged if we multiply by a suitable integer to clear the denominators of all coefficients. So without loss of generality we may suppose that F(x,y) has integer coefficients. If there is a rational solution (x,y) ∈ Q^2, then obviously there is a solution in R^2 as well.

A pervasive concept in number theory is the use of "modular" arithmetic. That is, one frequently considers the value of an integer m "modulo" a positive integer n. The value of m modulo n is simply the remainder of m on division by n. One writes m ≡ a (mod n) if a is the remainder, i. e. if m = bn + a for some integers a and b, where 0 ≤ a < n -- or more generally, if m - a is divisible by n. The set of possible remainders of integers modulo n is denoted by Z/nZ, and it has a group structure which results from performing addition in Z and then taking remainders modulo n. (In fact, it has a "ring" structure as well, if multiplication is handled the same way.)

Given that, if F(x,y) has integer coefficients, we can reduce them modulo n for any positive n ∈ Z, and any equation becomes an equivalence modulo n. Hence if F(x,y) = 0, then also F(x,y) ≡ 0 (mod n) for any n. Suppose that (a,b) is a rational solution of F(x,y) = 0. As long as p is a prime number that doesn't divide the denominator of either a or b, we can easily make a corresponding solution of F(x,y) ≡ 0 (mod p), and in fact of F(x,y) ≡ 0 (mod p^m) for any m > 0. So the existence of rational solutions of F(x,y) = 0 implies the existence of solutions in Z/p^mZ for most primes p and all m > 0.
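Here is a minimal sketch of how a rational solution is turned into a solution modulo p. A fraction a/d with d not divisible by p is sent to a times the inverse of d modulo p. The example point (-144/25, -504/125) on y^2 = x^3 - 36x and the function name are assumptions chosen for illustration. The three-argument pow with exponent -1 requires Python 3.8 or later.

```python
from fractions import Fraction


def reduce_mod_p(q, p):
    """Image of the rational number q in Z/pZ (p must not divide the denominator)."""
    q = Fraction(q)
    if q.denominator % p == 0:
        raise ValueError("p divides the denominator, so q has no image modulo p")
    inverse = pow(q.denominator, -1, p)   # modular inverse of the denominator
    return q.numerator * inverse % p


# A rational point on y^2 = x^3 - 36x, reduced modulo several primes.
x, y = Fraction(-144, 25), Fraction(-504, 125)
for p in (7, 11, 13):
    xp, yp = reduce_mod_p(x, p), reduce_mod_p(y, p)
    print(p, (xp, yp), (yp * yp - xp**3 + 36 * xp) % p == 0)
```

The prime 5 is excluded here because it divides both denominators. Every other prime gives a solution of the congruence, which is the sense of "most primes" in the paragraph above.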

Legendre's theorem alluded to previously says that for quadratic polynomials, these conditions are not only necessary but also sufficient for the existence of rational solutions of F(x,y) = 0. Unfortunately, this isn't the case for polynomials of degree higher than 2. There is no simple analogue of Legendre's theorem for cubic (elliptic) or higher degree curves. Nevertheless, in some sense it seems that there ought to be conditions involving solvability of an equation in R and suitable algebraic properties of the equation for each prime number p (and its powers) that guarantee a solution in Q. This somewhat vague idea is called the "Hasse principle". It is named after Helmut Hasse, whose name will come up frequently in further discussions of elliptic curves. Hasse, in fact, along with Hermann Minkowski proved a generalization of Legendre's theorem -- called the Hasse-Minkowski theorem -- that says a homogeneous (all terms have the same degree) quadratic polynomial in n variables has a solution in Q if and only if it does in R and in the "p-adic numbers" for all primes p. ("P-adic numbers" will be explained a little later.)

So we have, at least, a clue that we should examine solvability of the equation for an elliptic curve when the equation is reduced modulo p, for primes p. We have to be a little careful about what primes we work with. If the cubic equation is in the form

$$y^2 = f(x) = x^3 + Ax^2 + Bx + C$$
(which can always be arranged without significantly changing the curve), then it is important that the polynomial f(x) have distinct roots. When this is so, the curve is said to be "nonsingular". This is a technical condition which is necessary in order for the group structure to be defined. If the roots of f(x) are $\alpha_1$, $\alpha_2$, and $\alpha_3$, then we can define a quantity
$$\Delta = [(\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)(\alpha_2 - \alpha_3)]^2$$
called the "discriminant" of f(x), and hence the discriminant of the curve. Δ will be an integer if the coefficients of f(x) are. Clearly, we will have distinct roots if and only if Δ ≠ 0. Therefore, we have to exclude primes p that divide Δ. The prime p = 2 is also a problem, and has to be excluded.
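The discriminant can be computed without finding the roots. The sketch below assumes the standard closed formula for the discriminant of a monic cubic in terms of its coefficients (the formula is not given in the text), and the function names are hypothetical. It lists the primes that the discussion above says must be excluded.

```python
def cubic_discriminant(A, B, C):
    """Discriminant of x^3 + A x^2 + B x + C.

    Standard coefficient formula; it equals the squared product of the
    root differences used in the definition of Delta above.
    """
    return A*A*B*B - 4*B**3 - 4*A**3*C - 27*C*C + 18*A*B*C

def prime_factors(n):
    """Sorted list of the distinct prime factors of a nonzero integer (trial division)."""
    n = abs(n)
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return sorted(factors)

def excluded_primes(A, B, C):
    """Primes dividing Delta, together with 2, for a nonsingular curve y^2 = f(x)."""
    delta = cubic_discriminant(A, B, C)
    if delta == 0:
        raise ValueError("f(x) has a repeated root: the curve is singular")
    return sorted(set(prime_factors(delta)) | {2})

# f(x) = x^3 - 25x has roots 0, 5, -5, so Delta = (5 * 5 * 10)^2 = 62500.
print(cubic_discriminant(0, -25, 0), excluded_primes(0, -25, 0))
```

For f(x) = x^3 - 3x + 2 = (x - 1)^2 (x + 2) the function returns 0, reflecting the repeated root.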

We are interested, then, in how many distinct points (x,y) satisfy the equation of the curve for x and y in Z/pZ. Now, these numbers modulo p actually make up a finite field which we earlier denoted by $F_p$, and this field has p elements. There are at most two values of y corresponding to each x. (Even in a finite field, an equation like $y^2 = C$ has at most two solutions.) Hence there are at most 2p ordinary solutions of the equation. Since it is also necessary to count one "point at infinity" as a solution, there can be at most 2p + 1 total solutions. Since the point at infinity is a solution, there is at least one solution. If N(p) stands for the number of solutions, then it satisfies

1 ≤ N(p) ≤ 2p + 1
This double inequality can also be written as
|p + 1 - N(p)| ≤ p
What this says is that N(p) ranges over values centered at p + 1 and no more than a distance of p from p + 1. Actually, N(p) must be a lot closer to p + 1 (for large p), and in fact
|p + 1 - N(p)| ≤ 2√p
This was conjectured by Emil Artin, and proven by Hasse in the 1930s. This says N(p) is "approximately" p + 1, and that makes sense, because in a finite field like $F_p$ (with p odd), half of the nonzero elements are perfect squares. (Because the elements of $F_p$ except for 0 form a cyclic group under multiplication.) Therefore, one expects that about half of the values of f(x) are perfect squares as x ranges over $F_p$.
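The count N(p) and Hasse's bound are easy to check by brute force for small primes. The following is a minimal sketch, not an efficient algorithm: the function name `count_points` and the sample curve y^2 = x^3 - 25x (whose excluded primes are 2 and 5) are illustrative assumptions.

```python
def count_points(A, B, C, p):
    """N(p): solutions of y^2 = x^3 + A x^2 + B x + C over F_p, plus the point at infinity."""
    # square_counts[v] = number of y in F_p with y^2 = v.
    square_counts = [0] * p
    for y in range(p):
        square_counts[y * y % p] += 1
    total = 1  # the point at infinity
    for x in range(p):
        total += square_counts[(x**3 + A * x * x + B * x + C) % p]
    return total

# Compare |p + 1 - N(p)| with 2*sqrt(p) for some primes of good reduction.
for p in (3, 7, 11, 13, 17, 19, 23, 29, 31):
    n_p = count_points(0, -25, 0, p)
    diff = p + 1 - n_p
    print(f"p = {p:2d}  N(p) = {n_p:2d}  p + 1 - N(p) = {diff:3d}  within bound: {diff * diff <= 4 * p}")
```

The running time grows linearly in p, which is fine for illustration but far too slow for large primes.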

Guided by the Hasse principle, if for many primes p there are a relatively large number of solutions in $F_p$, then there should be many rational points on the curve, which would mean a large value for the rank r. In other words, we should expect that if N(p) is large (relative to p) for many p, then r is also large.

Of course, so far this is extremely imprecise, so we need a way to make something that we can actually compute with. For each prime p, N(p) is the piece of data that is of interest. However, the above inequalities show that we have a somewhat better understanding about how the numbers

$$a_p = p + 1 - N(p)$$
behave for large p. These numbers represent how far N(p) is from p + 1, the center of its possible range.

It turns out that if one defines a complex function of a complex variable as a certain infinite product, then the numbers $a_p$ appear as coefficients in the Dirichlet series expansion of the function. (You may want to refer to the overview of the Riemann hypothesis for a much more detailed discussion of "Euler products" and "Dirichlet series".)

The function in question is this:

$$L(E,s) = \prod_{p \nmid \Delta(E)} (1 - a_p p^{-s} + p^{1-2s})^{-1} \prod_{p \mid \Delta(E)} (1 - a_p p^{-s})^{-1}$$
Here, Δ(E) is the discriminant of the curve E. The notation p|Δ(E) means that p divides the discriminant and p∤Δ(E) means p doesn't divide the discriminant. We recall that if p|Δ(E) then the curve is "singular" and the set $E(F_p)$ doesn't have a group structure, although the number of its elements, N(p), is still well-defined, and hence so is $a_p$. The curve is said to have "bad reduction" for such primes, and "good reduction" for all other primes.

L(E,s) is called the "L-function" of the elliptic curve E (over the field Q). More specifically, it is called a Hasse L-function from its association with Hasse. Calling it an L-function suggests it has a lot in common with Dirichlet and Dedekind L-functions that occur in the theory of prime numbers of Q and prime ideals of finite extension fields of Q as presented in connection with the Riemann hypothesis. By definition, it does have an "Euler product". And indeed, L(E,s) has an analytic continuation for all complex s and satisfies a functional equation. But this is a very deep result, which has long been conjectured, yet only established within the last few years.

However, it can be shown fairly easily from the definitions that L(E,s) has a Dirichlet series expansion:

$$L(E,s) = \sum_{n \ge 1} c_n/n^s$$
whose coefficients satisfy $c_p = a_p$ for prime p. This expansion is valid when the Euler product converges, which is for Re(s) > 3/2, by virtue of Hasse's bound on the size of $a_p$.

Zeta functions

If there are L-functions for elliptic curves, are there also zeta functions? Yes, but they too are defined in what may seem to be a roundabout way. Namely, the functions are defined in terms of finite fields K, where K has $q = p^m$ elements, so that K = $F_q$. For any n > 0, K has an extension field $K_n \supseteq K$ of degree n with $q^n = p^{mn}$ elements. We consider E as an elliptic curve over each $K_n$, and we use the notation $|E(K_n)|$ to denote the number of elements in $E(K_n)$. Given all that, the zeta function of E over K is defined as a "formal power series" in the "variable" T like so:
$$Z(E/K,T) = \exp\Bigl(\sum_{1 \le n < \infty} |E(K_n)|\, T^n/n\Bigr)$$
where by definition
$$\exp(X) = \sum_{0 \le n < \infty} X^n/n!$$
is itself a formal power series -- the "exponential". (The latter series is just the Taylor series of $e^X$ from calculus.) What's going on here is simply that we substitute one formal series into another.

There is a famous set of conjectures made in 1949 by André Weil (another very important name in the theory of elliptic curves) that pertain when the same definitions are used with any "smooth projective variety" V instead of an elliptic curve. A smooth projective variety is simply a generalization of a curve that is defined by the solution set of a system of polynomial equations instead of just a single equation. Varieties are the basic objects studied in algebraic geometry.

The Weil conjectures are as follows:

  1. Z(V/K,T) is a rational function of T, i. e. a quotient of two polynomials with coefficients in Q.
  2. There are factorizations of the polynomials occurring in the numerator and denominator of Z(V/K,T) where each factor $P_i$ has integer coefficients and the reciprocals of the roots of $P_i$ are complex numbers of absolute value $q^{i/2}$.
  3. Z(V/K,T) satisfies a functional equation, namely
    $Z(V/K, 1/(q^n T)) = \pm q^{n\chi/2}\, T^{\chi}\, Z(V/K,T)$   where n and χ are the dimension and Euler characteristic of V
All these conjectures have now been proved. Weil himself did so for curves (including elliptic curves). Less than 25 years later, in 1973, Pierre Deligne provided proofs for the conjectures in the stated generality. This was one of the peak accomplishments of 20th century mathematics.

The dimension of a complex variety is its dimension as a complex manifold (having points with coordinates in the complex numbers C). A curve, in particular an elliptic curve, defined by a single polynomial equation in two variables therefore has dimension 1. The Euler characteristic of an elliptic curve is a topological invariant, which happens to be 0. Therefore, for elliptic curves, the Weil conjectures take the form:

  1. $Z(E/K,T) = (1 - aT + qT^2) / [(1 - T)(1 - qT)]$   for some a ∈ Z
  2. $1 - aT + qT^2 = (1 - \alpha T)(1 - \beta T)$    with |α| = |β| = √q
  3. $Z(E/K, 1/(qT)) = Z(E/K,T)$
It's still not obvious why it makes sense to call Z(E/K,T) a zeta function. However, if you make the substitution $T = q^{-s}$, then you can define
$$\zeta_{E/K}(s) = Z(E/K, q^{-s}) = (1 - aq^{-s} + q^{1-2s}) / [(1 - q^{-s})(1 - q^{1-s})]$$
Then from the functional equation (3.) immediately above one has $\zeta_{E/K}(1-s) = \zeta_{E/K}(s)$, because replacing s by 1-s replaces $T = q^{-s}$ by 1/(qT). This shows symmetry about the line Re(s) = 1/2 like Riemann's zeta function. Furthermore, from part (2.) above, we can conclude that if $\zeta_{E/K}(s) = 0$, then $|q^s| = \sqrt{q}$, and hence Re(s) = 1/2.
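One practical consequence of relations (1.) and (2.) is that the single integer a determines the number of points over every extension field. Taking logarithms of the rational expression in (1.) and comparing with the defining series gives $|E(K_n)| = q^n + 1 - (\alpha^n + \beta^n)$, and the power sums $s_n = \alpha^n + \beta^n$ satisfy $s_0 = 2$, $s_1 = a$, $s_n = a s_{n-1} - q s_{n-2}$. The sketch below applies this recurrence; the function name is hypothetical, and the sample value a = 0 for y^2 = x^3 - x over $F_7$ comes from a hand count giving N(7) = 8, which is not in the text.

```python
def point_counts_over_extensions(a, q, n_max):
    """Return [|E(K_1)|, ..., |E(K_{n_max})|] given a = q + 1 - |E(K)| and q = |K|.

    Uses |E(K_n)| = q^n + 1 - s_n, where s_n = alpha^n + beta^n obeys
    s_n = a * s_{n-1} - q * s_{n-2} with s_0 = 2 and s_1 = a.
    """
    s_prev, s_curr = 2, a
    counts = []
    for n in range(1, n_max + 1):
        counts.append(q**n + 1 - s_curr)
        s_prev, s_curr = s_curr, a * s_curr - q * s_prev
    return counts

# y^2 = x^3 - x over F_7 has 8 points (including infinity), so a = 7 + 1 - 8 = 0.
counts = point_counts_over_extensions(a=0, q=7, n_max=4)
print(counts)

# Relation (2.) forces |q^n + 1 - |E(K_n)|| <= 2 * q^(n/2) for every n.
for n, c in enumerate(counts, start=1):
    print(n, (7**n + 1 - c) ** 2 <= 4 * 7**n)
```

Every entry in the list is a multiple of the first, as it must be, since E(K) is a subgroup of each $E(K_n)$.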

But, for this zeta function, that is precisely what the Riemann hypothesis states -- all (nontrivial) zeros of the function lie on the line Re(s) = 1/2. Keep in mind, this is a proven result for the zeta function $\zeta_{E/K}(s)$ which we have defined as the zeta function of an elliptic curve. The similarity of relation (2.) to the hypothesis which is still unproven about the zeros of Riemann's original zeta function is why relation (2.) is also called a "Riemann hypothesis". It is also good evidence that the original Riemann hypothesis should likewise be true.

Note that the numerator of $\zeta_{E/K}(s)$ when q = p and K = $F_p$ is a polynomial in $p^{-s}$ that is none other than the polynomial whose reciprocal appears in the product formula for the L-function L(E,s) for primes p with "good reduction". This is the explanation for the seemingly arbitrary definition of L(E,s).

There's more. Without going into too many details, if you have a congruence mod p such as $f(x_1, \ldots, x_n) \equiv 0 \pmod{p}$ then in addition to solutions in $F_p$, it is natural to consider solutions in extension fields $F_{p^n}$ as well. This amounts to working in the ring $F_p[x_1, \ldots, x_n]/(f)$, and in that ring there are maximal ideals which correspond to prime ideals of number fields. There is also a notion of "norms", like the norms of number fields.

Given all that, you can define an Euler product that we'll call ζ(s) (which is not Riemann's zeta function):

$$\zeta(s) = \prod_P (1 - (NP)^{-s})^{-1}$$
where the product is over maximal ideals P, and NP is the norm of the maximal ideal P. Now, this norm is defined as the number of elements of the residue field of P. (A commutative ring modulo a maximal ideal is a field.) Since the residue field is a finite field of characteristic p, one can write $NP = p^{\deg(P)}$ for some integer that we call the degree of P, deg(P). In this notation,
$$\zeta(s) = \prod_P (1 - p^{-s \deg(P)})^{-1}$$
Now make the substitution $T = p^{-s}$ and take logarithms to get
$$\log\Bigl(\prod_P (1 - T^{\deg(P)})^{-1}\Bigr) = \sum_{1 \le n < \infty} N_n T^n/n$$
where $N_n$ is the number of solutions of $f(x_1, \ldots, x_n) = 0$ with coordinates in $F_{p^n}$. (This evaluation makes use of the relation of deg(P) to the number of solutions of the equation.)

Finally, if you take the formal exp() of that series, you get (in the elliptic curve case) that the zeta function Z(E/K,T) as originally defined has an Euler product of the form $\zeta(s) = \prod_P (1 - (NP)^{-s})^{-1}$, which looks very much like the Dedekind zeta function of an algebraic number field.

Elliptic curves over finite extensions of Q

So far we have been considering mostly curves defined over Q and their solutions with coordinates in Q. It turns out that much the same theory can be proven for fields which are finite extensions of Q, which are called "number fields", because they are the subject of algebraic number theory. André Weil played a large part in the development of this generalization.

To begin with, Weil proved a generalization of Mordell's theorem, namely that the group E(K) is finitely generated, when K is any finite extension of Q. Accordingly, the group E(K) is often called the Mordell-Weil group.

Weil also defined generalizations of the Hasse L-functions L(E,s). In order to spell out this generalization for an extension K ⊇ Q we would have to talk about prime ideals in the ring of integers of K. For precision, we would have to invoke the language of algebraic number theory, which talks about such things as "places" and "valuations". The details are messy, but the basic idea is much the same as when K = Q. These more general L-functions are, naturally, called Hasse-Weil L-functions.

Lastly, Weil generalized a conjecture of Hasse's which states that the L-function of an elliptic curve (over Q or a finite extension) has an analytic continuation and satisfies a functional equation which relates values of the L-function at s and 2-s. In fact, the same is conjectured to be true of the L-functions when they are "twisted" by a Dirichlet character χ. (If L(E,s) has the Dirichlet series $\sum_n c_n/n^s$, then its twist is $L(E,s,\chi) = \sum_n c_n \chi(n)/n^s$.) The extended conjecture, of course, is known as the Hasse-Weil conjecture.

This conjecture postulates exactly what one would hope to be true, namely that the L-functions of elliptic curves (and their twists) have the same very symmetric properties as the classical Dirichlet L-functions, namely an analytic continuation and a functional equation. (One dare not ask at this point that they satisfy a Riemann hypothesis as well, which isn't established yet even for Riemann's zeta function.)

Until recently, the Hasse-Weil conjecture was known to be true in only two cases: when the elliptic curve had the property known as "complex multiplication", and when the elliptic curve had the property of being "modular". However, there was yet another conjecture to which Weil's name was associated -- the Shimura-Taniyama-Weil conjecture -- which held that all elliptic curves are modular. We'll say more about that later, but to give away the secret ahead of time, it is now known that this last conjecture is true -- so the Hasse-Weil conjecture is true as well.


The Birch and Swinnerton-Dyer conjecture

The conjecture associated with the names of B. J. Birch and H. P. F. Swinnerton-Dyer has evolved gradually. As we said, the conjecture is "all about" the rank of the finitely generated abelian group E(K), the Mordell-Weil group, when K is Q or a finite extension of Q. (In what follows, L(E,s) will mean the Hasse-Weil L-function for any number field, not necessarily Q.) The conjecture relates the rank to the value of the L-function L(E,s) at s=1. The crudest form of the conjecture simply says that the group is infinite (hence the rank is ≥ 1) if and only if L(E,s) has a zero at s=1: L(E,1) = 0.

Right away there is a problem. The Euler product of L(E,s) converges only for Re(s) > 3/2. It certainly does not converge at s=1. That doesn't mean the function L(E,s) is meaningless at s=1, however. Indeed, the Hasse-Weil conjecture already proposed that L(E,s) has an analytic continuation for all s ∈ C. This conjecture is now known to be true for all elliptic curves, but even without that, the Birch-Swinnerton-Dyer conjecture could subsume it to the extent of including the stipulation that L(E,s) is analytic at s=1.

Let's take a closer look. The Euler product contains factors of the form

$$(1 - a_p p^{-s} + p^{1-2s})^{-1} = (1 - (p+1-N(p))p^{-s} + p^{1-2s})^{-1}$$
If you plug in s=1 the factors are simply p/N(p). Now, by the Hasse principle, N(p) (the number of points in $E(F_p)$) should be large if and only if E(K) is large, i. e. infinite, with rank ≥ 1. As we saw, N(p) can't get too large. It's about p+1. So the typical term p/N(p) is a little less than 1. If the typical term didn't approach 1 too closely, then the infinite product $\prod_p p/N(p)$ should be zero. This is heuristic reasoning, of course, not rigorous, but it suggests that indeed we should have L(E,1) = 0 if and only if there are infinitely many points in E(K).

To refine this conjecture a little, we could look at the reciprocal of the infinite product. If you consider only a finite number of terms of that reciprocal, you should get a product that grows without limit as the number of terms increases if and only if the rank r ≥ 1. So perhaps we can find an asymptotic formula for this product that involves the number r. Based on extensive numerical computations for curves known to have nonzero rank, Birch and Swinnerton-Dyer conjectured the asymptotic relationship

$$\prod_{p<x} N(p)/p \sim C(\log x)^r \quad \text{for some constant } C$$
This form of the conjecture still doesn't involve the L-function directly. In some cases (such as when the elliptic curve E has the property of "complex multiplication") the Hasse-Weil conjecture was known to be true, hence L(E,s) had a known functional equation. The form of this equation makes it possible to compute L(E,1) directly. Not surprisingly, it turned out to be zero. But even more could be computed, namely the limit as s → 1 of $L(E,s)/(s-1)^r$. It was found that this tended to a finite but nonzero value C(E) as s → 1, which means that the Taylor series expansion of L(E,s) has the form
$$L(E,s) = C(E)(s-1)^r + \sum_{r<n<\infty} c_n (s-1)^n/n!$$
This means that the conjecture can be stated even more precisely as: The rank of E(K) is r if and only if r is the exact order of the zero of L(E,s) at s=1.
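The partial products $\prod_{p<x} N(p)/p$ that motivated the conjecture are easy to tabulate for small x. The sketch below does this for two sample curves that are not taken from the text: y^2 = x^3 - x, which is a standard example of rank 0, and y^2 = x^3 - 25x, a standard example of rank 1. The ranks are stated here as assumptions, the function names are hypothetical, and the primes 2 (and 5 for the second curve) are skipped as primes of bad reduction. No particular output is claimed; the point is only to see how the two products behave as x grows.

```python
def primes_below(x):
    """All primes p < x, by a simple sieve of Eratosthenes."""
    if x <= 2:
        return []
    sieve = [True] * x
    sieve[0] = sieve[1] = False
    for i in range(2, int(x**0.5) + 1):
        if sieve[i]:
            for j in range(i * i, x, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def count_points(A, B, C, p):
    """N(p) for y^2 = x^3 + A x^2 + B x + C over F_p, including the point at infinity."""
    square_counts = [0] * p
    for y in range(p):
        square_counts[y * y % p] += 1
    return 1 + sum(square_counts[(x**3 + A * x * x + B * x + C) % p] for x in range(p))

def partial_product(A, B, C, x, bad_primes):
    """Product of N(p)/p over primes p < x, skipping the listed primes of bad reduction."""
    product = 1.0
    for p in primes_below(x):
        if p not in bad_primes:
            product *= count_points(A, B, C, p) / p
    return product

for x in (100, 500, 2000):
    rank0 = partial_product(0, -1, 0, x, bad_primes={2})      # y^2 = x^3 - x
    rank1 = partial_product(0, -25, 0, x, bad_primes={2, 5})  # y^2 = x^3 - 25x
    print(f"x = {x:5d}   rank-0 curve: {rank0:.4f}   rank-1 curve: {rank1:.4f}")
```

Convergence in x is slow and erratic, so a table like this is only suggestive. Birch and Swinnerton-Dyer's own computations were far more extensive.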

But we needn't stop there. Considerably more can be said about the constant C(E). Theoretical studies suggested it should be a product of several factors. There are conjectural formulas for C(E) when K is any number field, but the simplest case is when K = Q:

$$C(E) = \Omega \, R(E/Q) \, |E(Q)_t|^{-2} \Bigl(\prod_p c_p\Bigr) |Ш(E/Q)|$$
All but one of the factors in this formula are well-understood and reasonably easily calculated. Ω is called the "period" of the curve. It is the integral of a differential form over E(R). R(E/Q) is called the "regulator" of the curve. It is the volume of a fundamental cell in a certain lattice that can be constructed when the rank of the curve is nonzero. $E(Q)_t$ is just the torsion subgroup of the Mordell-Weil group. The numbers $c_p$ are all 1 unless p | 2Δ(E/Q). In that case, E has bad reduction at p, and the corresponding $c_p$ describe roughly how "bad" the curve is over the p-adic field $Q_p$.

The last factor, |Ш(E/Q)|, is the order of a group called the Shafarevich-Tate group, after I. R. Shafarevich and John Tate. (Ш is the Cyrillic character "Sha".) This group reflects how badly the Hasse principle fails to hold for the given curve. Very little is known about Ш(E) in general, even whether or not it is finite. The Shafarevich-Tate conjecture states that the group is finite, and that conjecture is subsumed in this detailed form of Birch and Swinnerton-Dyer's conjecture. When E is a curve of known rank r and C(E) is computed, then if C(E) is divided by all of the other factors which can be computed, what is left has been found to be the square of an integer in all cases. This is precisely what would be expected for the order of Ш(E).

History and current status of the conjecture

The first concrete result that partially verified the conjecture was obtained by John Coates and Andrew Wiles in 1976, who proved that if E is a curve with "complex multiplication" and L(E,1) ≠ 0, then E(Q) is finite. The result said nothing about the converse, or what happens for curves without complex multiplication, but it was a good start.

In 1983 B. Gross and Don Zagier showed that if E is a modular elliptic curve and L(E,s) has a first order zero at s=1, then there are infinitely many rational points, so the rank is at least 1. This provides a partial converse for the Coates-Wiles result. In 1990 V. A. Kolyvagin improved on this to show, for modular elliptic curves, that L(E,1) ≠ 0 implies r = 0, thus extending the Coates-Wiles result to all modular curves. And further, L(E,1) = 0 but L′(E,1) ≠ 0 implies r = 1.

Given that all elliptic curves are now known to be modular, the conjecture (except for the explicit form of the leading term of L(E,s)) is now settled when $L(E,s) \sim c(s-1)^k$, with c ≠ 0 and k ≤ 1:

If the order of the zero of L(E,s) at s=1 is 0 or 1, then the rank of E(Q) is the order of the zero of L(E,s) at s=1.
It is also known that the Shafarevich-Tate conjecture is true if the order of the zero of L(E,s) at s=1 is 0 or 1.

Tantalizingly, almost nothing is known if L(E,s) has a zero of order more than 1 at s=1. For instance, it could be that the rank of E(Q) is 1, yet L(E,s) has a zero of order more than 1. Therefore, the conjecture is still an open question when L(E,1) is zero to order higher than 1. Interestingly enough, examples of elliptic curves are known with any r ≤ 12, and there is even an infinite family of curves with r ≥ 12, yet none of these cases (or any others) are known to have L(E,1) zero to an order more than 3.


Complex function theory and elliptic curves

As mentioned before, there always seems to be yet another way to look at almost anything related to elliptic curves. We're now going to look at another, somewhat deeper reason for the group structure. This will also provide a foundation for further remarks about complex function theory and topology as they relate to elliptic curves.

The case of elliptic curves in the complex numbers is especially interesting, not only because C is algebraically closed, but also because of the richness of calculus for complex functions. In particular, the equation of an elliptic curve defines y as an "algebraic function" of x. What makes algebraic functions somewhat tricky is that they are not in general single-valued. Clearly, with an equation of the form $y^2 = f(x)$, there are usually two choices for the square root of any complex number, so the appropriate value of y corresponding to any x is ambiguous.

Bernhard Riemann figured out how to solve this problem so that y could be given as a single-valued function of x for x in an appropriate domain of definition. Naturally, this domain came to be known as a "Riemann surface". The domain looks locally like a portion of the complex number plane, which means, in modern terms, that it is a 1-dimensional complex manifold. Such a manifold can be regarded as a "curve" over C, as well as a 2-dimensional manifold over R, i. e. a "surface". For y defined as an algebraic function of x by the equation $y^2 = f(x)$, where f(x) is a third degree polynomial in x, it turns out that the corresponding Riemann surface on which y is a single-valued function can be identified with the elliptic curve E defined as a locus of points in C -- namely E(C).

So an elliptic curve is a Riemann surface. In fact, it is of a special type: a compact Riemann surface of "genus" 1. There are several equivalent ways to define the numerical genus. Topologically, the genus counts the number of "holes" in a surface. For example, a surface with one hole is a torus. The converse is also true: every compact Riemann surface of genus 1 is an elliptic curve. In other words, elliptic curves over the complex numbers represent exactly the "simplest" sorts of compact Riemann surfaces with non-zero genus. We'll explain this in more detail later.

This topological equivalence of an elliptic curve with a torus is actually given by an explicit mapping involving a function, called the Weierstrass ℘-function, and its first derivative. This mapping is, in effect, a parameterization of the elliptic curve by points in a "fundamental parallelogram" in the complex plane.

That is a summary of the situation. It is very worthwhile to see step by step how this relationship between elliptic curve and Riemann surface actually comes about. In so doing, we'll also encounter some interesting functions of great importance, the so-called "modular functions" in particular.

Equivalence classes of elliptic curves

In order to do this right, we have to go back to the beginning. The most general cubic curve is defined by a polynomial equation F(x,y) = 0 where all terms of the equation have total degree at most three. So it could be as complicated as this:
$$ax^3 + bx^2y + cxy^2 + dy^3 + ex^2 + fxy + gy^2 + hx + iy + j = 0$$
Such an equation is very cumbersome to work with, and in fact much more complicated than we really need. The truth is that there are many equations which represent essentially the same information. For a given curve E we are first of all interested in its set E(Q) of rational points. Secondly, we are interested in the group properties of the sets E(Q), E(R), and E(C). We can therefore change the equation of the curve in any way we like as long as there is a 1-to-1 correspondence between the resulting point loci in such a way that the group structure is preserved. Such a correspondence is a group isomorphism.

It can be shown that if E is defined over Q and has at least one rational point (so E(Q) isn't empty) then there is a change of coordinates such that one has a much simpler equation:

$$y^2 = Ax^3 + Bx^2 + Cx + D$$
and there is an isomorphism between the "old" and the "new" groups of points. Such a change of coordinates is called a "birational transformation" because it is reversible and yields a 1-to-1 relationship between the rational points.

In fact, even more can be done, and the equation can be simplified further to the form:

$$y^2 = 4x^3 - ax - b$$
This is called the Weierstrass normal form of the equation of an elliptic curve, for reasons which will become clearer very soon.

Another way of stating this result is that birational equivalence is an equivalence relation on curves defined by a cubic equation. The groups E(Q), E(R), and E(C) are preserved (up to isomorphism) by this equivalence relation, and in every equivalence class there is a curve whose defining equation has the Weierstrass normal form.

Recall that it was also required that the curves we are considering be nonsingular. This means they have no "singular" points where a unique tangent to the curve doesn't exist, so the curve doesn't cross itself or have a "cusp". (This was necessary to be able to define the group structure.) This property of nonsingularity is also preserved by the birational equivalence relation. If the equation is given by $y^2 = f(x)$, nonsingularity is equivalent to the condition that f(x) have no repeated roots, and that in turn is equivalent to the requirement that the discriminant Δ of f(x) be nonzero. When f(x) is in Weierstrass normal form, Δ has an especially simple form:

$$\Delta = a^3 - 27b^2$$
So we must have Δ ≠ 0. The importance of this condition will appear several times.

Elliptic functions

Up to this point, we have obtained (in fact, defined) elliptic curves over C as the set of points $E(C) = \{(x,y) \in C^2 \mid y^2 = f(x)\}$ for a suitable cubic polynomial f(x). Now we are going to see that some elliptic curves, meaning the exact same set of points E(C), can be obtained in an entirely different way. We will then find that all elliptic curves can be obtained in this new way.

This new way of obtaining elliptic curves uses what are known as elliptic functions. Here again, the reason for calling these functions "elliptic" has to do with their relation to integrals used to calculate the arc length of an ellipse. These integrals (known, of course, as elliptic integrals) turn out to be exactly the way in which one can obtain an elliptic function corresponding to any elliptic curve.

At first glance, the definition of an elliptic function doesn't appear to involve either elliptic curves or elliptic integrals. Specifically, an elliptic function is defined as a doubly periodic meromorphic function. Recall that a meromorphic function is just a complex function which is defined and analytic on all of C, except for isolated poles. (A pole is a type of singularity -- a point near which values of the function are arbitrarily large.) A (singly) periodic function f(z) has the property that for some nonzero ω∈C, f(z+ω) = f(z) for all z. If this holds, then in fact f(z+nω) = f(z) for all z and all integers n. f(z) = sin(z) and f(z) = $e^z$ are examples of singly periodic functions, with ω = 2π and ω = 2πi, respectively. Similarly, a function is doubly periodic if there are two distinct $\omega_1$ and $\omega_2$ in C such that f(z) is periodic with respect to each of them, or equivalently, if $f(z + m\omega_1 + n\omega_2) = f(z)$ for all z and all integers m, n.

Suppose that F(z) is a doubly periodic meromorphic function (i. e. an elliptic function) with periods $\omega_1$ and $\omega_2$. If, in addition, F(z) is not a constant function, there are constraints on what periods are possible -- in particular, $\omega_1/\omega_2$ can't be a real number. This is because if F(z) is doubly periodic and nonconstant, it must have at least one pole, by a classic theorem due to Joseph Liouville (1809-82). Without loss of generality, it can be assumed one of its poles is at z=0. Further, since F(z) is meromorphic, it can't have poles arbitrarily close together, by another basic theorem about meromorphic functions. This implies that the ratio $\omega_1/\omega_2$ is not a real number, because if it were, we could approximate it by a rational number and hence choose m, n ∈ Z such that $m\omega_1 + n\omega_2$ is arbitrarily close to 0. But F(z) has a pole at 0, and by periodicity also at each point $m\omega_1 + n\omega_2$, so it would have two poles arbitrarily close to each other. Another way to say that $\omega_1/\omega_2$ isn't real is to say that $\omega_1$ and $\omega_2$ are "linearly independent" over R. So the net result is that the two periods of an elliptic function must be linearly independent over R.

Periodicity is a very special property that places significant constraints on the nature of functions that have the property. For example, it can be shown that there are no nonconstant meromorphic functions which are triply periodic. (Because any three distinct complex numbers must be linearly dependent over R. Take such a dependence relation, approximate the real coefficients by rationals, and you can make an argument similar to that of the last paragraph that a triply periodic function would have poles arbitrarily close together.)

Now, if we are given any elliptic function F(z) with periods $\omega_1$ and $\omega_2$ we can construct another elliptic function with the same periods and very special properties. In fact, we don't even need to start with an elliptic function F(z). We can simply start with any two complex numbers $\omega_1$ and $\omega_2$ such that $\omega_1/\omega_2$ isn't real.

If $\omega_1, \omega_2 \in C$ and $\omega_1/\omega_2 \notin R$, we define a lattice as a set of points in C of the form $\{m\omega_1 + n\omega_2 \mid m, n \in Z\}$. We'll use the symbol L or $L(\omega_1, \omega_2)$ to designate a lattice. L consists of all sums of integral multiples of $\omega_1$ and $\omega_2$. In the notation of group theory, L can be written as a direct sum: $L = Z\omega_1 \oplus Z\omega_2$. The numbers $\omega_1$ and $\omega_2$ that determine a lattice are called a basis of the lattice. In general, the basis isn't unique, but there are ways to add further conditions to specify a basis almost uniquely. When such a basis is chosen, the ratio $\tau = \omega_1/\omega_2$ is unique for a specific lattice, provided also that τ lies in a region of the upper half of the complex plane such that -1/2 < Re(τ) ≤ 1/2, |τ| ≥ 1, and Re(τ) ≥ 0 if |τ| = 1. This particular region has some importance which will be explained later.
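The text does not say how to find a basis whose ratio lies in that region. One standard procedure, sketched here as an assumption rather than as the method the author has in mind, alternates two changes of basis that leave the lattice unchanged: replacing $\omega_1$ by $\omega_1 - k\omega_2$ (which sends τ to τ - k) and replacing $(\omega_1, \omega_2)$ by $(-\omega_2, \omega_1)$ (which sends τ to -1/τ). The function name `reduce_tau` is hypothetical, and the code works in floating point, so points very near the boundary of the region may be handled imprecisely.

```python
import math

def reduce_tau(tau, max_steps=10_000):
    """Move tau (with Im(tau) > 0) into the region
    -1/2 < Re(tau) <= 1/2, |tau| >= 1, Re(tau) >= 0 if |tau| = 1,
    using only the moves tau -> tau - k and tau -> -1/tau."""
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    for _ in range(max_steps):
        # Translate by an integer so that -1/2 < Re(tau) <= 1/2.
        tau = complex(tau.real - math.ceil(tau.real - 0.5), tau.imag)
        on_circle_wrong_side = abs(tau) == 1 and tau.real < 0
        if abs(tau) >= 1 and not on_circle_wrong_side:
            return tau
        # Invert; this increases Im(tau) whenever |tau| < 1.
        tau = -1 / tau
    raise RuntimeError("reduction did not finish within max_steps")

# The basis (5.3 + 0.1i, 1) spans the same lattice shape as a basis with ratio close to i.
reduced = reduce_tau(5.3 + 0.1j)
print(reduced)
```

Each inversion strictly increases the imaginary part when |τ| < 1, which is why the loop ends after finitely many steps in exact arithmetic.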

Given all that, for any lattice L(ω1, ω2) (and hence for any nonconstant elliptic function, whose periods determine a lattice), we can define a special function

$$\wp(z) = \wp(z; \omega_1, \omega_2) = \frac{1}{z^2} + \sum_{\omega \in L - \{0\}} \left( (z-\omega)^{-2} - \omega^{-2} \right)$$
(The summation is over all elements of the lattice except for 0.) This definition of ℘(z) as well as the notation is due to Karl Weierstrass (1815-97), who developed much of the theory. The function is called the Weierstrass ℘-function. There are various messy details, but it can be shown that this series converges and defines a meromorphic function. From the definition, it is plausible (and not hard to prove) that ℘(z) is doubly periodic with periods ω1 and ω2.

What we've found, then, is that any nonconstant elliptic function (or lattice) determines the special elliptic function ℘(z) and lets us write down a series expansion for it. A little further computation using this explicit expression allows us to deduce one very important property of ℘(z), namely that it satisfies the differential equation:

$$\wp'^2 = 4\wp^3 - g_2 \wp - g_3$$
where the coefficients are expressible in terms of the periods. Specifically, if we define
$$G_k = \sum_{\omega \in L - \{0\}} \omega^{-k}$$
then $g_2 = 60G_4$ and $g_3 = 140G_6$. Further, the power series for ℘(z), obtained by rearranging terms in the defining series, is simply
$$\wp(z) = \frac{1}{z^2} + \sum_{1 \le k < \infty} (2k+1)\, G_{2k+2}\, z^{2k}$$
Not all elliptic functions are Weierstrass ℘-functions, but all are very closely related to ℘(z). First of all, note that the set of all elliptic functions with specified periods (including constant functions) form an algebraic field, because all sums, products, and reciprocals of functions that have the periods ω1 and ω2 also have the same periods. (For the moment, assume we are talking only of functions with two specific periods.) ℘-functions have the property that ℘(-z) = ℘(z). This is an immediate consequence of the power series for ℘(z). Such a function is called an "even" function. The set of all even elliptic functions also form a field. With a bit of work one can show that this field consists of all quotients of polynomials in ℘(z), and so the field is denoted C(℘(z)). ℘′(z) has the property that ℘′(-z) = -℘′(z), which is clear from its power series as well, and such a function is said to be "odd". Although odd elliptic functions don't form a field (the square of an odd function isn't odd, for example), they all have the form of ℘′(z) times an even elliptic function. Now, any function at all can be written as the sum of an even function and an odd function. It follows that all elliptic functions are of the form g(℘(z)) + ℘′(z)h(℘(z)), where g(t) and h(t) are quotients of polynomials in the indeterminate t. In other words, all elliptic functions can be expressed rather simply in terms of ℘(z) and ℘′(z).

The net result of all this is that, given a particular lattice of periods, we can explicitly construct an elliptic function ℘(z) as a series. From that and ℘′(z), we can then easily express any elliptic function with the same periods. Moreover, we obtain a differential equation for ℘(z) with coefficients determined explicitly by the periods.

Significantly, the differential equation satisfied by ℘(z) can be interpreted in another way. What it says is nothing less than that for all z∈C, the pair of values (℘(z),℘′(z)) lies on the elliptic curve E(C) whose defining equation is in Weierstrass normal form:

y^2 = 4x^3 - g2x - g3
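One can check this claim numerically, at least approximately. The sketch below evaluates ℘(z) and ℘′(z) from truncated versions of their defining lattice sums and measures how far the point (℘(z), ℘′(z)) is from satisfying the curve equation. The helper names, the sample lattice with basis (1, 0.3 + 1.2i), the sample point z, and the cutoff N are all arbitrary choices for illustration; because the sums are truncated, the residual is small but not zero.

```python
def lattice_points(w1, w2, N):
    """Nonzero lattice points m*w1 + n*w2 with |m|, |n| <= N."""
    return [m * w1 + n * w2
            for m in range(-N, N + 1) for n in range(-N, N + 1)
            if (m, n) != (0, 0)]


def wp(z, pts):
    """Truncated Weierstrass function: 1/z^2 + sum of (1/(z-w)^2 - 1/w^2)."""
    return 1 / z**2 + sum(1 / (z - w)**2 - 1 / w**2 for w in pts)


def wp_prime(z, pts):
    """Truncated derivative: -2/z^3 - 2 * sum of 1/(z-w)^3."""
    return -2 / z**3 - 2 * sum(1 / (z - w)**3 for w in pts)


pts = lattice_points(1, 0.3 + 1.2j, 40)
g2 = 60 * sum(w**-4 for w in pts)
g3 = 140 * sum(w**-6 for w in pts)

z = 0.31 + 0.17j
x, y = wp(z, pts), wp_prime(z, pts)
# Relative amount by which (x, y) misses the curve y^2 = 4x^3 - g2*x - g3.
residual = abs(y**2 - (4 * x**3 - g2 * x - g3)) / abs(y**2)
print(residual)
```

Increasing N should shrink the residual, since the only source of error (apart from rounding) is the truncation of the sums.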
The obvious question to ask now is whether every point on the curve E(C) is obtained in this way. In order to answer this, we need more facts about ℘(z). Consider the set of points P in the complex plane defined by a parallelogram whose vertices are 0, ω1, ω2, and ω1 + ω2. P is known as the fundamental parallelogram of the lattice, with respect to the given basis. (Remember, the basis isn't necessarily unique.) P can be defined explicitly by
P = {aω1 + bω2 | a,b ∈ R, 0 ≤ a,b < 1}
The fundamental parallelogram plays a very important role, as we shall see. The entire complex plane can be covered with translated copies of P so that every point of C is in one and only one translate. Furthermore, by periodicity, any elliptic function having periods ω1 and ω2 takes the same values at corresponding points of parallel sides of P. What this means is that if we "identify" opposite sides of P in a topological sense, the elliptic function is well-defined on this new topological space. In topology, the space obtained by identifying opposite sides of a parallelogram is just a torus. In this case, since it's obtained from points of the complex plane, it's called a complex torus.
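The statement that every point of C lies in exactly one translate of P can be made concrete. The following sketch (the function name and the sample values are hypothetical) writes z = aω1 + bω2 with real a and b, and subtracts the lattice point given by the integer parts of a and b, leaving a representative in P.

```python
import math


def reduce_mod_lattice(z, w1, w2):
    """Return (r, (m, n)) with r in the fundamental parallelogram P and z = r + m*w1 + n*w2."""
    # Solve z = a*w1 + b*w2 for real a, b by Cramer's rule on real and imaginary parts.
    det = w1.real * w2.imag - w1.imag * w2.real  # nonzero because w1/w2 is not real
    a = (z.real * w2.imag - z.imag * w2.real) / det
    b = (w1.real * z.imag - w1.imag * z.real) / det
    m, n = math.floor(a), math.floor(b)          # integer parts pick out the translate of P
    return z - m * w1 - n * w2, (m, n)


w1, w2 = 1 + 0j, 0.3 + 1.2j
r, (m, n) = reduce_mod_lattice(5.7 - 3.4j, w1, w2)
print(r, m, n)
```

Here z = 5.7 - 3.4i has coordinates a = 6.55 and b ≈ -2.83, so it lies in the translate of P by 6ω1 - 3ω2, and the representative in P is approximately 0.6 + 0.2i.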

Going back to ℘(z), we note that it has a pole of order 2 at 0. (That's due to the term 1/z^2 in the series expansion.) There are other poles at the other vertices of P, but none inside P. An elliptic function that, counting multiplicity, has n poles in its fundamental parallelogram (which includes 0 but not the other vertices) is said to have order n. A fundamental fact about elliptic functions of order n is that as z ranges over points of the fundamental parallelogram, the function takes every complex value exactly n times, not just the special value ∞ (which is by definition the value at poles of the function). For instance, an elliptic function of order n also has n zeros in its fundamental parallelogram.

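So we know that, in particular, ℘(z), which has order 2, takes every complex value exactly twice as z ranges over P. Since ℘ is even, the two points of P where it takes a given value x correspond to z and -z (modulo the lattice), and since ℘′ is odd, ℘′ takes opposite values ±y at those two points. These are exactly the two solutions y of the curve equation for that x. It follows from this that every point on an elliptic curve E(C) is of the form (℘(z),℘′(z)) for the case of elliptic curves whose equation has the same coefficients as the differential equation satisfied by ℘(z). In this situation, we say that the elliptic curve is "parameterized" by the two functions ℘(z) and ℘′(z), because every point on the curve is the image of exactly one point of P under the mapping from P to E(C) given by z → (℘(z),℘′(z)). (The pole z = 0 corresponds to the point at infinity on the curve.)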

Of course, not every elliptic curve has a defining equation in Weierstrass normal form, not by a long shot. However, we know that every elliptic curve is birationally equivalent to a curve whose equation has the required form. We're still not quite done yet, though. For suppose we have an elliptic curve with defining equation

y^2 = 4x^3 - ax - b
We can consider the differential equation
F′(z)^2 = 4F(z)^3 - aF(z) - b
But what we don't know yet is whether every such equation (i. e. with arbitrary a,b ∈ Q) has a solution F(z) which is an elliptic function, and is such that a=g2 and b=g3, where g2 and g3 are obtained as above from the periods of the elliptic function.

Suppose for just a moment that we knew this was the case. Then we would know that in every equivalence class of elliptic curves (as explained above) there is one curve which is parameterized in the indicated way by a ℘-function. And there can be only one, because the curve is completely determined as the image of the mapping from P to E(C).

So we have to look at solving the differential equation:

F′(z)^2 = 4F(z)^3 - aF(z) - b
Although it looks a little fearsome since it's nonlinear, it isn't really that hard to solve, since it's also of first order. If G(w) is the inverse function of F(z) so that z = G(w), then by the inverse function theorem of elementary calculus,
G′(w) = 1/F′ = (4F^3 - aF - b)^{-1/2}.
Taking the indefinite integral of this yields
z = G(w) = ∫^w (4w^3 - aw - b)^{-1/2} dw
We've deliberately solved for the inverse function G(w) to F(z), because the answer is an "elliptic integral", which arises from computing the arc length of an ellipse. This is the thing we've mentioned several times. It is how the term "elliptic" comes into the picture. The function F(z) that we want is the inverse function of this elliptic integral.
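To make the connection between elliptic integrals and periods a little more tangible, here is a small sketch for the special case in which the cubic 4x^3 - ax - b has three real roots e1 > e2 > e3. It does not integrate numerically. Instead it uses the classical arithmetic-geometric mean (AGM) evaluation of complete elliptic integrals, a technique not discussed in this article, under which the real period 2∫ dx/(4x^3 - ax - b)^{1/2}, taken from e1 to ∞, equals π/AGM((e1-e3)^{1/2}, (e1-e2)^{1/2}). The function names are made up for the example. The test case is the curve with b = 0 and a = Γ(1/4)^8/(16π^2), which is the value of g2 for the square lattice with basis (1, i), so the real period should come back as 1.

```python
import math


def agm(a, b, tol=1e-14):
    """Arithmetic-geometric mean of two positive reals."""
    while abs(a - b) > tol * abs(a):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


def real_period(e1, e2, e3):
    """Real period of y^2 = 4(x-e1)(x-e2)(x-e3), assuming real roots e1 > e2 > e3."""
    return math.pi / agm(math.sqrt(e1 - e3), math.sqrt(e1 - e2))


# g2 of the square lattice Z + Zi (with g3 = 0); the roots of 4x^3 - g2*x are 0 and ±sqrt(g2)/2.
g2 = math.gamma(0.25) ** 8 / (16 * math.pi ** 2)
e1 = math.sqrt(g2) / 2
period = real_period(e1, 0.0, -e1)
print(period)
```

This is only the easiest case. When the cubic has just one real root, or complex coefficients, the periods are still given by elliptic integrals over closed paths, but the simple real AGM recipe above no longer applies as written.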

At this point, things get rather messy. There are two problems. First, in order to evaluate this integral, it is necessary to specify a path of integration, because we are dealing with complex variables. There are infinitely many ways to get from point A to point B in the complex plane, and the integral may be different along each such path. The second problem is that since the function under the integral sign involves a square root, it is not well-defined, because there are two possible values for any square root (except the square root of 0). Fortunately, it isn't necessary to pick a consistent definition of the square root everywhere. The definition need be consistent only along a particular path.

These problems can be handled, but a great deal of sophisticated machinery had to be developed to accomplish this. In order to do the job rigorously, mathematicians eventually defined precise tools such as "Riemann surfaces", "analytic continuation", "homotopy theory", and "covering spaces". To make a long story short, the final result is that elliptic integrals like G(w) can be precisely and unambiguously defined. The variable of integration is no longer regarded simply as a complex number, but as a point on a Riemann surface where the square roots can be defined as single-value functions. The paths of integration lie on the Riemann surface. The complex numbers C make up the simplest sort of Riemann surface, and it is possible to extend the notion of analytic and meromorphic functions to more general Riemann surfaces. G(w) turns out to be meromorphic, and its inverse function F(z) is also. Moreover, F(z) satisfies the original differential equation, and -- most importantly -- F(z) is an elliptic function with periods ω1 and ω2 which can be expressed explicitly as suitable elliptic integrals over closed paths.

The final result is quite beautiful. Given any lattice L ⊆ C (or equivalently two complex numbers ω1 and ω2 whose ratio is not a real number), we can construct an elliptic function ℘(z) that has ω1 and ω2 as periods and which satisfies the differential equation

℘′^2 = 4℘^3 - g2℘ - g3
where g2 and g3 can be expressed as simple infinite sums involving the periods. Moreover, ℘(z) determines an elliptic curve as the image of the fundamental parallelogram P of the lattice under the map φ(z) such that φ(z) = (℘(z),℘′(z)).

Conversely, if we start with an elliptic curve E(C) that has the defining equation

y^2 = 4x^3 - g2x - g3
we can construct an elliptic function as the inverse of an elliptic integral whose integrand involves the square root of the right hand side of the equation above. Moreover, the periods of the elliptic function can be expressed by integrals over suitable closed paths, g2 and g3 can be expressed in terms of the periods, and the elliptic function satisfies the appropriate differential equation.

The elliptic integral function has a Riemann surface on which it is unambiguously defined, and this Riemann surface is nothing other than the elliptic curve E(C). The elliptic integral makes E(C) into what is called a "double covering" of the complex plane C.


Lattices, complex tori, and the modular group

This correspondence between lattices and elliptic curves is even richer than so far indicated. However, in order to explain it we have to introduce a bit more terminology and then present some fairly elementary results about lattices -- their algebraic and topological properties. In the course of doing this we will meet an interesting algebraic structure known as the "modular group", which will play a very important part in the rest of the story. Most of this section will deal with algebraic ideas, in contrast to the "analytic" ideas involving complex functions just covered.

Equivalence relations

To begin with, we need to talk about a very fundamental idea that is simple and helps clarify various ideas throughout mathematics. People have an intuitive idea of what it means for two things to be "equivalent" -- it means, roughly, the things are interchangeable (in some context) even though not exactly the same. For purposes of mathematics, this idea needs to be made precise.

Suppose two abstract things we'll denote by x and y are related to each other in a way we want to describe. Symbolically, we write x R y for this state of relationship. If x and y are ordinary numbers, then x ≤ y is an example of a relation. We define an "equivalence relation" axiomatically by three conditions on R:

Reflexive: x R x for every x.
Symmetric: if x R y, then y R x.
Transitive: if x R y and y R z, then x R z.

Obviously, not all relations one can consider have each of these properties. For instance, where numbers are concerned, the order relations < and ≤ are both transitive. ≤ is reflexive, but < is not. And neither relation is symmetric. Since ≤ and < both fail to satisfy at least one of the axioms, neither is an equivalence relation. An example of something that is an equivalence relation would be "is the same color as", with respect to objects that have one of a finite, distinguishable set of colors.

Whenever we talk about a relation R we do so in the context of a specific set of objects S (although R may be applicable to a variety of different sets). In set theoretic terms, R may be thought of as a subset of the "cartesian product" S × S, though we don't need to go further into that. All it means is that the relation may be thought of as a subset of certain ordered pairs in S × S. Symbolically, R = {(x,y) ∈ S×S | x R y}

The most important thing that an equivalence relation R does is to partition the set S into a number of different subsets called "equivalence classes". If x ∈ S, the equivalence class containing x is simply the set of all y ∈ S such that x R y. Consequently, every x is in some R-equivalence class, even if the only member of the class is x itself. Furthermore, no two equivalence classes have any elements in common, because, by the transitivity property, if two classes had any element in common, all elements of the two classes would be equivalent to each other, so there would be really only one class. Hence every member of S lies in one and only one R-equivalence class. Another way to say this is that R "partitions" S into "disjoint" equivalence classes.
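The partitioning process is easy to carry out mechanically for a finite set. In the sketch below (the function name is invented for the example), the relation is "x - y is divisible by 3" on the integers 0 through 9. Each new element is compared with one representative of every class found so far, which is enough precisely because the relation is symmetric and transitive.

```python
def equivalence_classes(items, related):
    """Partition items into classes of the equivalence relation `related`."""
    classes = []
    for x in items:
        for cls in classes:
            if related(x, cls[0]):   # one representative per class suffices
                cls.append(x)
                break
        else:
            classes.append([x])      # x starts a new class (it is related to itself)
    return classes


classes = equivalence_classes(range(10), lambda x, y: (x - y) % 3 == 0)
print(classes)  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

If the function passed in were not an equivalence relation (for instance ≤), the output would depend on the order of the items and would not be a meaningful partition.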

Equivalence of lattices: homothety

One of the problems with the theory of elliptic curves presented above is that there can be many lattices L that correspond to the "same" elliptic curve. ("Sameness" between elliptic curves is a natural equivalence relation in the set of elliptic curves that we have alluded to above as "birational" equivalence which exists between curves that may have different defining equations.) In order to have a neat and tidy theory, we want to be able to speak about something unique which is associated with all elliptic curves of the same class.

To take care of this problem, and to give an important concrete example of equivalence classes as outlined above, we will define an equivalence relation on the set of all lattices in C, so that the resulting set of equivalence classes of lattices will, eventually, turn out to be in 1-to-1 correspondence with equivalence classes of elliptic curves.

We say that two lattices L and L′ are equivalent if there is a nonzero complex number λ ∈ C such that L′ = λL. What that last equality means is that for any α ∈ L, λα is an element of L′ (so λL ⊆ L′), and, in addition, every element of L′ is of this form. Another term used is that two such equivalent lattices are "homothetic". We shall use the latter term to avoid ambiguity, since there will be other types of "equivalence classes" that we want to talk about. In this terminology, homothetic lattices are said to belong to the same "homothety class".

There is a simple way to think of homothetic lattices. Recall that any complex number can be expressed in the form z = |z|e^{iθ} = |z|e^{i arg(z)}, where arg(z) is the angle θ between the positive real axis and a line from the origin to z in the complex plane. The lattice λL has a fundamental parallelogram which is just the parallelogram of L after being stretched by a factor of |λ| and rotated by arg(λ). The interior angles of the two parallelograms are the same.

We can use the notion of homothety to simplify the way we represent (classes of) lattices. As described above, a lattice L can be defined by a basis consisting of two R-linearly independent complex numbers ω1 and ω2: L = {mω1 + nω2 | m, n ∈ Z}. Given that, a basis for λL is simply λω1 and λω2. So, if we choose λ = 1/ω2, a basis of λL is simply the pair (τ,1), where τ = ω1/ω2. Recall that by switching the order of ω1 and ω2 if necessary we can assume Im(τ) > 0.

As a matter of notation, we let Lτ designate the lattice with basis (τ,1). What we have so far is that for any L there is a homothetic Lτ. In fact, every homothety class contains infinitely many lattices of the form Lτ. But we will shortly find a way to make the choice of τ unique and to identify all τ that give Lτ in the same homothety class.

Complex tori and elliptic curves

First, however, we need to talk about one more algebraic concept and then relate things back to elliptic curves.

The complex numbers C have the structure of both an additive group and a 1-dimensional complex analytic manifold (essentially, a "Riemann surface"). A lattice L is an additive subgroup of C, so one has the quotient group C/L. If you aren't familiar with group theory, we can describe "quotient groups" easily using the notion of equivalence classes. We do this by saying that two complex numbers in C are equivalent if their difference is in L. That is, two points z1 and z2 are equivalent (symbolically, z1 ≡ z2) just in case z1 - z2 ∈ L. We thus get a set of equivalence classes with respect to this relation (or "modulo L"). Call this set C/L. On this set we can define a group structure simply by picking complex numbers that "represent" each equivalence class, defining the group operation on them as normal addition, and then passing to the equivalence class of the sum. This procedure is well defined and makes C/L into a group because L is a subgroup of C. (One can always do this for abelian groups. For nonabelian groups, making quotient groups requires an additional condition on the subgroup.)

Another way to think of C/L is as the fundamental parallelogram P of L with addition done "modulo L": if z1, z2 ∈ P then the sum z1 + z2 is either also in P, or else corresponds to a unique element of P plus one or both basis elements of L. In either case, C/L inherits its group structure from C: the sum of two equivalence classes is just the class that contains the sum of a representative from each class.

Moreover, C/L can also be given a topological structure, derived from the topology of C, by "identifying" opposite sides of P. This means that corresponding points are considered to be the same, and a topology can be defined consistently that reflects the identification. Think of P as being made of flexible material. When you identify one pair of opposite sides, you get a hollow cylinder. When you identify the ends of the cylinder -- the other two sides of P -- you get a surface that's like the surface of a donut. Such a surface is called a torus. So, topologically, C/L is essentially a torus. Best of all, even the analytic structure is preserved under this identification procedure, so that C/L can be made into a complex analytic manifold -- a Riemann surface. A complex manifold like C/L is called, naturally, a "complex torus". Topologically, a manifold with 2 real dimensions (or 1 complex dimension) is some sort of surface. Surfaces may be classified by the number of "holes" they have, which is an invariant known as the "genus". Since a complex torus is like the surface of a donut, it has a genus of 1.

Now recall that the Weierstrass ℘-function is doubly periodic with its periods being the basis of a lattice L with fundamental parallelogram P. This means that ℘ takes the same values at corresponding points on opposite sides of P, so that ℘ is consistently defined on C/L.

OK, but so what? Here's the most important thing: the map φ: C/L → E(C) defined by φ(z) = (℘(z),℘′(z)) has all the right properties. E(C) already has the structure of a complex analytic manifold, because it is defined as the set of zeros of a cubic polynomial in two variables. (Technically, E(C) is regarded as a subset of the "complex projective space" P2(C), but we have been trying hard to avoid discussing projective spaces, due to the extra level of abstraction involved. It's simpler, though less correct, to think of E(C) as a subset of the product space C×C = C2.) By "right properties", we mean that φ is 1:1, onto, and preserves all of the complex analytic structure.

In essence, the map φ shows that the elliptic curve E(C) is fundamentally the "same" object as C/L. But we saw that C/L has a natural group structure. This suggests that E(C) by rights "must" have a group structure as well. This group structure can be defined by using φ to transfer the group structure of C/L to E(C). Of course, it was already known that E(C) had a group structure, as we discussed early on. The rather amazing thing is that φ gives the exact same group structure. Since φ is defined using the ℘-function, the reason this works is that ℘(z) has a simple "addition formula". That is, there is a simple expression for ℘(z1+z2) in terms of ℘(z) and ℘′(z) at z1 and z2. Specifically, if z1 and z2 are not "equivalent modulo L", i. e. their difference isn't in L, then

℘(z1+z2) = (1/4) [(℘′(z1) - ℘′(z2)) / (℘(z1) - ℘(z2))]^2 - ℘(z1) - ℘(z2)
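The addition formula can be tested numerically in the same approximate spirit as before. The sketch below computes ℘ and ℘′ from truncated lattice sums and compares the two sides, with ℘(z1+z2) on the left and one quarter of the squared slope between the two points, minus ℘(z1) and ℘(z2), on the right. The lattice, the two sample points, and the cutoff are arbitrary choices for illustration, and agreement is only up to the truncation error.

```python
def lattice_points(w1, w2, N):
    """Nonzero lattice points m*w1 + n*w2 with |m|, |n| <= N."""
    return [m * w1 + n * w2
            for m in range(-N, N + 1) for n in range(-N, N + 1)
            if (m, n) != (0, 0)]


def wp(z, pts):
    """Truncated Weierstrass function."""
    return 1 / z**2 + sum(1 / (z - w)**2 - 1 / w**2 for w in pts)


def wp_prime(z, pts):
    """Truncated derivative of the Weierstrass function."""
    return -2 / z**3 - 2 * sum(1 / (z - w)**3 for w in pts)


pts = lattice_points(1, 0.3 + 1.2j, 50)
z1, z2 = 0.31 + 0.17j, 0.12 + 0.40j          # z1 - z2 is not a lattice point

# Slope of the line through the two curve points, as in the chord construction.
slope = (wp_prime(z1, pts) - wp_prime(z2, pts)) / (wp(z1, pts) - wp(z2, pts))
lhs = wp(z1 + z2, pts)
rhs = slope**2 / 4 - wp(z1, pts) - wp(z2, pts)
print(abs(lhs - rhs))
```

The quantity called slope is the slope of the line through the points (℘(z1),℘′(z1)) and (℘(z2),℘′(z2)) on the curve, which is how the formula ties addition in C/L to the geometric group law on E(C).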
In summary, if you have a lattice L, then you have determined an elliptic curve E(C) that is, in turn, analytically isomorphic to the complex torus C/L. Conversely, if you start with an elliptic curve, you can first get a birationally equivalent curve whose equation is in Weierstrass form, and from that you can get an elliptic function whose periods are the basis of a lattice L, such that the map φ gives an analytic isomorphism from the complex torus C/L to the elliptic curve E(C).

That is, there are very nice correspondences between elliptic curves, complex tori, and lattices. It is a sure sign of interesting mathematics when there are such correspondences between very different sorts of objects. But the story just gets even more interesting.

SL2(Z)

If you have studied linear algebra, then you recognize a lattice L is something like a vector space (over the real numbers R), except that only integer coordinates and integer multiples of the basis elements are involved. But all you really need to know is that any element of L can be written as a sum of integer multiples of the basis elements ω1 and ω2.

In particular, if ω′1 and ω′2 give a different basis for L, then it is possible to write

ω′1 = aω1 + bω2   and   ω′2 = cω1 + dω2
for integers a, b, c, and d. The fact that we are working with lattices is what guarantees that these coefficients are integers.

In linear algebra, this sort of relationship is called a linear transformation, and "matrix notation" is ordinarily used to express it:

( ω′1 )   ( aω1 + bω2 )   ( a  b )   ( ω1 )
( ω′2 ) = ( cω1 + dω2 ) = ( c  d ) × ( ω2 )
In this we have represented the basis pairs such as ω1 and ω2 as "column vectors" according to the usual conventions for multiplying matrices.

Now the matrix containing the coefficients is of a special sort. Not only are its matrix elements integers, but the matrix is "invertible", because it gives a "nonsingular" linear transformation from one basis to another. That is, the inverse of the matrix also has integer entries. In linear algebra, it is shown that this implies the "determinant" of the matrix -- which is the quantity ad - bc -- must be nonzero. And since the inverse matrix also has integer elements its determinant is an integer which is the reciprocal of the original determinant. The value of the determinant must therefore be ±1. However, by switching the order of basis elements if necessary, it can be assumed that the determinant ad-bc = 1.

2 × 2 matrices of this form, with integer entries and determinant 1 are important for many reasons, some of which we shall soon see. The set of such matrices has, therefore, been given a name: SL2(Z). It is in fact a group under multiplication, called the "special linear group" (of 2 × 2 matrices with entries in Z).
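A few lines of Python are enough to experiment with these matrices. In this sketch, matrices are stored as nested tuples ((a, b), (c, d)), and all function names are chosen for the example. It shows that the product of two determinant-1 integer matrices again has determinant 1, that the inverse again has integer entries, and that changing basis and then applying the inverse matrix recovers the original basis.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def det(A):
    (a, b), (c, d) = A
    return a * d - b * c


def inverse(A):
    """Inverse of an integer matrix of determinant 1; the entries stay integers."""
    (a, b), (c, d) = A
    assert det(A) == 1
    return ((d, -b), (-c, a))


def change_basis(M, basis):
    """Apply M to a basis pair (w1, w2) written as a column vector."""
    (a, b), (c, d) = M
    w1, w2 = basis
    return (a * w1 + b * w2, c * w1 + d * w2)


M = ((2, 1), (1, 1))
K = ((1, 3), (0, 1))
basis = (1 + 0j, 0.3 + 1.2j)
new_basis = change_basis(M, basis)
recovered = change_basis(inverse(M), new_basis)
print(det(mat_mul(M, K)), inverse(M), recovered)
```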

The action of a group on a set

When one has a group it may be possible to use it with certain sets in order to define an equivalence relation on the set. This happens when the group "acts" in some natural way on the set. For example, SL2(Z) acts on the set of lattice basis pairs as indicated above, by means of the linear transformation corresponding to the matrix. Abstractly, if a group G acts on a set S, then every g∈G defines a mapping of S to itself. If s∈S, we write gs for the element that results from the action of g on s. The most basic requirement is a kind of associativity, that is, we must have g1(g2s) = (g1g2)s. Also, if e∈G is the group identity element, then es = s. Whenever we have a group G that acts on a set S, then G induces an equivalence relation on S by the rule that s is equivalent to s′ if and only if s′ = gs for some g∈G. Whenever we have an equivalence relation, the set S is partitioned into disjoint subsets -- equivalence classes. In this case, such subsets are called "orbits", because each equivalence class containing some element s consists of all elements of the form gs as g varies over the elements of G. Members of G then become a sort of label for the elements of any particular equivalence class.

For any finite set S it is always possible to construct groups that act on S. For instance, consider a set of 5 elements represented by the numbers 1 through 5: S = {1,2,3,4,5}. A "permutation" of S is just a rule that specifies how the elements of S can be reordered ("permuted") in some way. Let σ represent such a rule. Then one example would have σ(1)=1, σ(2)=3, σ(3)=4, σ(4)=2, and σ(5)=5. This is like a very simple kind of encryption that simply substitutes one symbol for another so that two different symbols never get mapped to the same symbol. In the example given, σ could be expressed succinctly by the notation (1)(234)(5), or more simply, (234). The group of all possible permutations on a set is called the "symmetric group". If the set has n elements, Sn is the usual notation for the symmetric group on n elements. (It's only a coincidence that S refers both to a set and to the symmetric group.) Subgroups of symmetric groups are called permutation groups. We won't go into this any further, but any finite group can be "represented" in terms of a permutation group on some set.
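The cycle notation can be recovered mechanically from the rule σ. The sketch below stores σ as a Python dictionary and follows each element around until it returns to its starting point, omitting fixed points as in the shorthand (234). The function names are invented for this example.

```python
def cycles(sigma):
    """Cycle decomposition of a permutation given as a dict, omitting fixed points."""
    seen, result = set(), []
    for start in sorted(sigma):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:         # follow start -> sigma(start) -> ... until it closes up
            seen.add(x)
            cycle.append(x)
            x = sigma[x]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result


def compose(s, t):
    """The permutation 'first apply t, then apply s'."""
    return {x: s[t[x]] for x in t}


sigma = {1: 1, 2: 3, 3: 4, 4: 2, 5: 5}
print(cycles(sigma))                       # [(2, 3, 4)]
cube = compose(sigma, compose(sigma, sigma))
print(cube)                                # applying sigma three times gives the identity
```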

For an example more relevant to our present interests, consider the set S = {(ω1,ω2) ∈ C*×C* | Im(ω1/ω2) > 0} of pairs of nonzero complex numbers. Every element of this set is a basis for some lattice L. But this isn't a 1-to-1 relationship, since many elements of S can generate the same lattice -- choice of basis is not unique. We can get a much smaller set by considering equivalence classes of such pairs, where the equivalence classes are defined by the condition that two pairs are related by a linear transformation (whose matrix is) in SL2(Z). Now the members of any particular equivalence class are labeled by elements of SL2(Z).

Let G = SL2(Z). Then there is a 1-to-1 correspondence between equivalence classes of S under the action of G and distinct lattices. Within any particular equivalence class -- a G-orbit -- corresponding to a particular lattice L, each element of the class is a specific basis pair for L, and each is labeled by an element of G. In fact, the labeling is unique, because if M1 and M2 are two matrices of G that take some basis pair s∈S to the same thing -- so M1s = M2s -- then (M1^{-1}M2)s = s. But the only element of G that leaves some basis pair unchanged is the identity element I of G (that is, the diagonal matrix with a=d=1, b=c=0), and so M1 = M2. In a case like this, where no element other than the identity fixes any point, G is said to act "freely" on S. In particular G acts "faithfully" on S, which is the weaker requirement that no element other than the identity fixes every point of S. The net result is that there is a 1-to-1 correspondence between basis pairs for a particular lattice L and elements of SL2(Z). Since the latter is a pretty large set, it's clear that there are a lot -- infinitely many -- of basis pairs for any lattice L.

To summarize: the equivalence classes of S under the action of G correspond to distinct lattices, and members of each equivalence class correspond to elements of G.

Homothety classes and complex tori

Homothety classes of lattices can also be viewed as resulting from the action of a group on a set. In this case, the group G=C* is the multiplicative group of nonzero complex numbers. The set S is the set of all possible lattices. The action of some λ ∈ G on a lattice L is simply the lattice L′ = λL. In general, L ≠ λL. The orbit of any L under the action of G is simply the homothety class of L. The equivalence classes are the set of homothety classes of lattices.

Suppose we have two homothetic lattices L and λL. What can we say about the corresponding complex tori C/L and C/λL? It turns out, by rather straightforward algebra, that they are isomorphic. The details can be worked out by considering the map λ: C → C given by multiplication by λ. This induces a map of C/L to C/λL, because two numbers z1 and z2 are equivalent modulo L if and only if λz1 and λz2 are equivalent modulo λL.

Since we know that there is a correspondence between elliptic curves and complex tori, there is also a correspondence between elliptic curves and homothety classes of lattices. Because of this, we want to understand homothety classes better.

The modular group

These considerations lead to the natural question: is there some set S and group G that acts on S such that the distinct orbits of S under G are in 1-to-1 correspondence with homothety classes?

The answer is yes. Let S be the set of lattices of the form Lτ = {aτ + b | a,b ∈ Z}, where Im(τ) > 0. We know that every homothety class contains at least one lattice selected from S. In fact (we will find) there are infinitely many elements of S in every homothety class. Yet, of course, not all members of S are homothetic to each other. The interesting question is exactly when Lτ and Lτ′ are homothetic if τ ≠ τ′.

So suppose Lτ′ = λLτ. Since (τ,1) is a basis of Lτ, (λτ,λ) is a basis of λLτ. Hence there are integers a, b, c, d such that

τ′ = aλτ + bλ    and    1 = cλτ + dλ
For future reference, note that this implies λ = 1/(cτ + d). From these equations it follows that
τ′ = (aλτ + bλ)/(cλτ + dλ) = (aτ + b)/(cτ + d)
This suggests that for
    ( a  b )
M = ( c  d ) ∈ SL2(Z)
we define an action of M on an element of S by the rule M(Lτ) = Lτ′, where τ′ is given by the formula above. Note that this action of SL2(Z) is subtly different from the action we discussed earlier on the set of lattice basis pairs -- yet the close relationship is obvious, for if basis pairs are related such that
ω′1 = aω1 + bω2   and   ω′2 = cω1 + dω2
then the ratios of pair elements
τ = ω1/ω2   and   τ′ = ω′1/ω′2
satisfy
τ′ = (aω1 + bω2) / (cω1 + dω2) = (aτ + b)/(cτ + d)
just as before. The net result of these calculations is that the orbits of S under this action of G correspond to homothety classes of lattices. Within each orbit there are many lattices Lτ, each one corresponding to a different element of G.

Almost. We have to make a slight qualification here, because for any M ∈ SL2(Z), it is clear that -M acts on S in exactly the same way as M. (This is not the case in the earlier example of SL2(Z) acting on the set of lattice basis pairs.) In other words, the action of SL2(Z) on S isn't quite "faithful" in the sense defined above.
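The relations used in this calculation, and the observation about -M, are easy to confirm on a concrete example. The sketch below (with an arbitrarily chosen matrix M and point τ, and hypothetical function names) computes λ = 1/(cτ + d) and τ′ = (aτ + b)/(cτ + d), checks that (τ′, 1) is obtained from the basis (λτ, λ) of λLτ by the matrix M, checks conversely that λτ and λ are integer combinations of τ′ and 1, and finally confirms that M and -M send τ to the same τ′.

```python
def moebius(M, tau):
    """The action tau -> (a*tau + b) / (c*tau + d) of M = ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return (a * tau + b) / (c * tau + d)


M = ((2, 1), (1, 1))            # determinant 2*1 - 1*1 = 1
neg_M = ((-2, -1), (-1, -1))    # the matrix -M
(a, b), (c, d) = M
tau = 0.1 + 2j

lam = 1 / (c * tau + d)
tau_prime = moebius(M, tau)

# (tau', 1) expressed in the basis (lam*tau, lam) of lam * L_tau:
err1 = abs(tau_prime - (a * lam * tau + b * lam))
err2 = abs(1 - (c * lam * tau + d * lam))
# Conversely, using the inverse matrix ((d, -b), (-c, a)):
err3 = abs(lam * tau - (d * tau_prime - b))
err4 = abs(lam - (-c * tau_prime + a))
# M and -M give the same tau'.
err5 = abs(moebius(neg_M, tau) - tau_prime)
print(tau_prime, max(err1, err2, err3, err4, err5))
```

Together, the first four checks say that λLτ and Lτ′ contain each other, so they are the same lattice.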

However, a small change takes care of this problem. We define the group Γ = SL2(Z)/{I,-I}. This is the quotient group SL2(Z) modulo the 2-element subgroup of the identity element I and its negative. What this means is that in Γ no distinction is made between a matrix and its negative. Another name sometimes used for the modular group is PSL2(Z). (PSL = "projective special linear".)

Γ is the group which is known as the modular group. (The fact that Γ is the Greek equivalent of G attests to the fundamental importance of this group.) For the most part we will continue to describe elements of Γ as if they were matrices in SL2(Z) without being fussy about the difference. The technical advantage of using Γ instead of SL2(Z) is that the group action is faithful: only the identity element of Γ acts trivially. As a result, elements of the group correspond 1-to-1 with members of any particular equivalence class, except in the classes of the two special points τ = i and τ = e^{πi/3}, each of which is left fixed by a few non-identity elements of Γ.

Just one more observation along these lines, but a crucial one. It is that there is also an action of Γ on the upper half of the complex plane: H = {z∈C | Im(z) > 0}. We've already seen what it is: for any M∈Γ and τ∈H, let M(τ) = (aτ + b)/(cτ + d).

Several things need to be true in order for this to be a valid action of Γ on H. We didn't give the details before in the case of the action on the set of lattices of the form Lτ, but, for instance, it needs to be checked that M(τ)∈H whenever τ∈H. However, since one can compute

Im(M(τ)) = det(M) Im(τ)/|cτ + d|^2
and det(M) = 1, this follows. We also need to have M2(M1(τ)) = (M2M1)(τ), where M2M1 is the matrix product, but that is also an easy calculation.
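Both facts can be spot-checked numerically. The following sketch uses two arbitrarily chosen matrices of determinant 1 and an arbitrary point τ of H; the function names are made up for the example. It compares Im(M(τ)) with Im(τ)/|cτ + d|^2, and compares M2(M1(τ)) with (M2M1)(τ).

```python
def act(M, tau):
    """M(tau) = (a*tau + b) / (c*tau + d) for M = ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return (a * tau + b) / (c * tau + d)


def mat_mul(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


M1 = ((2, 1), (1, 1))      # determinant 1
M2 = ((1, 3), (0, 1))      # determinant 1: the translation tau -> tau + 3
tau = -0.4 + 0.7j

(a, b), (c, d) = M1
imag_err = abs(act(M1, tau).imag - tau.imag / abs(c * tau + d) ** 2)
comp_err = abs(act(M2, act(M1, tau)) - act(mat_mul(M2, M1), tau))
print(imag_err, comp_err)
```

A numerical check at one point is of course not a proof, but the algebra behind each identity is a short direct computation.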

Functions having the form just given for M(τ) are obviously very simple rational functions, but they turn up quite frequently and have been extensively studied. Hence they have been given various names, such as "linear fractional transformations", "fractional linear transformations", and "Möbius transformations". (A. F. Möbius (1790-1860), who came up with the "Möbius band", was one of the mathematicians who studied such functions.) Möbius transformations also play an important role in some forms of non-Euclidean geometry, for example.

What do the equivalence classes of H under the action of Γ look like? At this point we need to define the subset F⊆H by

F = {z∈H | -1/2 < Re(z) ≤ 1/2; |z| ≥ 1; Re(z) ≥ 0 if |z|=1}
F may be described as a semi-infinite rectangle of width 1 centered on the imaginary axis of H, except that the lower boundary is a portion of the circle |z|=1. F is called the "fundamental domain" for the action of Γ on H. It can be shown, though the proof is a little tedious, that F has the property that no two points of F are Γ-equivalent, yet every point of H is Γ-equivalent to (exactly) one point of F. In other words, every equivalence class of H under the action of Γ is represented by exactly one point of F. Or in still other words, there is a 1-to-1 correspondence between equivalence classes of H under the action of Γ and points of F. The points of F thus provide unique "labels" for the set of equivalence classes. Within each equivalence class, the points are labeled by elements of Γ, and this labeling is unique except in the classes of the points i and e^{πi/3}, where a few different elements of Γ give the same point.
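Finding the representative in F of a given point of H can be done by a simple procedure, sketched below under the usual floating-point caveats. The idea (a standard one, though not spelled out in this article) is to alternate two elements of Γ: a translation τ → τ - n that brings the real part into the range (-1/2, 1/2], and the inversion τ → -1/τ, applied whenever |τ| < 1. The function name and the step limit are choices made for this example, and the treatment of points exactly on the boundary of F is only as reliable as floating-point comparison allows.

```python
import math


def reduce_to_F(tau, max_steps=1000):
    """Move tau (with Im(tau) > 0) into the fundamental domain F using elements of the modular group."""
    for _ in range(max_steps):
        tau = tau - math.ceil(tau.real - 0.5)   # translation: now -1/2 < Re(tau) <= 1/2
        if abs(tau) < 1:
            tau = -1 / tau                       # inversion: increases Im(tau)
        else:
            break
    if abs(tau) == 1 and tau.real < 0:           # on the boundary arc, F keeps Re >= 0
        tau = -1 / tau
    return tau


tau0 = 0.1 + 2j                                  # a point in the interior of F
image = (2 * tau0 + 1) / (tau0 + 1)              # M(tau0) for M = ((2, 1), (1, 1))
print(image, reduce_to_F(image))

r = reduce_to_F(0.37 + 0.02j)
print(r)
```

Starting from M(τ0) ≈ 1.789 + 0.384i, the procedure translates by 2, inverts, and translates by 1, arriving back at τ0 = 0.1 + 2i, as the uniqueness of representatives in F says it must.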

There is a standard notation used for this set of equivalence classes: H/Γ. (This is often written Γ\H, since one usually writes the action of M∈Γ on z∈H as Mz, with M on the left.)

Now, H and F are 1-dimensional complex manifolds as subsets of C. There is a standard way to give a manifold structure to the equivalence classes H/Γ. When this is done, the 1-to-1 correspondence between H/Γ and F is a complex manifold isomorphism. Hence, up to isomorphism, the set of equivalence classes H/Γ is F.

But wait a minute. We already noted that there are 1-to-1 correspondences between homothety classes of lattices, complex tori, equivalence classes of elliptic curves, and equivalence classes of H under the action of Γ. To this list we can now add "points of F". In some sense, which we will explore further, F is a kind of master index to equivalence classes of each of these other things, in that each point of F is a unique label for a whole equivalence class. F is also, in a natural way, a complex manifold with a nice topological structure. Topological spaces of this kind are sometimes called "moduli spaces". They have great theoretical value since in some sense they parameterize a whole class of objects, such as elliptic curves.

We noted before that in every homothety class, we can find many lattices L_τ with a basis pair of the form (τ,1) with Im(τ) > 0. The properties of Γ and its fundamental domain F imply that the choice of τ is unique if we require τ∈F. This has interesting consequences.

Eisenstein series

We are now in a position to think of τ as a variable that ranges over all complex numbers in the half-plane H. For each τ there is a corresponding lattice L_τ, a Weierstrass function ℘(z) = ℘(z; τ, 1), and a differential equation
(℘′)^2 = 4℘^3 - g_2 ℘ - g_3
satisfied by ℘(z). The coefficients of this equation are given explicitly by
g_2 = 60 G_4 = 60 ∑′_{ω∈L_τ} ω^{-4} = 60 ∑′_{m,n∈Z} (mτ + n)^{-4}
and
g_3 = 140 G_6 = 140 ∑′_{ω∈L_τ} ω^{-6} = 140 ∑′_{m,n∈Z} (mτ + n)^{-6}
(The primes on the summation signs are there as a reminder that the lattice point 0 is omitted from the sums.)

Sums of this form are called Eisenstein series, after Ferdinand Eisenstein (1823-52), who was among the first to study their properties. Obviously, these series are functions of τ, and they can be shown to be analytic functions of τ for τ∈H. The same is true more generally if k ≥ 2 and we set

G_{2k}(τ) = ∑′_{m,n∈Z} (mτ + n)^{-2k}
(We will see shortly why these are usually considered only for sums containing even powers.) These analytic functions have a very interesting property. Suppose L_{τ′} = λL_τ is a lattice homothetic to L_τ. Then τ′ = M(τ) for some M ∈ Γ, and
G_{2k}(M(τ)) = G_{2k}(τ′) = ∑′_{ω∈L_{τ′}} ω^{-2k} = ∑′_{ω∈L_τ} (λω)^{-2k} = λ^{-2k} ∑′_{ω∈L_τ} ω^{-2k} = λ^{-2k} G_{2k}(τ)
However, from earlier we had λ = 1/(cτ + d), and so
G_{2k}(M(τ)) = (cτ + d)^{2k} G_{2k}(τ)
This property exhibits a type of symmetry of the functions G_{2k}(τ), and it is the defining characteristic of what are called "modular functions". It is hard to exaggerate the importance of such functions in the theory of elliptic curves.
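
The transformation rule can be checked numerically for a particular M. The sketch below is only an illustration under stated assumptions: it truncates the lattice sum to the square |m|, |n| ≤ N (a choice made here, not in the text) and uses M = S with S(τ) = -1/τ, for which c = 1 and d = 0. The name eisenstein_G is hypothetical.

```python
def eisenstein_G(two_k, tau, N=40):
    """Truncated sum of (m*tau + n)^(-2k) over |m|, |n| <= N, omitting (0, 0)."""
    total = 0j
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if m or n:
                total += (m * tau + n) ** (-two_k)
    return total

tau = complex(0.3, 1.1)
lhs = eisenstein_G(4, -1 / tau)          # G_4(S(tau))
rhs = tau ** 4 * eisenstein_G(4, tau)    # (c*tau + d)^4 * G_4(tau) with c = 1, d = 0
print(abs(lhs - rhs))                    # difference is at the level of rounding error
```

For this particular M the truncated sums agree term by term, because S merely permutes the index square. For other elements of Γ, such as T, the truncated sums agree only approximately.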


Modular functions and modular forms

Our next main objective is to explain how "modular functions" and "modular forms" are related to elliptic curves. This will enable us to describe some additional, very deep properties of elliptic curves -- which have been proven only quite recently, in connection with Andrew Wiles' proof of Fermat's last theorem. As a first step, of course, we have to spell out the definitions of modular functions and modular forms.

You should be aware that the relevant terminology isn't completely standard. If you look at other references, you will often find slightly different forms of these definitions. That shouldn't present any real problem as long as you are careful to understand what definition is used in any given context.

To begin with, suppose that f(z) is a function that is meromorphic on the upper half plane H and that the following is true for some integer k:

f(M(z)) = f((az+b)/(cz+d)) = (cz+d)^k f(z)    for all M∈Γ and z∈H
Note that the special transformation T(z) = z + 1, corresponding to a matrix with a=b=d=1, c=0, is an element of Γ. It follows that for such a function, with any k∈Z, f(z+1) = f(T(z)) = f(z), so f(z) is periodic with period 1.

A basic fact about such functions is that they can be expressed in a "Fourier series" expansion like so:

f(z) = ∑_{n∈Z} a_n q^n    for a_n ∈ C, and where q = e^{2πiz}
Such a series is often called a q-expansion.

Next, suppose additionally that the Fourier series has a special form, with only finitely many a_n ≠ 0 if n < 0, so that in fact

f(z) = ∑_{N≤n<∞} a_n q^n    for some N∈Z
This condition is sometimes expressed by saying f(z) is "meromorphic at infinity". Given all that, f(z) is defined to be a modular function for Γ "of weight k".

Some variations on this that you might see require k=0 (a modular function of weight 0 in the present terminology) or use some subgroup of Γ instead of the full modular group (resulting in a less restrictive definition).

Next, a modular form is a modular function whose Fourier series has coefficients a_n = 0 for all negative n. This condition can also be stated as the requirement that f(z) be holomorphic on H (and at infinity). Finally, f(z) is said to be a cusp form if it is a modular form for which a_0 = 0.

As an example, consider the Eisenstein series, which we observed to be modular functions. It isn't a difficult fact that they have the q-expansions (when k is even):

G_k(z) = 2ζ(k)[1 - (2k/B_k) ∑_{1≤n<∞} σ_{k-1}(n) q^n]
In this expression, ζ(z) is the famous Riemann zeta function, σ_k(n) is the arithmetic function defined by sums of the kth powers of divisors of n (including 1 and n itself):
σ_k(n) = ∑_{d|n} d^k
and B_k are rational numbers known as the Bernoulli numbers, defined as coefficients of a power series:
x/(e^x - 1) = ∑_{0≤k<∞} B_k x^k/k!
This representation of G_k(z) is intriguing since it is a product of a value of ζ(z) at a positive integer and a q-series whose coefficients are rational numbers. Whether or not this has any deeper meaning, it does show that the Eisenstein series are modular forms, not just modular functions, though they are not cusp forms.
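
As a rough numerical illustration of this q-expansion, the sketch below evaluates it for even k and compares it with a truncated lattice sum. Several ingredients are assumptions not stated in the text: the Bernoulli numbers are generated by the standard recurrence, ζ(k) for even k is obtained from Euler's formula ζ(k) = |B_k| (2π)^k / (2 k!), and the lattice sum is cut off at |m|, |n| ≤ N. All function names are illustrative.

```python
import cmath
import math
from fractions import Fraction

def bernoulli(n):
    """Exact B_0..B_n with x/(e^x - 1) = sum B_k x^k / k!, via the usual recurrence."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        s = sum(math.comb(m + 1, j) * B[j] for j in range(m))
        B.append(-s / (m + 1))
    return B

def sigma(power, n):
    """Sum of d**power over the positive divisors d of n."""
    return sum(d ** power for d in range(1, n + 1) if n % d == 0)

def zeta_even(k):
    """zeta(k) for even k >= 2, using Euler's formula (an assumption, see lead-in)."""
    Bk = bernoulli(k)[k]
    return abs(float(Bk)) * (2 * math.pi) ** k / (2 * math.factorial(k))

def G_q(k, z, terms=50):
    """G_k(z) from the q-expansion, truncated after `terms` terms."""
    Bk = bernoulli(k)[k]
    q = cmath.exp(2j * math.pi * z)
    series = sum(sigma(k - 1, n) * q ** n for n in range(1, terms + 1))
    return 2 * zeta_even(k) * (1 - float(2 * k / Bk) * series)

def G_lattice(k, z, N=60):
    """G_k(z) from the lattice sum, truncated to |m|, |n| <= N."""
    return sum((m * z + n) ** (-k)
               for m in range(-N, N + 1) for n in range(-N, N + 1) if m or n)

print(G_q(4, 1j), G_lattice(4, 1j))
```

The two values should agree only to a few decimal places, since the lattice sum converges slowly while the q-series converges very quickly when Im(z) is not small.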

Modular functions have been studied very extensively in their own right, apart from their relation to elliptic curves, since they have many applications in number theory and other parts of mathematics. We'll mention a few of the simpler facts about them.

The function j(z)

L-functions of modular forms


It seems we have a little unfinished business to take care of. The concept of a modular function arises naturally in the process of clearing this up. The unfinished business concerns the fact that if L and L′ are two lattices, then the complex tori C/L and C/L′ can be isomorphic (as groups) -- symbolically C/L ≅ C/L′ -- and in fact topologically equivalent ("homeomorphic") as complex manifolds, even if the lattices L and L′ are different. This is an issue because of the maps φ: C/L → E(C) from complex tori to elliptic curves. Such a map is defined using a basis of the lattice L to get a ℘-function having the basis elements as periods, and from that getting the elliptic curve parameterized by ℘(z) and ℘′(z). The question is: if C/L ≅ C/L′, what is the relation between the corresponding elliptic curves E(C) = φ(C/L) and E′(C) = φ(C/L′)?

As a first step in answering this, we observe that there is a simple sufficient condition for C/L ≅ C/L′, namely that L and L′ be homothetic. So suppose L and L′ = λL are homothetic. Let E = φ(C/L) and E′ = φ(C/λL). Finally suppose the equation of E in Weierstrass form is

y^2 = 4x^3 - g_2 x - g_3
and the equation of E′ is
y′^2 = 4x′^3 - g′_2 x′ - g′_3
Then when one works through all the algebra, it turns out that one equation comes from the other by a simple change of variables:
x′ = x/λ^2    y′ = y/λ^3
Furthermore, the coefficients of these equations are related as follows:
g_2 = λ^4 g′_2    g_3 = λ^6 g′_3
Now suppose ω_1 and ω_2 are a basis of L, τ = ω_1/ω_2, and Im(τ) > 0. Let L′ = λL have basis ω′_1 and ω′_2 with τ′ = ω′_1/ω′_2, and Im(τ′) > 0. To say that L′ = λL means we can write λω_1 = aω′_1 + bω′_2 and λω_2 = cω′_1 + dω′_2 for integers a, b, c, d.

If you know a little linear algebra, you will recognize that the coefficients a, b, c, d correspond to a 2×2 matrix
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
that defines a linear transformation between the two lattices. Since this transformation is invertible, the matrix is an element of a matrix group called SL_2(Z), which consists of all 2×2 matrices having integers as entries and determinant, which is ad - bc, equal to 1. (Switching the two rows of such a matrix amounts to interchanging ω_1 and ω_2 and also switching the sign of the determinant. An integer matrix with determinant -1 also has an inverse that is an integer matrix, but avoiding ambiguities like this is one reason for ordering the basis elements so that their ratio has positive imaginary part.)

Several simple but important observations follow immediately. First,

τ = ω_1/ω_2 = λω_1/λω_2 = (aω′_1 + bω′_2) / (cω′_1 + dω′_2) = (aτ′ + b) / (cτ′ + d)
This leads to several other observations. If we take λ = 1/ω_2, then L′ = λL is a lattice equivalent to L that has the basis (τ,1). Hence there is at least one lattice in every equivalence class that has a basis of the form (τ,1) with Im(τ) > 0. In fact, there are many possible bases of this form in each homothety class, as we shall see presently.
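
The relation between the two period ratios is easy to confirm with concrete numbers. The following sketch takes λ = 1 for simplicity, picks an arbitrary (hypothetical) basis of L′ and an arbitrary integer matrix of determinant 1, and checks that the resulting ratio τ equals (aτ′ + b)/(cτ′ + d). The helper names are illustrative.

```python
def mobius(M, t):
    """Apply the fractional linear transformation given by M = ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return (a * t + b) / (c * t + d)

def change_basis(M, w1p, w2p):
    """Return (a*w1' + b*w2', c*w1' + d*w2')."""
    (a, b), (c, d) = M
    return a * w1p + b * w2p, c * w1p + d * w2p

M = ((2, 1), (5, 3))                               # ad - bc = 6 - 5 = 1
w1p, w2p = complex(0.4, 1.3), complex(1.0, 0.2)    # sample basis with Im(w1'/w2') > 0
tau_p = w1p / w2p
w1, w2 = change_basis(M, w1p, w2p)
tau = w1 / w2

print(tau, mobius(M, tau_p))                       # the two values agree
# Rescaling by 1/w2 turns the basis (w1, w2) into (tau, 1).
print(w1 / w2, w2 / w2)
```

Note that Im(τ) = Im(τ′)/|cτ′ + d|^2 when ad - bc = 1, so the new ratio again lies in the upper half plane.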

Unless you have already studied this subject thoroughly, things may seem pretty confusing by now. The primary interest here is elliptic curves, but we seem to have drifted off to talking about things like "lattices" and "complex tori", and to showing a few relationships between them. Of course, there is a method to the madness. Our goal now is to be able to talk not just about a single elliptic curve and its properties, but about the set of all elliptic curves. We will see that there is some structure and order to this set which describe the ways different curves are related to each other. The set of all lattices (over C) and the set of all complex tori have a similar structure to them, and the structure within is related.

So let's go back to the set of all lattices. For any particular lattice L, we know there is a basis consisting of two nonzero complex numbers ω_1 and ω_2. We know that the ratio τ = ω_1/ω_2 is not a real number, and the order of the two numbers can be chosen so that we may assume Im(τ) > 0. However, this choice of basis is hardly unique. Therefore, simply taking pairs of complex numbers which have a ratio whose imaginary part is positive doesn't give us a very helpful "label" to associate with the lattice.

SL_2(Z) is a very important group. It is (almost) a group called the "modular group", which will play an absolutely fundamental role in the theory of elliptic curves that we are leading up to. SL_2(Z) is a subgroup of SL_2(C), the group of all 2×2 matrices with entries in C and determinant 1. This group, in turn, is a subgroup of GL_2(C), the group of all 2×2 matrices with entries in C and nonzero determinant. (The condition on the determinants of the matrices in these groups is what ensures that the matrices have inverses.) These are just a few simple examples of "Lie groups" -- essentially matrix groups that have related algebraic and topological structures. This theory is interesting, extensive, and deep, but we won't go into it further at this point.

Can we put further conditions on basis pairs to substantially reduce the number of distinct basis pairs for a given lattice L? That could be done, but it turns out not to be the right question to ask. Remember, we earlier looked at homothety classes of lattices -- all lattices of the form λL for some lattice L and λ∈C*. We can describe the set of homothety classes in terms of group actions. Here the underlying set S is the set of lattices, and G = C* is the multiplicative group of nonzero elements of C. The set of homothety classes is just the set of equivalence classes of S under the action of C*. The elements of any particular equivalence class are labeled by elements of C* -- a huge set.

The set of homothety classes of lattices is the right set to look at for our purposes, for several reasons. In the first place, we get the same (up to isomorphism) complex torus C/L for any L in the same homothety class. That's a good thing because of the further correspondence of complex tori to elliptic curves. But the set is nice to work with for another reason as well. If S is now the set of homothety classes, then we can define a new and different action of G = SL_2(Z) on this S. But first we have to define an action of SL_2(Z) on the upper half plane H = {z∈C | Im(z) > 0}.


It is customary to define a slightly different form of this:

Δ = -16(27b^2 + 4a^3) = -2^4(27b^2 + 4a^3)

We say that two elliptic curves are isomorphic if they have defining equations which are the same under some change of coordinate system. Since we can always change coordinates to put the equation in the normal form, we only need to work with that form. However, that form still isn't quite unique - there are different equations in normal form that define isomorphic elliptic curves. In other words, there are coordinate transformations that change the coefficients but preserve the normal form. Such transformations thus lead to isomorphic curves which have different discriminants.

However, it turns out that the quantity

j = 12^3 · 4a^3 / (4a^3 + 27b^2) = -12^3 (4a)^3 / Δ
is invariant no matter what normal form of the equation is used. This is called the j-invariant of the elliptic curve. Two elliptic curves are isomorphic if and only if they have the same j-invariant. (The reason for the constant coefficient 1728 = 12^3 is that j, being dependent on the lattice periods ω_1 and ω_2, has an explicit formula in terms of them from which 1728 falls out naturally.)
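
The invariance of j can be checked with exact rational arithmetic. The sketch below assumes that the normal form in question is y^2 = x^3 + ax + b and that the coordinate changes preserving it replace (a, b) by (u^4 a, u^6 b) for a nonzero rational u; both are standard facts but are assumptions as far as this section is concerned. Function names are illustrative.

```python
from fractions import Fraction

def discriminant(a, b):
    """Delta = -16(4a^3 + 27b^2) for y^2 = x^3 + a*x + b."""
    return -16 * (4 * a ** 3 + 27 * b ** 2)

def j_invariant(a, b):
    """j = 12^3 * 4a^3 / (4a^3 + 27b^2), computed exactly."""
    a, b = Fraction(a), Fraction(b)
    return 1728 * 4 * a ** 3 / (4 * a ** 3 + 27 * b ** 2)

def rescale(a, b, u):
    """Coefficients after a change of coordinates with scaling factor u."""
    return a * u ** 4, b * u ** 6

a, b, u = Fraction(1), Fraction(1), Fraction(3, 2)
a2, b2 = rescale(a, b, u)
print(j_invariant(a, b), j_invariant(a2, b2))     # same value
print(discriminant(a, b), discriminant(a2, b2))   # differ by the factor u**12
```

The discriminant changes by the factor u^12 while j stays fixed, which is the sense in which j, unlike Δ, is an invariant of the curve.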

Although the discriminant of a defining polynomial isn't an invariant of an elliptic curve, it is close. It happens that there is a related quantity called the minimal discriminant that is invariant. If we consider all Weierstrass equations with integer coefficients for the same elliptic curve, we can choose one whose discriminant is smallest in absolute value. That discriminant is the minimal discriminant.

The most important fact about the minimal discriminant is that the primes which divide it are precisely the ones at which the curve has bad reduction. In other words, except for those primes, the reduced curve is an elliptic curve over Fp.

There is still another invariant of an elliptic curve E, called its conductor, and often denoted simply by N. The exact definition is rather technical, but basically the conductor is, like the minimal discriminant, a product of primes at which the curve has bad reduction. Recall that E has bad reduction when it has a singularity modulo p. The type of singularity determines the power of p that occurs in the conductor. If the singularity is a "node", corresponding to a double root of the polynomial, the curve is said to have "multiplicative reduction" and p occurs to the first power in the conductor. If the singularity is a "cusp", corresponding to a triple root, E is said to have "additive reduction", and p occurs in the conductor with a power of 2 or more.

If the conductor of E is N, then it will turn out that N is the "level" of certain functions called modular forms (not yet defined) with which E is intimately connected.

We might add a few more words about the j-invariant. It is a complex number that characterizes elliptic curves up to isomorphism: two curves are isomorphic if and only if they have the same j-invariant. Not only that, but for any complex value (including 0), there actually exists an elliptic curve with a j-invariant equal to that value. So there is a 1-1 correspondence between (isomorphism classes of) elliptic curves and C.

Now, we have already seen that an elliptic curve as a complex torus is essentially determined by the period lattice of the ℘ function that parameterizes the curve. More precisely, two tori are isomorphic if and only if their corresponding lattices are "similar", that is, if and only if one is obtained from the other by a "homothety", i. e. multiplication by a non-zero complex number.

But there is another way to characterize similar lattices. Suppose we have two lattices. Each has a Z-basis of the form {ω_1, ω_2}. Applying a homothety, we can just consider the period ratios and assume the two bases are {1, τ}, {1, τ′}, with both τ and τ′ in the upper half plane H = {z | Im(z) > 0}. These define similar lattices if and only if τ and τ′ are related by a transformation in SL_2(Z). This latter is essentially what is known as the "modular group" Γ. So there is a 1-to-1 correspondence between similarity classes of lattices and elements of H/Γ.

In summary, there are 1:1 correspondences between each of the following

Returning to the j-invariant, it is the 1:1 map between isomorphism classes of elliptic curves and C. But by the above it can also be viewed as a 1:1 map j: H/Γ → C. j is therefore an example of what is called a modular function. We'll see a lot more of modular functions and the modular group. These facts, which have been known for a long time, are the first hints of the deep relationship between elliptic curves and modular functions.


Modular elliptic curves


Kronecker's Jugendtraum

Leopold Kronecker (1823-91) was one of the greatest number theorists of the 19th century. Unfortunately (for him) he is often remembered for his ultra-conservative philosophical views about the nature of "valid" mathematics. For instance, he was violently opposed to the revolutionary concepts of set theory introduced by his student Georg Cantor (1845-1918). He insisted that the only valid foundation for mathematics was the integers, and that nothing which could not be constructed out of the integers was worthy of consideration. He had no use for proposed theories of irrational numbers, such as those due to Richard Dedekind (1831-1916) and others. Ironically, the topic we're going to discuss here, which Kronecker pioneered, illustrates a profound connection between discrete algebraic objects ("algebraic numbers") and continuous analytic objects (Weierstrass ℘-functions in this case).

In spite (or perhaps because) of his philosophical idiosyncrasies, Kronecker made fundamental contributions to number theory, as we shall see. In the part of the theory we're going to discuss, Kronecker was fascinated by a rather deep fact about algebraic numbers. In technical terms, this is the fact that every "abelian" algebraic extension of the rational numbers Q is contained in an extension of Q generated by "roots of unity", a so-called "cyclotomic extension".

Algebraic number theory

To proceed, we'll take a brief detour from elliptic curves and recall some terminology and facts from algebraic number theory (which we deal with more extensively elsewhere). To begin with, an "algebraic number" is simply any root of a polynomial equation f(x)=0, where f(x) has coefficients in the rational numbers. The set of all such polynomials is denoted by Q[x]. If
f(x) = a_n x^n + ... + a_1 x + a_0
where the coefficients a_i are in Q and a_n ≠ 0, then we say that n is the degree of the polynomial. If f(x) has degree n and a_n = 1, the polynomial is said to be "monic".

Algebraic integers are an important special case of algebraic numbers. By definition, an algebraic integer is the root of a polynomial equation f(x)=0 where all coefficients of f(x) are integers and f(x) is monic. This includes ordinary integers in the case that f(x) has degree 1. As defined, such algebraic integers have been found to be the most natural generalization of the ordinary integers Z. Many problems of Diophantine equations which require solutions to be found in Z (or possibly Q) are best analyzed in terms of algebraic numbers. This was very true, for example, with Fermat's equation x^n + y^n = z^n. There were various problems understanding how to deal rigorously with such equations until the properties of general algebraic numbers were well understood.

Q and C are examples of mathematical systems called fields, as we mentioned early on. Fields have two laws of composition which correspond to ordinary addition and multiplication. A field has a group structure with respect to both laws of composition, except that 0 (alone) lacks a multiplicative inverse. A ring is very similar to a field, except that not all ring elements need have multiplicative inverses. The integers Z make up a very typical example of a ring (a commutative ring, since the operation of multiplication is commutative in Z).

Algebraic numbers can always be regarded as being elements of some field F intermediate between Q and C, i. e. where Q⊆F⊆C. Any time two fields F and K are related such that F⊆K, one says that F is a subfield of K, and K is an extension field of F. Any algebraic number α is contained in some field F that is an extension of Q. There is always a smallest subfield F⊆C that contains α, in the sense that F is contained in any other field that also contains α. (C necessarily contains such a field, since the "fundamental theorem of algebra" says that C contains the roots of all f(x)∈Q[x].) This smallest field is written Q(α). It consists of all quotients f(α)/g(α), where f(x) and g(x) are polynomials in Q[x]. Q(α) is said to be obtained by "adjoining" α to Q.

The set of all algebraic integers that are contained in a particular field F also form a ring, called the ring of integers of F. Such rings are natural generalizations of Z, and make up a large part of the subject matter of algebraic number theory.

For a long time in the history of algebra, mathematicians hoped to be able to express the solutions of polynomial equations f(x)=0 by means of "radicals", that is, using expressions involving only the usual arithmetic operations plus extraction of roots (square roots, cube roots, etc.). Finally in 1824, Niels Henrik Abel (1802-1829) showed that not all equations involving polynomials of fifth degree or higher could be solved by radicals. This negative result was unfortunate. It had been found that all quadratic, cubic, and quartic equations could be solved by radicals. Although the general solutions of such equations could be unwieldy (especially for quartics), an inability to express some solutions using radicals at all was even more inconvenient for practical and theoretical computation alike. The abstract theory of algebraic number fields, however, eventually more than made up for lack of explicit solvability by radicals -- at least for theoretical purposes.

For both practical and theoretical reasons, mathematicians wanted convenient ways to express solutions of polynomial equations. One alternative which sometimes made up for the lack of expression of solutions by radicals was the use of "roots of unity" to express solutions. These were especially useful, for example, in the case of Fermat's equation. A root of unity, conventionally denoted by the symbol ζ (Greek zeta), is defined to be some solution of an equation having the simple form x^n - 1 = 0, for some positive integer n. 1 is always a solution, of course. -1 is a solution of x^2 - 1 = 0. i = √-1 is a solution of x^4 - 1 = 0. The fundamental theorem of algebra says that x^n - 1 = 0 always has n roots in C. In general, the roots of a polynomial may be repeated and therefore not distinct (which is the case of the equation (x-1)^2 = 0, for example), but for x^n - 1 = 0 the roots are known to be distinct, and can in fact be expressed in the form ζ = e^{2πik/n}, for integers k, 0≤k<n. ζ is said to be a primitive nth root of unity if n is the smallest positive integer such that ζ^n = 1. -1, for instance, is a fourth root of unity but, unlike i, not a primitive fourth root of unity.
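
The roots of unity are simple to list numerically. The sketch below uses the formula ζ = e^{2πik/n} from the text, together with the standard fact (not stated above, so treat it as an assumption here) that e^{2πik/n} is primitive exactly when k and n have no common factor. Function names are illustrative.

```python
import cmath
from math import gcd, pi

def roots_of_unity(n):
    """All n solutions of x^n - 1 = 0, as floating-point complex numbers."""
    return [cmath.exp(2j * pi * k / n) for k in range(n)]

def primitive_roots_of_unity(n):
    """Those nth roots of unity e^(2*pi*i*k/n) with gcd(k, n) = 1."""
    return [cmath.exp(2j * pi * k / n) for k in range(n) if gcd(k, n) == 1]

for z in roots_of_unity(4):
    print(z, abs(z ** 4 - 1))        # each residual is at rounding-error level
print(len(primitive_roots_of_unity(12)))
```

For n = 4 the primitive roots are (up to rounding) i and -i, in agreement with the remark that -1 is a fourth root of unity but not a primitive one.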

What does it mean to raise the number e to any power, especially one involving arbitrary complex numbers? The answer is that it is done rigorously with an infinite series. In fact, the series

e^z = ∑_{0≤n<∞} z^n/n!
can be shown to converge for any z∈C.
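
A partial sum of this series already gives a good approximation for moderate |z|. The following minimal sketch compares a 40-term partial sum with the library function cmath.exp; the cutoff of 40 terms is an arbitrary choice made for illustration.

```python
import cmath

def exp_series(z, terms=40):
    """Partial sum of sum_{n >= 0} z^n / n!, using the first `terms` terms."""
    total = 0j
    term = 1 + 0j               # z^0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)     # turn z^n / n! into z^(n+1) / (n+1)!
    return total

z = 2j * cmath.pi / 5           # exponent of a primitive 5th root of unity
print(exp_series(z), cmath.exp(z))
```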

The Kronecker-Weber theorem

Various mathematicians noticed that many algebraic numbers could be expressed in terms of sums of nth roots of unity for some n. C. F. Gauss (1777-1855), in particular, made many discoveries about such sums, which therefore became known as "Gauss sums". The question therefore arose as to exactly what sorts of algebraic numbers could be expressed in terms of nth roots of unity. It was Kronecker's achievement to come up with an answer to this question, in 1853. The proof of his theorem wasn't quite airtight, but it was completed by Heinrich Weber (1842-1913) in 1886. Accordingly, the theorem is usually called the Kronecker-Weber theorem.

Suppose α is an algebraic number (not necessarily an algebraic integer). The question to be answered -- whether α is expressible in terms of nth roots of unity for some n -- is equivalent to the question of whether α is a member of some field of the form Q(ζ), where ζ is a primitive nth root of unity. Such a field is called a "cyclotomic field". However, this reformulation is somewhat tautological and doesn't really help to answer the basic question.

What Kronecker realized, and partially proved, was that an equivalent condition for α to be expressible in terms of nth roots of unity was this: the "splitting field" F of the "minimal polynomial" f(x) of α should be an "abelian extension" of Q.

What do those new terms mean? The "minimal polynomial" of α is the monic polynomial f(x)∈Q[x] of least degree such that f(α) = 0. The "splitting field" of f(x) is the smallest extension F⊇Q that contains all the roots of f(x) (including α itself). Finally, F is an "abelian extension" of Q if its "Galois group" G(F/Q) is an abelian group.

What is a Galois group? That's a longer story -- Galois theory, which is fascinating and not really all that difficult, but it takes a fair amount of explanation anyhow. We offer a fuller explanation elsewhere. But essentially a Galois group is a group of permutations of the roots of a polynomial f(x) that induce an automorphism of the splitting field of f(x). Although f(x) has as many roots as its degree, not all permutations of those roots necessarily induce an automorphism of the splitting field. The Galois group G(F/Q) thus encodes information about the structure of intermediate fields E such that Q⊆E⊆F.

What the Kronecker-Weber theorem does for us is give a more computable way of determining whether an algebraic number α can be expressed in terms of nth roots of unity. The procedure is roughly: find the minimal polynomial f(x) for α and determine whether the Galois group of the splitting field of f(x) is abelian.

The Kronecker-Weber theorem can also be stated simply as a theorem about fields: Every abelian extension of Q is contained in a cyclotomic field. This form is more convenient for exploring generalizations. In particular, if you take Q and adjoin all nth roots of unity for each n, the resulting field will be an abelian extension of Q which is the maximal abelian extension of Q, because it contains all other abelian extensions.
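
A classical instance of the theorem is the quadratic field Q(√p) for a prime p ≡ 1 (mod 4), which lies inside the cyclotomic field generated by a primitive pth root of unity ζ. The sketch below checks numerically Gauss's evaluation of the quadratic Gauss sum, ∑ (a/p) ζ^a = √p for p ≡ 1 (mod 4) and i√p for p ≡ 3 (mod 4). That evaluation is a standard result not stated in this text, and the function names are illustrative.

```python
import cmath
import math

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_sum(p):
    """Sum of (a/p) * zeta^a for a = 1..p-1, where zeta = e^(2*pi*i/p)."""
    zeta = cmath.exp(2j * math.pi / p)
    return sum(legendre(a, p) * zeta ** a for a in range(1, p))

print(gauss_sum(5), math.sqrt(5))        # sqrt(5) = zeta + zeta^4 - zeta^2 - zeta^3
print(gauss_sum(7), 1j * math.sqrt(7))
```

So √5 is an integer combination of 5th roots of unity, which exhibits Q(√5) as a subfield of a cyclotomic field.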

The Jugendtraum

Now, one can imagine various different ways to generalize the theorem. Such generalizations are part of the rather difficult subject known as "class field theory". (Indeed, the Kronecker-Weber theorem, which was originally a major effort to prove, is a simple consequence of more general class field theory.) There was, however, one generalization that Kronecker perceived to be within reach, on the basis of his vast expertise in algebraic number theory. Evidently he intuited it even as a young man, because he called it his Jugendtraum -- a "dream of youth".

The generalization involves considering abelian extensions of a field other than Q. Specifically, are there other fields F for which one can say that all abelian extensions of F must be contained in some relatively simple and easily described type of field? The answer embodied in the Jugendtraum is yes -- if the field F is an imaginary quadratic extension of Q -- in which case all abelian extensions of F will be contained in an extension of Q generated by certain values of elliptic functions.

This is where we return to the theory of elliptic curves. We shall need to consider a class of elliptic curves which have a very special property called "complex multiplication". Suppose in this section that E is an elliptic curve that is isomorphic to the complex torus C/L, where L has basis ω_1 and ω_2 and τ = ω_1/ω_2.

For any n∈Z, note that nL⊆L. Unless n=±1, clearly nL≠L. In fact, for any real number t≠±1, tL≠L. Our previous discussion of homothety raises the question of whether there are any lattices L that are homothetic to themselves, with (nonreal) λL=L for some λ∈C. More generally, under what conditions is λL⊆L if λ is not an integer?

This can be answered with a simple computation. Assume λ∉Z. If λL⊆L, then there are integers a, b, c, d such that

λω_1 = aω_1 + bω_2    and    λω_2 = cω_1 + dω_2
and so, dividing both equations by ω_2,
λτ = aτ + b    and    λ = cτ + d
Multiplying the second equation by τ and subtracting the first equation gives
cτ^2 + (d-a)τ - b = 0
We know c≠0, for otherwise λ=d, contrary to the assumption λ∉Z. And so τ must be an element of the field Q(√D), where D = (d-a)^2 + 4bc, by the quadratic formula. τ can't be real, so we must have D<0, and τ must be a quadratic imaginary number.

The equation λ = cτ + d says that λ is also a quadratic imaginary number in Q(√D), and in fact similar manipulations show that λ satisfies

λ^2 - (a+d)λ + ad - bc = 0
So λ is actually an algebraic integer in Q(√D).
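
The computation can be replayed with explicit integers. In the sketch below, a, b, c, d are chosen freely subject to c ≠ 0 and D < 0, τ is taken to be the root of the quadratic with positive imaginary part, and the three relations derived above are checked numerically. The function names are hypothetical.

```python
import cmath

def complex_multiplier(a, b, c, d):
    """Return (tau, lam) with c*tau^2 + (d-a)*tau - b = 0, Im(tau) > 0, lam = c*tau + d."""
    D = (d - a) ** 2 + 4 * b * c
    if c == 0 or D >= 0:
        raise ValueError("need c != 0 and D = (d-a)^2 + 4bc < 0")
    tau = (-(d - a) + cmath.sqrt(D)) / (2 * c)
    if tau.imag < 0:
        tau = tau.conjugate()       # the other root of the same real quadratic
    return tau, c * tau + d

def residuals(a, b, c, d):
    """Sizes of the defects in the three relations; all should be ~0."""
    tau, lam = complex_multiplier(a, b, c, d)
    return (abs(lam * tau - (a * tau + b)),
            abs(c * tau ** 2 + (d - a) * tau - b),
            abs(lam ** 2 - (a + d) * lam + (a * d - b * c)))

print(complex_multiplier(0, -1, 1, 0))   # tau = i, lam = i
print(residuals(1, -2, 1, 0))            # tau = (1 + sqrt(-7))/2
```

The first example is the square lattice with basis (i, 1), where multiplication by λ = i is a quarter turn mapping the lattice to itself.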

Whenever we have the situation λL⊆L with λ∉Z, we say that the elliptic curve corresponding to C/L has "complex multiplication". In such a case, the isomorphism φ: E → C/L can be used to define an analytic map f: E → E of the curve to itself by the rule f(z) = φ^{-1}(λφ(z)). Such a map is called an "endomorphism" of E. Endomorphism maps are heavily used in the theory of elliptic curves and generalizations.

In this terminology, we've shown that if E has complex multiplication by λ∉Z, then both λ and the period ratio τ lie in a quadratic imaginary field Q(√D), and λ is actually an integer of the field.

What about a converse? Suppose Q(√D) is a quadratic imaginary field. Then for any τ∈Q(√D) but τ∉Q, aτ^2 + bτ + c = 0 with integers a, b, c, and a≠0. It follows that (aτ)τ = -bτ - c. Let λ = aτ. Then for any lattice L with basis ω_1 and ω_2 and τ = ω_1/ω_2,

λω_1 = -bω_1 - cω_2    and    λω_2 = aω_1
and so λL⊆L. Hence if E corresponds to C/L, it has complex multiplication by λ.
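
The converse can also be tested on a concrete lattice. The sketch below takes integers a, b, c with b^2 - 4ac < 0, uses the basis (τ, 1), and checks that λ = aτ sends a general lattice point mτ + n to another lattice point. The integer coordinates (na - mb, -mc) used in the check follow from the two relations just displayed; the function names are illustrative.

```python
import cmath

def cm_from_quadratic(a, b, c):
    """Return (tau, lam): tau is the root of a*t^2 + b*t + c with Im > 0, lam = a*tau."""
    D = b * b - 4 * a * c
    if a == 0 or D >= 0:
        raise ValueError("need a != 0 and b^2 - 4ac < 0")
    tau = (-b + cmath.sqrt(D)) / (2 * a)
    if tau.imag < 0:
        tau = tau.conjugate()
    return tau, a * tau

def image_coordinates(a, b, c, m, n):
    """Integer coordinates of lam*(m*tau + n) with respect to the basis (tau, 1)."""
    return n * a - m * b, -m * c

a, b, c = 1, 1, 1                      # tau is a primitive cube root of unity
tau, lam = cm_from_quadratic(a, b, c)
m, n = 2, 3
p, q = image_coordinates(a, b, c, m, n)
print(lam * (m * tau + n), p * tau + q)    # the same lattice point, computed two ways
```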

The net result is that imaginary quadratic fields and elliptic curves have a "special" kind of relationship. Kronecker noticed this, and believed that he could generalize his earlier theorem about abelian extensions of Q to (a conjecture about) abelian extensions of imaginary quadratic fields. In this generalization he used special values of elliptic functions instead of special values of the exponential function. Kronecker made this conjecture in 1860, but he couldn't fully prove it. Weber completed the proof in 1891.


Open questions

Various questions about elliptic curves which are still open have been mentioned above. We'll summarize them here, and we'll mention some other fascinating questions there hasn't been time to cover in the same detail.

Existence of rational points

Perhaps the simplest question of all is still open: Is there a way to determine whether a given elliptic curve defined by a polynomial with rational coefficients has any rational points at all? More specifically, is there any algorithm for determining in a finite number of steps whether a given cubic curve has a rational point?

For quadratic curves (conics) there is a theorem that goes back to Legendre for effectively deciding whether a rational conic has a rational point. This has been given a much more elegant form by Hasse which states that the question can be answered by (relatively easy) tests for solutions in R and modulo p for all primes p -- the "Hasse principle". There are counterexamples to show that this doesn't work for cubic curves.

Mordell's theorem says that the group of rational points E(Q) of an elliptic curve over Q is finitely generated. (Though the group might be trivial, consisting only of the identity, a single "point at infinity".) The group therefore is the product of a finite group E(Q)_t (the torsion subgroup, of points having finite order) and 0 or more infinite cyclic groups (copies of Z). The rank is defined as the number of copies of Z, a finite number.

The Nagell-Lutz theorem shows that there is an effective algorithm to compute the rational points of finite order in E(Q), so it can be effectively determined whether such rational points exist. But there may be no nontrivial points of finite order, and so the hard problem, for which a solution is not known, is how to determine effectively whether a curve has nonzero rank.
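The torsion computation can be sketched in a few lines of Python. The sketch assumes the curve is given as y^2 = x^3 + a*x + b with integer a and b, and it uses the strong form of Nagell-Lutz: a rational point of finite order has integer coordinates, and either y = 0 or y^2 divides D = 4a^3 + 27b^2. That only produces candidates, so each one is confirmed by computing its multiples. The cutoff of 12 relies on Mazur's theorem that no rational torsion point has order greater than 12, a fact brought in here and not discussed above. All function names are illustrative, and the brute-force searches are only sensible for small coefficients.

```python
from fractions import Fraction
from math import isqrt

O = None  # the point at infinity (identity of the group)


def add(P, Q, a):
    """Chord-and-tangent addition on y^2 = x^3 + a*x + b (b is not needed)."""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 + y2 == 0:
        return O  # P and Q are inverses (includes doubling a point with y = 0)
    if P == Q:
        m = (3 * x1 * x1 + a) / (2 * y1)  # slope of the tangent line
    else:
        m = (y2 - y1) / (x2 - x1)         # slope of the chord
    x3 = m * m - x1 - x2
    return (x3, m * (x1 - x3) - y1)


def order(P, a, max_order=12):
    """Order of P if it is at most max_order, else None."""
    Q = P  # invariant: Q == n*P at the top of iteration n
    for n in range(1, max_order + 1):
        if Q is O:
            return n
        Q = add(Q, P, a)
    return None


def torsion_points(a, b):
    """All rational points of finite order on y^2 = x^3 + a*x + b, a and b integers."""
    D = 4 * a**3 + 27 * b**2
    if D == 0:
        raise ValueError("singular curve")
    points = [O]
    for y in range(isqrt(abs(D)) + 1):
        if y != 0 and D % (y * y) != 0:
            continue  # Nagell-Lutz: y = 0 or y^2 divides D
        c = b - y * y
        bound = 1 + max(abs(a), abs(c))  # Cauchy bound on roots of x^3 + a*x + c
        for x in range(-bound, bound + 1):
            if x**3 + a * x + c == 0:
                for yy in {y, -y}:
                    P = (Fraction(x), Fraction(yy))
                    if order(P, a) is not None:
                        points.append(P)
    return points
```

On y^2 = x^3 + 1 this finds six points (a cyclic group of order 6, generated by (2, 3)), and on y^2 = x^3 - x it finds the four points O, (0, 0), (1, 0), (-1, 0). On y^2 = x^3 - 2 it finds only O, even though (3, 5) lies on the curve: that point has infinite order, which is exactly the kind of point Nagell-Lutz cannot detect.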

The rank of elliptic curves

There is no effective algorithm known for computing the rank. Techniques used to prove Mordell's theorem often work in particular cases, but there's no proof they work in all cases.

Not much is known theoretically about the rank either. The conjecture of Birch and Swinnerton-Dyer is the main idea of interest in that direction. Very little is known about what values are possible for the rank of an elliptic curve. The ranks that have actually been computed in special cases are all very small. It is not known which integers can be the rank of an elliptic curve or even whether the rank can be arbitrarily large.

As of 2000, elliptic curves have been discovered whose rank must be at least 24, but no curve is known whose rank is provably larger than that.

The Birch and Swinnerton-Dyer conjecture

In one of its simpler forms the Birch and Swinnerton-Dyer conjecture states that the rank of an elliptic curve E is the order to which the L-function L(E,s) vanishes at s=1. The order is the smallest integer n ≥ 0 such that lim_{s→1} L(E,s)/(s-1)^n is nonzero. The conjecture was conceived from the observation of a number of special cases, and in its simplest form merely stated that an elliptic curve has infinitely many rational points if and only if L(E,1) = 0.
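The special cases in question were numerical experiments. In its original form the conjecture concerned the number N_p of points on the curve over the finite field F_p: the product of N_p/p over primes p up to X appeared to grow like a constant times (log X)^r, with r the rank. The following Python sketch reproduces that kind of experiment for a curve y^2 = x^3 + a*x + b with integer coefficients. The function names are made up for this illustration, the point count is a naive O(p) loop, and primes of bad reduction (and p = 2) are simply skipped, which is a simplification.

```python
from math import isqrt


def count_points(a: int, b: int, p: int) -> int:
    """Number of points on y^2 = x^3 + a*x + b over F_p, counting the point at infinity.

    Intended for odd primes p of good reduction.
    """
    # roots[t] = number of y in F_p with y^2 = t
    roots = [0] * p
    for y in range(p):
        roots[y * y % p] += 1
    return 1 + sum(roots[(x**3 + a * x + b) % p] for x in range(p))


def primes_up_to(n: int) -> list:
    """Primes <= n by the sieve of Eratosthenes (n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def bsd_product(a: int, b: int, bound: int) -> float:
    """Product of N_p / p over odd primes p <= bound of good reduction."""
    D = 4 * a**3 + 27 * b**2
    product = 1.0
    for p in primes_up_to(bound):
        if p == 2 or D % p == 0:
            continue  # skipped in this sketch: bad reduction
        product *= count_points(a, b, p) / p
    return product
```

Comparing bsd_product(a, b, X) for increasing X on a curve with only finitely many rational points against one with a point of infinite order gives a feel for the observation: the product should stay roughly bounded in the first case and drift upward in the second. The convergence is slow and erratic, so this is evidence of the pattern, not a way to compute the rank.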

The conjecture has been partially verified. Namely, it has been proven that L(E,1) ≠ 0 implies E has rank 0, so E(Q) is finite. It has also been shown that if L(E,s) has a zero of order 1 at s=1, then the rank of E is 1. From these facts it follows that if E(Q) is finite, then L(E,s) can't have a zero of order 1 at s=1, but for all we know it could have a higher order zero. Also, if E has rank 1, then L(E,1) = 0, but the order of the zero could be higher than 1.

Although elliptic curves are known that must have rank at least 24, no elliptic curve has yet been found whose L-function is proven to have a zero of order higher than 3 at s=1. Curves for which L(E,s) is proven to have zeros of order 1, 2, or 3 are known, and (needless to say?) in these cases the actual rank of the curve is consistent with the conjecture.

The sharpest form of the conjecture actually gives a formula for the value of lim_{s→1} L(E,s)/(s-1)^r if the rank of E(Q) is r (i.e., the coefficient of the term involving (s-1)^r in the Taylor series expansion of L(E,s) about s=1). All the terms in this formula are fairly well understood, except for one, which is thought to be the order of a finite group, the Tate-Shafarevich group Ш(E/Q).
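For reference, the conjectured formula is usually written as follows. The symbols other than L(E,s), r, and Ш(E/Q) are not defined in the discussion above and are introduced only for this display: Ω_E is the (suitably normalized) real period of E, Reg(E) is the regulator built from the canonical heights of a basis of the free part of E(Q), the c_p are the local Tamagawa numbers at the primes of bad reduction, and E(Q)_tors is the torsion subgroup.

```latex
\lim_{s \to 1} \frac{L(E,s)}{(s-1)^{r}}
  \;=\;
  \frac{\Omega_E \cdot \operatorname{Reg}(E) \cdot \#\text{Ш}(E/\mathbb{Q}) \cdot \prod_{p} c_p}
       {\bigl(\#E(\mathbb{Q})_{\mathrm{tors}}\bigr)^{2}}
```

Every factor on the right except #Ш(E/Q) can in principle be computed for a given curve, which is why the Tate-Shafarevich group is singled out as the poorly understood term.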

The Shafarevich conjecture

Unfortunately, the Tate-Shafarevich group is not even known to be finite. The Shafarevich conjecture says, simply, that it is. Of course, this is a necessary condition for the strongest form of the conjecture of Birch and Swinnerton-Dyer even to make sense.

The precise definition of Ш(E/Q) is rather technical. But it may be considered a kind of measure of the degree to which the Hasse principle fails to be true. This principle says that the nature of the group of rational points E(Q) should essentially be determined by the nature of the points E(F_p) of the curve over the finite fields F_p for all primes p.

A few facts are known about Ш(E/Q). For example, it is known to be finite if L(E,s) has a zero of order at most 1 at s=1. However, although no examples are known where Ш(E/Q) is infinite, neither is there a single example where it has been proven finite when L(E,s) has a zero of order more than 1 at s=1.

Truth of the Shafarevich conjecture is a necessary but not sufficient condition for the strong form of the Birch and Swinnerton-Dyer conjecture. Its truth, however, is sufficient for a weaker conjecture, known as the parity conjecture. There is a number w_E which occurs in the functional equation of L(E,s) and has the value ±1. It follows from that equation that the order of the zero of L(E,s) at s=1 is even or odd according as w_E is +1 or -1. The parity conjecture says that the rank of E is even or odd in the same way.
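The step from the functional equation to the parity of the zero can be written out briefly. The sketch below assumes the usual normalization of the completed L-function, with N denoting the conductor of E. That notation is supplied here for the derivation and is not fixed by the paragraph above.

```latex
% Completed L-function and its functional equation
\Lambda(E,s) = N^{s/2}\,(2\pi)^{-s}\,\Gamma(s)\,L(E,s),
\qquad
\Lambda(E,s) = w_E\,\Lambda(E,2-s), \quad w_E = \pm 1 .

% Put s = 1 + t and suppose \Lambda vanishes to order n at s = 1:
\Lambda(E,1+t) = c\,t^{n} + O(t^{n+1}), \quad c \neq 0
\;\Longrightarrow\;
c\,t^{n} + \cdots = w_E\,c\,(-t)^{n} + \cdots
\;\Longrightarrow\;
w_E = (-1)^{n} .

% The factor N^{s/2}(2\pi)^{-s}\Gamma(s) is finite and nonzero at s = 1, so
\operatorname{ord}_{s=1} L(E,s) = n,
\qquad
\text{and the parity conjecture asserts } (-1)^{\operatorname{rank} E(\mathbb{Q})} = w_E .
```

In particular, w_E = -1 forces L(E,1) = 0 without any computation of the L-function itself.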

Beilinson's conjectures

p-adic analogues

Moonshine

The Langlands program

The generalized Riemann hypothesis



Recommended references: Web sites

Site indexes

Open Directory Project: Elliptic Curves and Modular Forms
Categorized and annotated number theory links. A version of this list is at Google, with entries sorted in "page rank" order.
Galaxy: Elliptic Curves and Modular Forms
Categorized site directory. Entries usually include descriptive annotations.


Elliptic curves

Elliptic Curves
Many links to pages about elliptic curves in general and with respect to algorithms, cryptography, modular forms, L-functions, etc. By Stéfane Fermigier.
Some Interesting References on Elliptic Curves
Reference information includes mathematicians who have worked on elliptic curves, bibliographies, software, and external links. By Marc Joye.
Curving Beyond Fermat
November 1999 article by Ivars Peterson on the recent complete proof of the Taniyama-Shimura conjecture.
Elliptic Curves and Right Triangles
Very nicely done set of slides by Karl Rubin providing an elementary introduction to elliptic curves and their relation to an area problem for right triangles.
Elliptic Curves
Course notes from an introductory overview, available in DVI, PS, and PDF formats. By J. S. Milne.
MA 426 Elliptic curves
Materials for a course at the University of Warwick (UK) by Miles Reid. Includes a detailed syllabus. Most items are in Postscript format.
Basic algorithms on elliptic curves
This is a directory containing Postscript and DVI files. By Horst Zimmer.
The Arithmetic of Elliptic Curves and Diophantine Equations
Good technical expository paper by Loïc Merel. Discusses application of elliptic curve theory to various Diophantine equations. (In PDF format.)
On computing the rank of elliptic curves
An undergraduate thesis by Jeff Achter in Postscript format.


Modular functions and modular forms

Modular Functions and Modular Forms
Course notes from an introduction to the arithmetic theory of modular functions and modular forms, available in DVI, PS, and PDF formats. By J. S. Milne.
Bibliography for automorphic and modular forms, L-functions, and representation theory
Extensive list of papers and books, by Paul Garrett. See also his Vignettes on automorphic forms, representations, L-functions, and number theory for a number of downloadable articles.
Modular forms
Lecture notes by Igor Dolgachev, PDF format.
William A. Stein's Homepage
Has a variety of resources for both modular forms and elliptic curves, such as papers, informal talks, a database of modular forms (computations), and some external links.


The Birch and Swinnerton-Dyer conjecture

The Birch and Swinnerton-Dyer Conjecture
Brief description of the problem at the Clay Mathematics Institute site by Andrew Wiles. (A more complete description is available as a PDF file.)
Birch and Swinnerton-Dyer conjecture
Article from Wikipedia.
Birch and Swinnerton-Dyer conjecture
Very brief article from PlanetMath.Org.


Recommended references: Magazine/journal articles

Ranks of Elliptic Curves
Karl Rubin; Alice Silverberg
Bulletin of the AMS, October 2002, pp. 455-474
The "rank" of an elliptic curve measures the size of its set of rational points. This survey discusses the Birch and Swinnerton-Dyer conjecture, the parity conjecture, and ways to find curves with large rank.
[Abstract, references, downloadable text]
A Proof of the Full Shimura-Taniyama-Weil Conjecture Is Announced
Henri Darmon
Notices of the AMS, December 1999, pp. 1397-1401
Christophe Breuil, Brian Conrad, Fred Diamond, and Richard Taylor have announced a proof of the full conjecture for elliptic curves over the rationals. The establishment of this as a theorem is extremely important for the theory of elliptic curves and for the Langlands program in general.
[Article in PDF format]
Galois representations and modular forms
Kenneth A. Ribet
Bulletin of the AMS, October 1995, pp. 375-402
Discusses material which is related to the recent proof of Fermat's Last Theorem: elliptic curves, modular forms, Galois representations and their deformations, Frey's construction, and the conjectures of Serre and of Weil. (Ribet proved an important result that led to Wiles' proof.)
[Article in downloadable formats]
A Report on Wiles' Cambridge Lectures
K. Rubin; A. Silverberg
Bulletin of the AMS, July 1994, pp. 15-38
Central to Andrew Wiles' proof of Fermat's Last Theorem was a proof of a special case of the Taniyama-Shimura conjecture that elliptic curves over Q are modular. Some of the mathematics involved in that proof is explained.
On the Passage from Local to Global in Number Theory
B. Mazur
Bulletin of the AMS, July 1993, pp. 14-50
The Tate-Shafarevich group is defined for abelian varieties over Q, of which elliptic curves are an important special case. A long-standing conjecture is that such groups are finite. The conjecture has been proved for some elliptic curves over number fields. This has implications for the relationship between local results (at individual primes) and global results in number theory, and other problems, such as the Birch-Swinnerton-Dyer conjecture.
L-Series of Elliptic Curves, the Birch-Swinnerton-Dyer Conjecture, and the Class Number Problem of Gauss
D. Zagier
Notices of the AMS, November 1984, pp. 739-743
Poincaré realized that the solvability of Diophantine equations of the form f(x,y) = 0 depends on the topology of the surface f(x,y)=0 as a curve with complex-valued coordinates, where f(x,y) is a polynomial. It turns out that when f(x,y) is a cubic defining an elliptic curve, there are surprising connections to Gauss' class number problem for binary quadratic forms.


Recommended references: Books

The subject of elliptic curves is, obviously, both very extensive and very technical. As such, there are currently no books on the subject which are easily accessible to the general reader. The following books present a number of good choices for those who have at least some background in college-level mathematics.
Henry McKean; Victor Moll -- Elliptic Curves: Function Theory, Geometry, Arithmetic
Cambridge University Press, 1997
The authors offer a refreshingly clean and efficient introduction to the subject, with the right level of detail but little inessential abstraction. Various applications of the theory are included, such as a chapter on imaginary quadratic number fields and Kronecker's Jugendtraum.
Joseph H. Silverman -- Advanced Topics in the Arithmetic of Elliptic Curves
Springer-Verlag, 1994
An excellent sequel to the author's first volume on elliptic curves, but it requires the first volume or some other introductory material as a prerequisite. Topics covered include modular forms and complex multiplication. The treatment is generally abstract, with much use of ideas from algebraic geometry.
Anthony W. Knapp -- Elliptic Curves
Princeton University Press, 1992
This is one of the best general introductions to elliptic curves. It is well-organized, develops the subject from a relatively elementary starting point, yet covers a great deal of material. It's also good for understanding the role that elliptic curves play in the Langlands program.
Joseph H. Silverman; John Tate -- Rational Points on Elliptic Curves
Springer-Verlag, 1992
This book is relatively brief, aimed specifically at an undergraduate audience, and based (not surprisingly) on Tate's popular 1961 lectures on the subject, with additions by Silverman. Definitely a good way to get into the subject.
J. W. S. Cassels -- Lectures on Elliptic Curves
Cambridge University Press, 1991
As a short lecture notes volume, this book presents a useful broad outline of the subject, but has to skip many details. Important topics that are covered include Galois cohomology, the Tate-Shafarevich group, and factorization using elliptic curves.
Dale Husemoller -- Elliptic Curves
Springer-Verlag, 1987
Excellent textbook-style exposition of the basic theory of the algebra, arithmetic, and analysis of elliptic curves. It is based, in part, on celebrated lectures by John Tate on the subject.
Joseph H. Silverman -- The Arithmetic of Elliptic Curves
Springer-Verlag, 1986
This volume is essentially the first of a two-volume sequence. As such, it gives a fine treatment of many topics connected with elliptic curves, with special emphasis on arithmetic and computation. But it stops short of much discussion of modular forms and other more "advanced" topics.
K. Chandrasekharan -- Elliptic Functions
Springer-Verlag, 1985
The subject here is elliptic and modular functions, with special emphasis on specific functions, but nothing on elliptic curves. The book does cover interesting number theoretic applications, such as quadratic reciprocity and the representations of numbers by sums of squares and by quadratic forms.
Neal Koblitz -- Introduction to Elliptic Curves and Modular Forms
Springer-Verlag, 1984
One of the better choices for an introduction to the subject. It is relatively brief and emphasizes the simpler cases rather than the greatest generality. The "congruent number problem" is used as motivation for much of the presentation.
Jean-Pierre Serre -- A Course in Arithmetic
Springer-Verlag, 1973
Serre is a master of the subject, as well as of much of modern mathematics. This presentation is very elegant and very terse. In a short space it also covers a great deal of number theory in addition to elliptic curves and modular forms.


Copyright © 2002 by Charles Daney, All Rights Reserved