Prerequisites: Algebraic number theory -- The Riemann hypothesis
See also: Number theory -- Algebraic geometry -- The Langlands program
In the case of elliptic curves, the commerce is in such things as ideas, concepts, problems, and theorems rather than raw materials and finished products, but the pattern is the same. An impressive number of ideas and problems from other parts of mathematics turn up when considering even seemingly simple questions about elliptic curves. And many techniques, methods, and results arising out of the study of elliptic curves have been generalized, extended, and exported to diverse other areas of mathematics, sometimes with astonishing consequences.
The famous Last Theorem of Fermat, for example, was finally proven as a mere corollary to a deep, difficult, but beautiful theorem about elliptic curves.
Elliptic curves can be thought of in many different ways, but perhaps the simplest and most intuitive is in terms of plane curves. Everyone is familiar with plane curves such as ellipses and parabolas. These can be defined using a polynomial equation in two unknowns of the form F(x,y) = 0, where all terms in the equation have degree two or less. Such curves are nothing but the conic sections (including also circles and hyperbolas) that were extensively studied by the ancient Greeks.
If you take the same kind of equation and allow terms of degree three or less (x^{3} or x^{2}y, for example), then the resulting curves are called elliptic curves. The name is confusing, as ellipses themselves are conic sections, not elliptic curves. The latter are only indirectly related to ellipses: equations of this type first arose in connection with formulas for the arc length of an ellipse. Nevertheless, the nomenclature is firmly established, and we're stuck with it.
Elliptic curves are far more interesting than conic sections, as will be apparent from the discussion here. Historically, they have played an important role in such diverse mathematical subjects as number theory, complex analysis and Riemann surface theory, and algebraic geometry, and the theory of elliptic curves has substantial connections to many other active areas of mathematics.
Of course, all of this inevitably sounds rather vague and nebulous unless and until you know what an elliptic curve is in the first place. So, what is it?
In one sense, an elliptic curve is a type of curve that is just one step more complicated than an ellipse. The simplest of all "curves" are straight lines, each of which is the locus of points (i. e., x-y pairs) where x and y are related by an equation like this:
y = Ax + B

(For the immediate purposes here of talking about equations, capital letters will denote constants.)
The next step up in complexity would involve equations like this:
y = Ax^{2} + Bx + C

where the right hand side has a quadratic or "second-degree" polynomial in x. The curve which is the locus of points satisfying such an equation is a parabola. We could continue to use polynomials of higher degree on the right to produce cubic curves, quartic curves, etc. But in some sense these are not really more complicated than a parabola.
To get something that might be more complicated, we can consider equations where y also occurs as a square, such as
y^{2} = Ax^{2} + Bx + C

This equation represents an ellipse (at least when A < 0; when A > 0 it gives a hyperbola instead). It would still be an ellipse if there were also a first-degree term in y (because that could be eliminated by completing the square, i. e. a simple change of variables). As you probably recall, both parabolas and ellipses are examples of curves known as "conic sections", because they can be obtained by cutting a cone in a suitable way. If you cut the cone parallel to a side, you get a parabola. If you cut it so as to obtain a closed curve for the cut, you get an ellipse (a special case of which is a circle, if the cut is perpendicular to the axis of the cone). Any other sort of cut yields another conic section called a hyperbola. Conic sections were studied by the Greeks -- which was a respectable achievement, since they had none of our algebraic techniques for representing curves.
Once y occurs to degree 2 in the equation of a curve, very interesting things start to happen as you raise the degree of the polynomial in x on the right side of the equation. Now something definitely new appears when the polynomial in x has degree 3 or more. In particular...
An elliptic curve (in one, narrow, sense) is the locus of points in the x-y plane that satisfy an algebraic equation of the form
y^{2} = Ax^{3} + Bx^{2} + Cx + D

(with an additional technical condition to avoid "degenerate" cases -- specifically, the roots of the polynomial on the right should be distinct). Such equations would seem to be somewhat less general in form than a polynomial identity such as F(x,y) = 0 where mixed terms like x^{2}y occur. However, in an important sense, any curve defined by such an equation is equivalent to one defined by the special form given above, in fact to a curve with an even more restricted form (as explained later).
We'll see soon enough that an elliptic curve is much more interesting and complicated than any of the conic sections, but it's not easy to explain in an elementary way just why this is the case. It has to do, of course, with the fact that y appears in the equation to the second degree. The variable y is still in some sense a function of x, where y can be called the "dependent" variable and x the "independent" variable. And yet obviously the relationship is more complicated, because in order to get a value of y from any given value of x one has to take a square root. This introduces more complexity because taking square roots isn't a single-valued operation -- there are two square roots (one positive, one negative) of any positive number. This fact results in far more complexity than you might suppose. Dealing with such multiple-valued functional relationships leads directly to rather deep mathematics -- the theory of "Riemann surfaces".
Mathematicians say that in such a circumstance, where variables x and y are related by a polynomial equation in which both appear to degree higher than 1, y is an "algebraic" function of x (and x is an algebraic function of y). Algebraic functions began to be studied rigorously in the 19th century, and quickly yielded some very beautiful mathematics, including the theory of Riemann surfaces -- and elliptic curves.
Finding "all" solutions of polynomial equations of this kind is not in the least a trivial matter when higher degree polynomials are involved. The theory of how to find, or at least describe, such solutions, where one or more polynomial equations are involved, has grown into the large and extremely complicated area of modern mathematics known as "algebraic geometry". The theory is so-named because it generalizes and applies sophisticated algebraic techniques to the study of geometric curves and surfaces such as conic sections. One reason the theory of elliptic curves is so appealing is that such curves represent the simplest case which is still difficult enough (in spades!) to be really interesting.
So let's get back to the story at hand.
There is another level of subtlety here. So far we have been deliberately vague as to what sort of values x and y represent. In the simplest case, they are elements of the field of real numbers, denoted by R. This is what one learns to deal with in high school when graphs of curves (such as conic sections) in the Cartesian x-y plane are discussed. In this case, the coefficients of the equation (A, B, C, D) are assumed to be real numbers as well.
However, the field R is only one of a number of fields we could consider in which the equation of an elliptic curve makes sense. Abstractly, a field is defined as a set of objects which has two related group structures. Intuitively, it is fine to think of these structures as the operations of addition and multiplication which are familiar in the field R.
In a little more detail, to say that an operation corresponds to a group structure is to say that it satisfies a few axioms. Using "+" to denote the operation of "addition" and "×" to represent the operation of "multiplication", the axioms require that the operations be associative, i. e. A+(B+C) = (A+B)+C, and A×(B×C) = (A×B)×C. There must be an identity element for each operation. For the + operation in a field, the identity is usually symbolized by 0: A+0 = 0+A = A. The identity for the × operation is symbolized by 1: A×1 = 1×A = A. Every element must have an inverse for each operation, with one exception: the additive identity 0 has no multiplicative inverse. "-A" is the additive inverse of A: A + (-A) = A - A = 0. "1/A" or "A^{-1}" is the multiplicative inverse: A×A^{-1} = A/A = 1. (If A = 0, its multiplicative inverse is not defined, but 0 × A = 0 for any A.) In a field, both operations must be "commutative": A + B = B + A and A × B = B × A. (I. e., the groups are "Abelian".) And finally, it is required that × be "distributive" with respect to +: A×(B+C) = A×B+A×C. A field, then, is just a system of objects (that may be thought of as "numbers") which satisfy the familiar laws of arithmetic.
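Since all of these axioms are simple equations, they can be checked mechanically for a small number system. Here is a brute-force sketch verifying that the integers modulo the prime 7 satisfy every field axiom (the choice of 7 is arbitrary; finite fields of this kind will turn up again later):

```python
# Brute-force check that Z/7Z satisfies the field axioms.
p = 7
F = range(p)                      # the elements 0, 1, ..., p-1

def add(a, b): return (a + b) % p
def mul(a, b): return (a * b) % p

for a in F:
    # Identities and inverses
    assert add(a, 0) == a and mul(a, 1) == a
    assert add(a, (-a) % p) == 0              # additive inverse
    if a != 0:
        inv = pow(a, p - 2, p)                # Fermat: a^(p-2) is 1/a mod p
        assert mul(a, inv) == 1               # multiplicative inverse
    for b in F:
        # Commutativity
        assert add(a, b) == add(b, a) and mul(a, b) == mul(b, a)
        for c in F:
            # Associativity and distributivity
            assert add(a, add(b, c)) == add(add(a, b), c)
            assert mul(a, mul(b, c)) == mul(mul(a, b), c)
            assert mul(a, add(b, c)) == add(mul(a, b), mul(a, c))

print("Z/7Z satisfies the field axioms")
```

The same check succeeds for any prime modulus; for a composite modulus it fails at the multiplicative-inverse step, which is exactly why only prime moduli give fields.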
Besides the field R of real numbers, there are several other fields which are important in general, and for the theory of elliptic curves in particular, such as the field Q of rational numbers, the field C of complex numbers, and the finite fields F_{p} of integers modulo a prime p.
So far, however, we have written the equation of an elliptic curve as

y^{2} = f(x) = Ax^{3} + Bx^{2} + Cx + D

without specifying a field K which is to contain the solutions of that equation. This field K is called the "field of definition" of the curve. "Points" on the curve are then pairs of numbers (x,y) with both x and y members of K. In addition, it is assumed that the coefficients A, B, C, D are also members of K. Symbolically, f(x) ∈ K[x], where K[x] is the set of polynomials having coefficients in K.
Nevertheless, it is often convenient to leave the field of definition K unspecified. This is because when the defining polynomial is in K[x], it is also in K′[x] for fields K′ with K ⊆ K′. And if there is a solution (x,y) with x and y ∈ K (or symbolically, (x,y) ∈ K×K), then (x,y) ∈ K′×K′ also. So a point (x,y) on a curve defined over K is on the curve defined over K′ ⊇ K as well. Many things that can be said about a specific curve really depend only on the equation and not on the field of definition. Furthermore, to study the properties of "the" curve defined by a particular equation, it is often useful to study the solutions of the equation in each of the fields mentioned above.
As a matter of notation, an elliptic curve corresponding to a particular equation is often denoted simply as "E". When we are concerned with points on the curve that have coordinates in a specific field K, we write E(K) for that set of points.
In some sense, C is the most "natural" field over which to define an elliptic curve because it has the property of being "algebraically closed". This means that every polynomial f(x) ∈ C[x] has the maximum number of possible roots (which is the same as the "degree" of the polynomial) lying in C rather than in some extension of C. Therefore, the resulting curve, E(C), which is a set of pairs (x,y) of numbers of the field, is as complete as possible.
However, some of the most interesting questions about an elliptic curve arise from considering the curve as defined over a smaller field, especially R or Q. In particular, the points E(R) can easily be graphed on a Cartesian coordinate system when the curve is considered to be defined over R. And some of the most interesting mathematics results from trying to determine the set of "rational points on the curve", i. e. E(Q), viewing the curve as being defined over Q.
So an elliptic curve is an object that is easily definable with simple high school algebra. Its amazing fruitfulness as an object of investigation may well depend on this simplicity, which makes possible the study of a number of much more sophisticated mathematical objects that can be defined in terms of elliptic curves.
We should note that there are other ways to define the notion of "elliptic curve". They can be proven to be (more or less) equivalent to the one we have used. In order to even give such definitions, however, it's necessary to use concepts and terminology from algebraic geometry. There one first defines the notion of "curve" in general (as a "variety" of "dimension" one). Next, one appeals to notions of topology to define the concept of "genus" -- roughly, the number of "holes" in the curve when considered as a Riemann surface. Finally, one defines an elliptic curve as a curve of genus 1 (together with a distinguished point on it). Given this, it can be shown that curves of this sort correspond to a cubic equation such as we used, and conversely. This is interesting to know if/when you get into algebraic geometry, but requires dealing with more unfamiliar concepts than necessary to begin with.
Saying that an equation is "Diophantine" is not actually saying something about the equation, but rather about the type of solution one is looking for. With a Diophantine equation (or system of equations), what is sought isn't the set of all solutions in real or complex numbers, but rather the set of solutions in rational numbers or integers. Mathematicians have been interested in this special case for over 2000 years. The term "Diophantine" itself goes back to Diophantus of Alexandria, who was the leading Greek mathematician of his time, about 250 CE, and who contributed much to the study of equations that came to be named after him. Diophantus' writings were lost for over 1000 years, but when they were rediscovered in 1570, it was found that he had originated the concept of negative numbers and techniques for solving algebraic equations in general.
However, the study of Diophantine equations didn't begin with Diophantus. For example, the sides of a right triangle satisfy the Pythagorean equation: A^{2} + B^{2} = C^{2}. This was known to the Egyptians, and to the Babylonians before that. It was useful to them to know of integer solutions to this equation -- triples of integers (A,B,C) -- because they could be used to mark off three sections of a long cord, and this in turn could be used to construct accurate right angles for a building. It was discovered that there are infinitely many such possible triples (which became known as Pythagorean triples), and new solutions could be constructed from known solutions.
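The classical recipe for producing such triples can be stated in one line: for integers m > n > 0, the triple (m^{2} - n^{2}, 2mn, m^{2} + n^{2}) always satisfies the Pythagorean equation. A quick computational check:

```python
# Euclid's parametrization of Pythagorean triples: for m > n > 0,
# (m^2 - n^2, 2mn, m^2 + n^2) satisfies A^2 + B^2 = C^2.
triples = []
for m in range(2, 6):
    for n in range(1, m):
        A, B, C = m*m - n*n, 2*m*n, m*m + n*n
        assert A*A + B*B == C*C          # always a Pythagorean triple
        triples.append((A, B, C))

print(triples[0])   # (3, 4, 5), from m = 2, n = 1
```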
There is another geometric problem which was considered in antiquity, and in this case it leads to a Diophantine equation which is cubic and defines an elliptic curve. The problem is to determine, for an integer n, whether there are any right triangles that have an area equal to n and sides which are rational numbers -- and if there are such triangles, to determine all of them. It turns out that this leads to this Diophantine equation:
y^{2} = x^{3} - n^{2}x

An integer n is said to be a "congruent" number if and only if a right triangle with rational sides and area n exists. It is known that certain integers are congruent (5, 6, 7, for example), while others are not (1, 2, 3, and 4). But a complete solution to the problem of giving necessary and sufficient conditions for n to be a congruent number is still lacking. It is known that n is a congruent number if and only if there are infinitely many rational solutions to the indicated equation, which amounts to the corresponding elliptic curve having infinitely many rational points. But deciding this last question is itself very difficult and still open in general, and it is intimately connected with a very sophisticated conjecture known as the Birch and Swinnerton-Dyer conjecture. We'll eventually explain that conjecture in more detail.
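As a concrete illustration (the numbers here are classical examples, not derived in the text): for n = 5, the right triangle with rational sides 3/2, 20/3 and hypotenuse 41/6 has area exactly 5, and the curve y^{2} = x^{3} - 25x carries the rational point (-4, 6). Both facts can be checked with exact rational arithmetic:

```python
# n = 5 is a congruent number: a rational right triangle with area 5,
# and a corresponding rational point on y^2 = x^3 - 25x.
from fractions import Fraction

s1, s2, hyp = Fraction(3, 2), Fraction(20, 3), Fraction(41, 6)
assert s1**2 + s2**2 == hyp**2        # a genuine right triangle
assert s1 * s2 / 2 == 5               # with area n = 5

n = 5
x, y = Fraction(-4), Fraction(6)
assert y**2 == x**3 - n**2 * x        # a rational point on the curve
print("n = 5 is congruent")
```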
Just as with the Pythagorean equation, if our cubic equation has any rational solution at all, it will have infinitely many. This is because, given one solution, there is a procedure for generating another, and then a third, and so on. What this procedure boils down to is a way of "adding" two rational points lying on the curve to obtain another. Sometimes this procedure will terminate after a finite number of steps, because it yields a solution which has already been found. Other times the procedure can go on to generate infinitely many different solutions.
Carl G. J. Jacobi (1804-51) was the first to recognize that this procedure for "adding" two points amounts to specifying a group structure on any elliptic curve. The existence of this group structure on the set E(K) of points of the curve is one of the most important facts about elliptic curves. It makes the theory amazingly rich. This group structure is just a way of "adding" two points on the curve to produce a third point that is also on the curve, and to do this in such a way that the standard group axioms are satisfied. The term "addition" is reasonable for this operation, because it is commutative. And all this is true more or less independently of what field K the curve is defined over. (There's a minor exception in the case of fields that have "characteristic 2", i. e. fields where x+x = 0 for all x ∈ K.)
There are several different ways to define this group structure. (The fact that there are almost always several ways to look at the same thing in the theory of elliptic curves is one of the most intriguing, or most confusing, things about it, depending on your point of view.)
The approach to defining the group structure which is most transparent is to regard the curve as a geometric plane curve in the Cartesian plane R^{2}. It is a simple fact that in this case a straight line intersects a cubic curve in either one point or three (counting multiplicities) -- this is where the fact that the curve is cubic in x is crucial. (The proof is to substitute the equation of a line, y = mx + b, into the equation of the curve to yield a third degree polynomial in x alone, which has either one or three real roots.) So suppose you have any two distinct points P and Q on the curve. A line through those two points must then intersect the curve at a third point, say R = (x_{R},y_{R}). R itself is not the sum of P and Q, but instead R′ = (x_{R},-y_{R}) is. (If any point (x,y) ∈ E(R), then (x,-y) ∈ E(R) also.) This choice for the sum of points makes the set E(R) into a group under the + operation.
How does this work if you want to add a point to itself to get P + P? In that case, you take the tangent to the curve at P. This will also intersect the curve at another point. Here it is important that the curve have a well-defined tangent at every point. This is guaranteed if f(x) in the equation y^{2} = f(x) has no repeated roots, in which case the curve is said to be "non-singular" -- and this is one of the requirements for a curve to be an elliptic curve.
Perhaps you can see one other difficulty with this definition if you draw a few elliptic curves. If P = (x,y) and Q = (x,-y) are on the curve, then the line joining them will be vertical. This will not appear to intersect the curve at any other point. In this case the point of intersection is defined as the "point at infinity", designated by O. This may seem like cheating, but it can be handled rigorously by using the notion of "projective coordinates" and working in the "projective plane" instead of R^{2}. Indeed, working with a projective plane instead of the "ordinary" plane (sometimes called the "affine" plane) is standard operating procedure in algebraic geometry, because it automatically handles all the awkward special cases that have to be considered in finding where two curves intersect. The precise definition of projective coordinates isn't difficult, but we won't go into it, in order not to lengthen this discussion further.
The point at infinity O is not just some ugly kluge. It is a key ingredient in making E(R) into a group, because it turns out to be the identity element. In other words, P + O = P for all points P on the curve. (This equation is the reason O is used to stand for the point at infinity.) One can then define an inverse operation so that -P is the point Q such that P + Q = O. (Q is the unique point where a vertical line through P intersects the curve, i. e. the reflection of P across the x-axis.) One sets -O = O, of course.
All of these definitions would be meaningless, of course, unless the group axioms are satisfied by the + operation. Verification of the axioms can be done using straightforward but messy algebra. Such algebra yields an expression for the coordinates of P + Q in terms of the coordinates of P and Q. This expression could have been used from the beginning as an alternative definition of the + operation, but no one would have guessed it.
However, the explicit formula for the coordinates involves only rational operations (addition, subtraction, multiplication, division -- no extraction of roots). A very important consequence is that if the points P and Q have rational coordinates, then so does P + Q. Hence + also provides a group structure for the set E(Q) of rational points on the curve. An additional consequence is that the group operation in fact makes sense over almost any field, so that E(K) has a group structure when K is, for instance, a finite field. We'll see why the finite fields are important a little later.
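The generic cases of the addition law can be sketched directly from the slope formulas. The following is only a sketch (it ignores the point at infinity and assumes the two points being added are not inverses of each other), using the curve y^{2} = x^{3} - 25x and its rational point (-4, 6) as a test case; note that all of the arithmetic is rational, as promised:

```python
# Chord-and-tangent addition on y^2 = x^3 + a*x + b in exact rational
# arithmetic.  Only the generic cases are handled: P != -Q, and for
# doubling, y != 0.  The point at infinity O is ignored in this sketch.
from fractions import Fraction

def ec_add(P, Q, a):
    (x1, y1), (x2, y2) = P, Q
    if P == Q:
        lam = (3 * x1**2 + a) / (2 * y1)    # slope of the tangent at P
    else:
        lam = (y2 - y1) / (x2 - x1)         # slope of the chord through P, Q
    x3 = lam**2 - x1 - x2                   # x of the third intersection
    y3 = lam * (x1 - x3) - y1               # ... reflected across the x-axis
    return (x3, y3)

# Test curve y^2 = x^3 - 25x, with the rational point P = (-4, 6)
a, b = Fraction(-25), Fraction(0)
P = (Fraction(-4), Fraction(6))
on_curve = lambda pt: pt[1]**2 == pt[0]**3 + a * pt[0] + b

Q = ec_add(P, P, a)      # 2P = (1681/144, ...) -- coordinates still rational
R = ec_add(P, Q, a)      # 3P
assert on_curve(P) and on_curve(Q) and on_curve(R)
print("2P =", Q)
```

Notice that the slope formulas involve only the coordinates and the coefficient a; the coefficient b enters only through the requirement that the points lie on the curve.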
Let's now return to the discussion of Diophantine equations. Suppose we have a polynomial equation in two variables with rational coefficients, such as F(x,y) = 0. There are four questions we might ask about it:

1. Does the equation have any solutions in integers?
2. If so, are there finitely or infinitely many integer solutions?
3. Does the equation have any solutions in rational numbers?
4. If so, are there finitely or infinitely many rational solutions?
For equations of degree 3 (elliptic curves, basically), the first results were found by Axel Thue in 1908: certain such equations can have only a finite number of integer solutions. Thue used a technique called "Diophantine approximation", and his results were generalized in the 1920s by Carl L. Siegel to show that all third degree equations have at most a finite number of integer solutions. For equations of degree greater than 3, Louis J. Mordell made a famous conjecture in 1922 that there could likewise be at most a finite number of rational solutions. This case was much harder, though, and it wasn't proven until 1983, by Gerd Faltings.
Question 1, about the existence of any integer solutions, is extremely hard, and is still an open question in general.
What about rational solutions? This question is even more difficult in general. If the degree of the equation is higher than three, little is known. If the degree is exactly three, we have essentially an elliptic curve, and trying to answer questions 3 and 4 is presently where most of the theoretical action is. Mordell gave a good partial answer in 1923 (based on a conjecture of Henri Poincaré in 1901), known as Mordell's Theorem. This result states that the group E(Q) of rational points on an elliptic curve is "finitely generated". This means that, if there are any rational solutions, then they can all be determined from a certain finite subset of them.
Unfortunately, there are two things that Mordell's result does not do. First, it provides no way to tell whether any rational points exist (other than the "point at infinity"). Second, it does not provide an "effective" means (i. e. an algorithm) for finding a set of generators for the group of rational points. In some cases Mordell's methods are able to do this. And it has been conjectured, but not yet proven, that the methods will work in all cases.
There is a general theorem about finitely generated abelian groups such as E(Q). It states that any finitely generated abelian group is the "direct sum" of the subgroup consisting of elements of finite order and zero or more copies of the additive group Z of integers. Symbolically, for any such group G:
G ≅ G_{t} ⊕ Z^{r}

In this formula, "≅" means "is isomorphic to". (Isomorphic groups have identical structure and are indistinguishable as groups.) G_{t} is the "torsion subgroup" of G that consists of all elements of G that have finite order (i. e. for such an element g ∈ G, there is a positive integer n such that ng = 0). Z^{r} is the infinite group that is the direct sum of r copies of the integers. You can think of this group as consisting of r-tuples of integers with addition being defined in the obvious way (componentwise).
The number r is called the "rank" of the group. If G = E(Q) is the group of rational points of an elliptic curve, r is called the rank of the curve. Determining r theoretically and in practice is currently the main problem of arithmetic elliptic curve theory. No effective method is known for determining whether a particular elliptic curve has an infinite number of rational solutions (i. e. whether or not r > 0), and it isn't even known whether there are curves with arbitrarily large rank, though a bound isn't considered likely to exist. There isn't any good algorithm for calculating r in particular cases, either. The Birch and Swinnerton-Dyer conjecture is all about the number r.
As it happens, much more is known about the torsion part of the group E(Q), denoted by E(Q)_{t}. A theorem due to Elisabeth Lutz and Trygve Nagell in the 1930s showed how to compute E(Q)_{t} in any particular case. The theorem says that if E is the curve y^{2} = x^{3}+ax+b with a and b in Z, and if (x,y) ∈ E(Q)_{t} then x and y are integers and either y=0 or y^{2} divides 4a^{3}+27b^{2}. Hence there are only finitely many possible pairs (x,y) to test for membership in the torsion subgroup, and each test can be done in a finite number of steps.
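Here is a sketch of the Lutz-Nagell recipe in action for the curve y^{2} = x^{3} + 1 (so a = 0, b = 1, and 4a^{3} + 27b^{2} = 27). Candidates are filtered by the theorem, then each candidate is tested for finite order by computing its multiples (by Mazur's theorem, discussed next, an order bound of 12 suffices); the search window for x is ad hoc but adequate for this small example:

```python
# Lutz-Nagell torsion computation for y^2 = x^3 + a*x + b with a = 0, b = 1.
from fractions import Fraction

a, b = 0, 1                     # the curve y^2 = x^3 + 1

def ec_add(P, Q):
    """Group law on the curve; None stands for the point at infinity O."""
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return None                              # P + (-P) = O
    if P == Q:
        lam = Fraction(3 * x1**2 + a, 2 * y1)    # tangent slope
    else:
        lam = Fraction(y2 - y1, x2 - x1)         # chord slope
    x3 = lam**2 - x1 - x2
    return (x3, lam * (x1 - x3) - y1)

# Candidates allowed by Lutz-Nagell: integer (x, y) with y = 0 or y^2 | D
D = 4 * a**3 + 27 * b**2
ys = {0} | {y for y in range(1, D + 1) if D % (y * y) == 0}
candidates = {(x, s * y) for y in ys for s in (1, -1)
              for x in range(-D, D + 1) if x**3 + a * x + b == y * y}

# Keep the candidates of finite order (order <= 12, by Mazur's theorem)
torsion = [None]                                 # O is always torsion
for P in candidates:
    Q = P                                        # Q runs through P, 2P, 3P, ...
    for k in range(1, 13):
        if Q is None:                            # kP = O: finite order
            torsion.append(P)
            break
        Q = ec_add(Q, P)

print(len(torsion), "torsion points")   # 6: this curve has torsion Z/6Z
```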
In 1976 Barry Mazur proved that only 15 possible groups can occur as E(Q)_{t} for any elliptic curve. They are all very small groups, namely
Z/mZ for m = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12

or
Z/mZ ⊕ Z/2Z for m = 2, 4, 6, 8

Here Z/mZ denotes the cyclic group of order m, which is isomorphic to the set of integers modulo m under addition. (We explain this a little further in the next section.) Examples are known for which each of the possible groups actually occurs.
Suppose the curve is defined by the equation F(x,y) = 0, where F(x,y) is a cubic polynomial with rational coefficients. If m is any nonzero integer, F(x,y) = 0 if and only if mF(x,y) = 0, so the solutions of the equation are unchanged if we multiply by a suitable integer to clear the denominators of all coefficients. So without loss of generality we may suppose that F(x,y) has integer coefficients. If there is a rational solution (x,y) ∈ Q^{2}, then obviously there is a solution in R^{2} as well.
A pervasive concept in number theory is the use of "modular" arithmetic. That is, one frequently considers the value of an integer m "modulo" a positive integer n. The value of m modulo n is simply the remainder of m on division by n. One writes m ≡ a (mod n) if a is the remainder, i. e. if m = bn + a for some integers a and b, where 0 ≤ a < n -- or more generally, if m - a is divisible by n. The set of possible remainders of integers modulo n is denoted by Z/nZ, and it has a group structure which results from performing addition in Z and then taking remainders modulo n. (In fact, it has a "ring" structure as well, if multiplication is handled the same way.)
Given that, if F(x,y) has integer coefficients, we can reduce them modulo n for any positive n ∈ Z, and any equation becomes an equivalence modulo n. Hence if F(x,y) = 0, then also F(x,y) ≡ 0 (mod n) for any n. Suppose that (a,b) is a rational solution of F(x,y) = 0. As long as p is a prime number that doesn't divide the denominator of either a or b, we can easily make a corresponding solution of F(x,y) ≡ 0 (mod p), and in fact of F(x,y) ≡ 0 (mod p^{m}) for any m > 0. So the existence of rational solutions of F(x,y) = 0 implies the existence of solutions in Z/p^{m}Z for most primes p and all m > 0.
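Making the corresponding solution mod p amounts to replacing each coordinate's denominator by its multiplicative inverse modulo p. A small sketch, using as an example the rational point (-4, 6) on the curve y^{2} = x^{3} - 25x (an arbitrary choice of curve and point) and the prime p = 7:

```python
# Reducing a rational point modulo p (p must not divide either denominator).
from fractions import Fraction

def reduce_mod(r, p):
    """Image of the rational number r in Z/pZ."""
    return r.numerator * pow(r.denominator, -1, p) % p

p = 7
x, y = Fraction(-4), Fraction(6)       # a point on y^2 = x^3 - 25x
xp, yp = reduce_mod(x, p), reduce_mod(y, p)

# The congruence y^2 = x^3 - 25x (mod 7) still holds for the reduced point
assert yp**2 % p == (xp**3 - 25 * xp) % p
print((xp, yp))   # (3, 6)
```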
Legendre's theorem alluded to previously says that for quadratic polynomials, these conditions are not only necessary but also sufficient for the existence of rational solutions of F(x,y) = 0. Unfortunately, this isn't the case for polynomials of degree higher than 2. There is no simple analogue of Legendre's theorem for cubic (elliptic) or higher degree curves. Nevertheless, in some sense it seems that there ought to be conditions involving solvability of an equation in R and suitable algebraic properties of the equation for each prime number p (and its powers) that guarantee a solution in Q. This somewhat vague idea is called the "Hasse principle". It is named after Helmut Hasse, whose name will come up frequently in further discussions of elliptic curves. There is, in fact, a generalization of Legendre's theorem -- the Hasse-Minkowski theorem, proven by Hermann Minkowski over Q and extended by Hasse to number fields -- which says that a homogeneous (all terms have the same degree) quadratic polynomial in n variables has a solution in Q if and only if it does in R and in the "p-adic numbers" for all primes p. (p-adic numbers will be explained a little later.)
So we have, at least, a clue that we should examine solvability of the equation for an elliptic curve when the equation is reduced modulo p, for primes p. We have to be a little careful about what primes we work with. If the cubic equation is in the form
y^{2} = f(x) = x^{3} + Ax^{2} + Bx + C

(which can always be arranged without significantly changing the curve), then it is important that the polynomial have distinct roots. When this is so, the curve is said to be "nonsingular". This is a technical condition which is necessary in order for the group structure to be defined. If the roots of f(x) are α_{1}, α_{2}, and α_{3}, then we can define a quantity
Δ = [(α_{1} - α_{2}) (α_{1} - α_{3}) (α_{2} - α_{3})]^{2}

called the "discriminant" of f(x), and hence the discriminant of the curve. Δ will be an integer if the coefficients of f(x) are. Clearly, we will have distinct roots if and only if Δ ≠ 0. Therefore, we have to exclude primes p that divide Δ. The prime p = 2 is also a problem, and has to be excluded.
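Since Δ is a symmetric function of the roots, it can also be written directly in terms of the coefficients: for a monic cubic x^{3} + Ax^{2} + Bx + C one has Δ = 18ABC - 4A^{3}C + A^{2}B^{2} - 4B^{3} - 27C^{2}. A quick numerical check of this formula against the definition, using a cubic with known roots:

```python
# Check the coefficient formula for the discriminant against the
# definition, using x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3).
A, B, C = -6, 11, -6
a1, a2, a3 = 1, 2, 3                   # the known roots

# Definition: squared product of the differences of the roots
delta_from_roots = ((a1 - a2) * (a1 - a3) * (a2 - a3))**2

# Formula in the coefficients of the monic cubic x^3 + Ax^2 + Bx + C
# (for A = 0 it reduces to -4B^3 - 27C^2, i.e. -(4a^3 + 27b^2) for x^3 + ax + b)
delta_from_coeffs = 18*A*B*C - 4*A**3*C + A**2*B**2 - 4*B**3 - 27*C**2

assert delta_from_roots == delta_from_coeffs == 4
print(delta_from_coeffs)   # 4
```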
We are interested, then, in how many distinct points (x,y) satisfy the equation of the curve for x and y in Z/pZ. Now, these numbers modulo p actually make up a finite field which we earlier denoted by F_{p}, and this field has p elements. There are at most two values of y corresponding to each x. (Even in a finite field, an equation like y^{2} = C has at most two solutions.) Hence there are at most 2p ordinary solutions of the equation. Since it is also necessary to count one "point at infinity" as a solution, there can be at most 2p + 1 total solutions. Since the point at infinity is a solution, there is at least one solution. If N(p) stands for the number of solutions, then it satisfies
1 ≤ N(p) ≤ 2p + 1

This double inequality can also be written as
|p + 1 - N(p)| ≤ p

What this says is that N(p) ranges over values centered at p + 1 and no more than a distance of p from p + 1. Actually, N(p) must be a lot closer to p + 1 (for large p), and in fact
|p + 1 - N(p)| ≤ 2√p

This was conjectured by Emil Artin, and proven by Hasse in the 1930s. This says N(p) is "approximately" p + 1, and that makes sense, because in a finite field like F_{p} (with p odd), half of the nonzero elements are perfect squares. (The elements of F_{p} except for 0 form a cyclic group under multiplication.) Therefore, one expects that about half of the values of f(x) are perfect squares as x ranges over F_{p}, each contributing two points to the count.
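Hasse's bound is easy to test numerically. The sketch below counts N(p) for the curve y^{2} = x^{3} - 25x (whose discriminant involves only the primes 2 and 5, so the odd primes used here are primes of good reduction) and checks the bound:

```python
# Count N(p) for y^2 = x^3 - 25x over F_p and check Hasse's bound.
def N(p, a=-25, b=0):
    squares = {y * y % p for y in range(p)}
    count = 1                            # the point at infinity
    for x in range(p):
        fx = (x**3 + a * x + b) % p
        if fx == 0:
            count += 1                   # a single point, with y = 0
        elif fx in squares:
            count += 2                   # two points, with y and -y
    return count

for p in (7, 11, 13, 17, 19, 23):        # odd primes of good reduction
    ap = p + 1 - N(p)
    assert ap * ap <= 4 * p              # Hasse: |p + 1 - N(p)| <= 2*sqrt(p)
    print(p, N(p), ap)
```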
Guided by the Hasse principle, we might guess that if for many primes p there are a relatively large number of solutions in F_{p}, then there should be many rational points on the curve, which would mean a large value for the rank r. In other words, we should expect that if N(p) is large (relative to p) for many p, then r is also large.
Of course, so far this is extremely imprecise, so we need a way to make something that we can actually compute with. For each prime p, N(p) is the piece of data that is of interest. However, the above inequalities show that we have a somewhat better understanding about how the numbers
a_{p} = p + 1 - N(p)

behave for large p. These numbers represent how far N(p) is from its central value p + 1.
It turns out that if one defines a complex function of a complex variable as a certain infinite product, then the numbers a_{p} appear as coefficients in the Dirichlet series expansion of the function. (You may want to refer to the overview of the Riemann hypothesis for a much more detailed discussion of "Euler products" and "Dirichlet series".)
The function in question is this:
L(E,s) = ∏_{p∤Δ(E)} (1 - a_{p}p^{-s} + p^{1-2s})^{-1} ∏_{p|Δ(E)} (1 - a_{p}p^{-s})^{-1}

Here, Δ(E) is the discriminant of the curve E. The notation p|Δ(E) means that p divides the discriminant and p∤Δ(E) means p doesn't divide the discriminant. We recall that if p|Δ(E) then the curve is "singular" and the set E(F_{p}) doesn't have a group structure, although the number of its elements, N(p), is still well-defined, and hence so is a_{p}. The curve is said to have "bad reduction" for such primes, and "good reduction" for all other primes.
L(E,s) is called the "L-function" of the elliptic curve E (over the field Q). More specifically, it is called the Hasse L-function, after Hasse. Calling it an L-function suggests it has a lot in common with the Dirichlet and Dedekind L-functions that occur in the theory of prime numbers of Q and prime ideals of finite extension fields of Q, as presented in connection with the Riemann hypothesis. By definition, it does have an "Euler product". And indeed, L(E,s) has an analytic continuation to all complex s and satisfies a functional equation. But this is a very deep result, which had long been conjectured yet was established only within the last few years.
However, it can be shown fairly easily from the definitions that L(E,s) has a Dirichlet series expansion:
L(E,s) = ∑_{n≥1} c_{n}/n^{s}

whose coefficients satisfy c_{p} = a_{p} for prime p. This expansion is valid when the Euler product converges, which is for Re(s) > 3/2, by virtue of Hasse's bound on the size of a_{p}.
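Concretely, the c_{n} can be generated from the a_{p} by expanding each Euler factor as a power series in p^{-s} (for a good-reduction factor the coefficients b_{k} satisfy b_{0} = 1, b_{1} = a_{p}, b_{k} = a_{p}b_{k-1} - p·b_{k-2}) and multiplying the factors together. A sketch, with hypothetical a_{p} values supplied as input and good reduction assumed at every prime listed:

```python
# Expand the Euler product of L(E,s) into a Dirichlet series sum c_n / n^s.
# Each good-reduction factor (1 - a_p p^{-s} + p^{1-2s})^{-1} expands as
# sum_k b_k p^{-ks} with b_0 = 1, b_1 = a_p, b_k = a_p*b_{k-1} - p*b_{k-2}.
def dirichlet_coeffs(ap, N):
    """c_1..c_N from prime data ap = {p: a_p}; good reduction assumed."""
    c = [0] * (N + 1)
    c[1] = 1
    for p, a in sorted(ap.items()):
        b = [1, a]                        # series coefficients of one factor
        while p ** len(b) <= N:
            b.append(a * b[-1] - p * b[-2])
        new = [0] * (N + 1)
        for n in range(1, N + 1):         # multiply this factor into c
            if c[n] == 0:
                continue
            k, pk = 0, 1
            while n * pk <= N:
                new[n * pk] += b[k] * c[n]
                k += 1
                pk *= p
        c = new
    return c

# Hypothetical a_p values, purely for illustration; only those n all of
# whose prime factors appear in the dictionary get meaningful coefficients.
c = dirichlet_coeffs({2: -2, 3: -1, 5: 1}, 20)
print(c[1:])
```

The multiplicativity of the coefficients (c_{mn} = c_{m}c_{n} for coprime m, n) drops out of the construction automatically, exactly as it does for the Riemann zeta function.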
Z(E/K,T) = exp(∑_{1≤n<∞} |E(K_{n})|T^{n}/n)

where K_{n} denotes the extension of the finite field K of degree n, and where by definition
exp(X) = ∑_{0≤n<∞} X^{n}/n!

is itself a formal power series -- the "exponential". (The latter series is just the Taylor series of e^{X} from calculus.) What's going on here is simply that we substitute one formal series into another.
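This formal substitution is easy to carry out with exact rational arithmetic, using the identity exp(A)′ = A′·exp(A) to generate coefficients recursively. As a sanity check not tied to any particular curve: if every coefficient |E(K_{n})| were 1, the inner series would be ∑_{n≥1} T^{n}/n = -log(1-T), so its formal exponential must be the geometric series 1/(1-T):

```python
# Formal exponential of a power series with zero constant term, using exact
# rational coefficients.  A series is a list [a_0, a_1, a_2, ...].
from fractions import Fraction

def series_exp(a, terms):
    """Coefficients of exp(sum_{n>=1} a[n] T^n), truncated to `terms` terms."""
    # From E = exp(A) and E' = A'E:  n*e_n = sum_{k=1}^{n} k*a_k*e_{n-k}
    e = [Fraction(1)] + [Fraction(0)] * (terms - 1)
    for n in range(1, terms):
        e[n] = sum((k * a[k] * e[n - k] for k in range(1, n + 1) if k < len(a)),
                   Fraction(0)) / n
    return e

# Sanity check: exp(sum_{n>=1} T^n/n) = exp(-log(1-T)) = 1/(1-T) = 1 + T + T^2 + ...
a = [Fraction(0)] + [Fraction(1, n) for n in range(1, 10)]
assert series_exp(a, 10) == [Fraction(1)] * 10
```

Feeding in actual point counts |E(K_{n})| in place of the 1's produces the opening coefficients of Z(E/K,T) in exactly the same way.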
There is a famous set of conjectures made in 1949 by André Weil (another very important name in the theory of elliptic curves) that pertain when the same definitions are used with any "smooth projective variety" V instead of an elliptic curve. A smooth projective variety is simply a generalization of a curve that is defined by the solution set of a system of polynomial equations instead of just a single equation. Varieties are the basic objects studied in algebraic geometry.
The Weil conjectures are as follows:

1. (Rationality) Z(V/K,T) is a rational function of T.
2. (A "Riemann hypothesis") In the case of a curve, Z(V/K,T) = P(T)/[(1 - T)(1 - qT)], where the zeros of the polynomial P(T) all have absolute value q^{-1/2}.
3. (A functional equation) Z(V/K,1/(q^{n}T)) = ±q^{nχ/2} T^{χ} Z(V/K,T), where n and χ are the dimension and Euler characteristic of V.
The dimension of a complex variety is its dimension as a complex manifold (having points with coordinates in the complex numbers C). A curve, in particular an elliptic curve, defined by a single polynomial equation in two variables therefore has dimension 1. The Euler characteristic of an elliptic curve is a topological invariant, which happens to be 0. Therefore, for elliptic curves, the Weil conjectures take the form:
ζ_{E/K}(s) = Z(E/K,q^{-s}) = (1 - aq^{-s} + q^{1-2s}) / [(1 - q^{-s})(1 - q^{1-s})]

Then from the functional equation (3.) immediately above one has ζ_{E/K}(1-s) = ζ_{E/K}(s) -- which shows symmetry about the line Re(s) = 1/2 like Riemann's zeta function. Furthermore, from part (2.) above, we can conclude that if ζ_{E/K}(s) = 0, then |q^{s}| = √q, and hence Re(s) = 1/2.
But, for this zeta function, that is precisely what the Riemann hypothesis states -- all (nontrivial) zeros of the function lie on the line Re(s) = 1/2. Keep in mind, this is a proven result for the zeta function ζ_{E/K}(s) which we have defined as the zeta function of an elliptic curve. The similarity of relation (2.) to the hypothesis which is still unproven about the zeros of Riemann's original zeta function is why relation (2.) is also called a "Riemann hypothesis". It is also good evidence that the original Riemann hypothesis should likewise be true.
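For an elliptic curve these statements can be tested on small examples. Rationality of Z(E/F_{p},T) with numerator 1 - a_{p}T + pT² forces |E(F_{p²})| = p² + 1 - (a_{p}² - 2p), since if α and β are the reciprocal roots of the numerator, then α + β = a_{p}, αβ = p, and |E(F_{p^n})| = p^n + 1 - α^n - β^n. The sketch below (with the illustrative curve y² = x³ + x + 1 and p = 7, modeling F_{49} as F_{7}(√3) since 3 is a non-residue mod 7) checks this by brute-force counting:

```python
# Check a consequence of the Weil conjectures for the illustrative curve
# E: y^2 = x^3 + x + 1 over F_7.  With a = p + 1 - |E(F_p)|, rationality of
# Z(E/F_p,T) with numerator 1 - aT + pT^2 forces
#   |E(F_{p^2})| = p^2 + 1 - (a^2 - 2p).
p, d = 7, 3   # d = 3 is a quadratic non-residue mod 7, so F_49 = F_7[s]/(s^2 - 3)

def mul(A, B):
    """Product of u1 + v1*s and u2 + v2*s in F_{p^2}, where s^2 = d."""
    (u1, v1), (u2, v2) = A, B
    return ((u1*u2 + d*v1*v2) % p, (u1*v2 + u2*v1) % p)

def rhs(x):
    """x^3 + x + 1 evaluated in F_{p^2}."""
    x3 = mul(mul(x, x), x)
    return ((x3[0] + x[0] + 1) % p, (x3[1] + x[1]) % p)

field = [(u, v) for u in range(p) for v in range(p)]
# point counts, each including one point at infinity
N1 = 1 + sum(1 for x in range(p) for y in range(p)
             if y*y % p == (x**3 + x + 1) % p)
N2 = 1 + sum(1 for x in field for y in field if mul(y, y) == rhs(x))
a = p + 1 - N1
assert N2 == p*p + 1 - (a*a - 2*p)
print(N1, N2, a)
```

The same relation pins down |E(F_{p^n})| for every n once N1 is known, which is why the whole zeta function of the curve is determined by a single point count.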
Note that the denominator of ζ_{E/K}(s) when q = p and K = F_{p} is a polynomial that is none other than the polynomial whose reciprocal appears in the product formula for the L-function L(E,s) for primes p with "good reduction". This is the explanation for the seemingly arbitrary definition of L(E,s).
There's more. Without going into too many details, if you have a congruence mod p such as f(x_{1}, ..., x_{n}) ≡ 0 (mod p) then in addition to solutions in F_{p}, it is natural to consider solutions in extension fields F_{p^n} as well. This amounts to working in the ring F_{p}[x_{1}, ..., x_{n}]/(f), and in that ring there are maximal ideals which correspond to prime ideals of number fields. There is also a notion of "norms", like the norms of number fields.
Given all that, you can define an Euler product that we'll call ζ(s) (which is not Riemann's zeta function):
ζ(s) = ∏_{P} (1 - (NP)^{-s})^{-1}

where the product is over maximal ideals P, and NP is the norm of the maximal ideal P. Now, this norm is defined as the number of elements of the residue field of P. (A commutative ring modulo a maximal ideal is a field.) Since the residue field is a finite field of characteristic p, one can write NP = p^{deg(P)} for some integer deg(P), called the degree of P. In this notation,
ζ(s) = ∏_{P} (1 - p^{-s deg(P)})^{-1}

Now make the substitution T = p^{-s} and take logarithms to get
log(∏_{P} (1 - T^{deg(P)})^{-1}) = ∑_{1≤n<∞} N_{n}T^{n}/n

where N_{n} is the number of solutions of f(x_{1}, ..., x_{n}) = 0 with coordinates in F_{p^n}. (This evaluation makes use of the relation of deg(P) to the number of solutions of the equation.)
Finally, if you take the formal exp() of that series, you get (in the elliptic curve case) that the zeta function Z(E/K,T) as originally defined has an Euler product of the form ζ(s) = ∏_{P} (1 - (NP)^{-s})^{-1}, which looks very much like the Dedekind zeta function of an algebraic number field.
To begin with, Weil proved a generalization of Mordell's theorem, namely that the group E(K) is finitely generated, when K is any finite extension of Q. Accordingly, the group E(K) is often called the Mordell-Weil group.
Weil also defined generalizations of the Hasse L-functions L(E,s). In order to spell out this generalization for an extension K ⊇ Q we would have to talk about prime ideals in the ring of integers of K. For precision, we would have to invoke the language of algebraic number theory, which talks about such things as "places" and "valuations". The details are messy, but the basic idea is much the same as when K = Q. These more general L-functions are, naturally, called Hasse-Weil L-functions.
Lastly, Weil generalized a conjecture of Hasse's which states that the L-function of an elliptic curve (over Q or a finite extension) has an analytic continuation and satisfies a functional equation which relates values of the L-function at s and 2-s. In fact, the same is conjectured to be true of the L-functions when they are "twisted" by a Dirichlet character χ. (If L(E,s) has the Dirichlet series ∑_{n} c_{n}/n^{s}, then its twist is L(E,s,χ) = ∑_{n} c_{n}χ(n)/n^{s}.) The extended conjecture, of course, is known as the Hasse-Weil conjecture.
This conjecture postulates exactly what one would hope to be true, namely that the L-functions of elliptic curves (and their twists) have the same very symmetric properties as the classical Dirichlet L-functions, namely an analytic continuation and a functional equation. (One dare not ask at this point that they satisfy a Riemann hypothesis as well, which isn't established yet even for Riemann's zeta function.)
Until recently, the Hasse-Weil conjecture was known to be true in only two cases: when the elliptic curve had the property known as "complex multiplication", and when the elliptic curve had the property of being "modular". However, there was yet another conjecture to which Weil's name was associated -- the Shimura-Taniyama-Weil conjecture -- which held that all elliptic curves are modular. We'll say more about that later, but to give away the secret ahead of time, it is now known that this last conjecture is true -- so the Hasse-Weil conjecture is true as well.
Right away there is a problem. The Euler product of L(E,s) converges only for Re(s) > 3/2. It certainly does not converge at s=1. That doesn't mean the function L(E,s) is meaningless at s=1, however. Indeed, the Hasse-Weil conjecture already proposed that L(E,s) has an analytic continuation for all s ∈ C. This conjecture is now known to be true for all elliptic curves, but even without that, the Birch-Swinnerton-Dyer conjecture could subsume it to the extent of including the stipulation that L(E,s) is analytic at s=1.
Let's take a closer look. The Euler product contains factors of the form
(1 - a_{p}p^{-s} + p^{1-2s})^{-1} = (1 - (p+1-N(p))p^{-s} + p^{1-2s})^{-1}

If you plug in s=1, the factors are simply p/N(p). Now, by the Hasse principle, N(p) (the number of points in E(F_{p})) should be large if and only if E(K) is large, i. e. infinite, with rank ≥ 1. As we saw, N(p) can't get too large; it's about p+1. So the typical term p/N(p) is a little less than 1. If the typical term didn't approach 1 too quickly, then the infinite product ∏_{p} p/N(p) would be zero. This is heuristic reasoning, of course, not rigorous, but it suggests that indeed we should have L(E,1) = 0 if and only if there are infinitely many points in E(K).
To refine this conjecture a little, we could look at the reciprocal of the infinite product. If you consider only a finite number of terms of that reciprocal, you should get a product that grows without limit as the number of terms increases if and only if the rank r ≥ 1. So perhaps we can find an asymptotic formula for this product that involves the number r. Based on extensive numerical computations for curves known to have nonzero rank, Birch and Swinnerton-Dyer conjectured the asymptotic relationship
∏_{p<x} N(p)/p ∼ C(log x)^{r} for some constant C

This form of the conjecture still doesn't involve the L-function directly. In some cases (such as when the elliptic curve E has the property of "complex multiplication") the Hasse-Weil conjecture was known to be true, hence L(E,s) had a known functional equation. The form of this equation makes it possible to compute L(E,1) directly. Not surprisingly, it turned out to be zero. But even more could be computed, namely the limit as s → 1 of L(E,s)/(s-1)^{r}. This limit was found to be finite but nonzero, which means that the Taylor series expansion of L(E,s) has the form
L(E,s) = C(E)(s-1)^{r}/r! + ∑_{r<n<∞} c_{n}(s-1)^{n}/n!

This means that the conjecture can be stated even more precisely as: the rank of E(K) is r if and only if r is the exact order of the zero of L(E,s) at s=1.
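Returning to the product form ∏_{p<x} N(p)/p ∼ C(log x)^{r}, the heuristic can be watched numerically. The sketch below uses two illustrative curves: y² = x³ - x, which has rank 0, and y² = x³ - 25x, which is known to have rank 1 (it is the curve attached to the congruent number 5). The rank-1 product should drift upward roughly like log x while the rank-0 product stays bounded:

```python
# Birch--Swinnerton-Dyer heuristic: accumulate prod_{p<x} N(p)/p for
# y^2 = x^3 - x (rank 0) and y^2 = x^3 - 25x (rank 1).  Illustrative
# example curves; no convergence claims, just the observed drift.
def count_points(p, a, b):
    squares = [0] * p
    for y in range(p):
        squares[y * y % p] += 1
    return 1 + sum(squares[(x**3 + a*x + b) % p] for x in range(p))

def primes(limit):
    sieve = [True] * limit
    sieve[0:2] = [False, False]
    for i in range(2, int(limit**0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = [False] * len(sieve[i*i::i])
    return [i for i, ok in enumerate(sieve) if ok]

prod_rank0 = prod_rank1 = 1.0
for p in primes(3000):
    if p in (2, 5):          # skip the primes of bad reduction for these curves
        continue
    prod_rank0 *= count_points(p, -1, 0) / p
    prod_rank1 *= count_points(p, -25, 0) / p
print(prod_rank0, prod_rank1)   # the rank-1 product should be noticeably larger
```

Increasing the cutoff 3000 makes the contrast more pronounced; this is exactly the kind of machine experiment Birch and Swinnerton-Dyer performed.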
But we needn't stop there. Considerably more can be said about the constant C(E). Theoretical studies suggested it should be a product of several factors. There are conjectural formulas for C(E) when K is any number field, but the simplest case is when K = Q:
C(E) = Ω R(E/Q) |E(Q)_{t}|^{-2} (∏_{p}c_{p}) |Ш(E/Q)|

All but one of the factors in this formula are well-understood and reasonably easily calculated. Ω is called the "period" of the curve; it is the integral of a differential form over E(R). R(E/Q) is called the "regulator" of the curve; it is the volume of a fundamental cell in a certain lattice that can be constructed when the rank of the curve is nonzero. E(Q)_{t} is just the torsion subgroup of the Mordell-Weil group E(Q). The numbers c_{p} are all 1 unless p | 2Δ(E/Q). In that case, E has bad reduction at p, and the corresponding c_{p} describe roughly how "bad" the curve is over the p-adic field Q_{p}.
The last factor, |Ш(E/Q)|, is the order of a group called the Shafarevich-Tate group, after I. R. Shafarevich and John Tate. (Ш is the Cyrillic character "Sha".) This group reflects how badly the Hasse principle fails to hold for the given curve. Very little is known about Ш(E) in general, even whether or not it is finite. The Shafarevich-Tate conjecture states that the group is finite, and that conjecture is subsumed in this detailed form of Birch and Swinnerton-Dyer's conjecture. When E is a curve of known rank r and C(E) is computed, then if C(E) is divided by all of the other factors which can be computed, what is left has been found to be the square of an integer in all cases. This is precisely what would be expected for the order of Ш(E).
In 1983 B. Gross and Don Zagier showed that if E is a modular elliptic curve and L(E,s) has a first order zero at s=1, then there are infinitely many rational points, so the rank is at least 1. This provides a partial converse to the Coates-Wiles result. In 1990 V. A. Kolyvagin improved on this to show, for modular elliptic curves, that L(E,1) ≠ 0 implies r = 0, thus extending the Coates-Wiles result to all modular curves, and further that L(E,1) = 0 but L′(E,1) ≠ 0 implies r = 1.
Given that all elliptic curves are now known to be modular, the conjecture (except for the explicit form of the leading term of L(E,s)) is now settled when L(E,s) ∼ c(s-1)^{k}, with c ≠ 0 and k ≤ 1:
If the order of the zero of L(E,s) at s=1 is 0 or 1, then the rank of E(Q) is the order of the zero of L(E,s) at s=1.

It is also known that the Shafarevich-Tate conjecture is true if the order of the zero of L(E,s) at s=1 is 0 or 1.
Tantalizingly, almost nothing is known if L(E,s) has a zero of order more than 1 at s=1. For instance, it could be that the rank of E(Q) is 1, yet L(E,s) has a zero of order more than 1. Therefore, the conjecture is still an open question when L(E,s) vanishes to order higher than 1 at s=1. Interestingly enough, examples of elliptic curves are known with any rank r ≤ 12, and there is even an infinite family of curves with r ≥ 12, yet in none of these cases (or any others) is L(E,s) known to vanish at s=1 to order more than 3.
The case of elliptic curves in the complex numbers is especially interesting, not only because C is algebraically closed, but also because of the richness of calculus for complex functions. In particular, the equation of an elliptic curve defines y as an "algebraic function" of x. What makes algebraic functions somewhat tricky is that they are not in general single-valued. Clearly, with an equation of the form y^{2} = f(x), there are usually two choices for the square root of any complex number, so the appropriate value of y corresponding to any x is ambiguous.
Bernhard Riemann figured out how to solve this problem so that y could be given as a single-valued function of x for x in an appropriate domain of definition. Naturally, this domain came to be known as a "Riemann surface". The domain looks locally like a portion of the complex number plane, which means, in modern terms, that it is a 1-dimensional complex manifold. Such a manifold can be regarded as a "curve" over C, as well as a 2-dimensional manifold over R, i. e. a "surface". For y defined as an algebraic function of x by the equation y^{2} = f(x), where f(x) is a third degree polynomial in x, it turns out that the corresponding Riemann surface on which y is a single-valued function can be identified with the elliptic curve E defined as a locus of points in C -- namely E(C).
So an elliptic curve is a Riemann surface. In fact, it is of a special type: a compact Riemann surface of "genus" 1. There are several equivalent ways to define the numerical genus. Topologically, the genus counts the number of "holes" in a surface. For example, a surface with one hole is a torus. The converse is also true: every compact Riemann surface of genus 1 is an elliptic curve. In other words, elliptic curves over the complex numbers represent exactly the "simplest" sorts of compact Riemann surfaces with non-zero genus. We'll explain this in more detail later.
This topological equivalence of an elliptic curve with a torus is actually given by an explicit mapping involving a function, called the Weierstrass ℘-function, and its first derivative. This mapping is, in effect, a parameterization of the elliptic curve by points in a "fundamental parallelogram" in the complex plane.
That is a summary of the situation. It is very worthwhile to see step by step how this relationship between elliptic curve and Riemann surface actually comes about. In so doing, we'll also encounter some interesting functions of great importance, the so-called "modular functions" in particular.
ax^{3} + bx^{2}y + cxy^{2} + dy^{3} + ex^{2} + fxy + gy^{2} + hx + iy + j = 0

Such an equation is very cumbersome to work with, and in fact much more complicated than we really need. The truth is that there are many equations which represent essentially the same information. For a given curve E we are first of all interested in its set E(Q) of rational points. Secondly, we are interested in the group properties of the sets E(Q), E(R), and E(C). We can therefore change the equation of the curve in any way we like, as long as there is a 1-to-1 correspondence between the resulting point loci in such a way that the group structure is preserved. Such a correspondence is a group isomorphism.
It can be shown that if E is defined over Q and has at least one rational point (so E(Q) isn't empty) then there is a change of coordinates such that one has a much simpler equation:
y^{2} = Ax^{3} + Bx^{2} + Cx + D

and there is an isomorphism between the "old" and the "new" groups of points. Such a change of coordinates is called a "birational transformation" because it is reversible and yields a 1-to-1 relationship between the rational points.
In fact, even more can be done, and the equation can be simplified further to the form:
y^{2} = 4x^{3} - ax - b

This is called the Weierstrass normal form of the equation of an elliptic curve, for reasons which will become clearer very soon.
Another way of stating this result is that birational equivalence is an equivalence relation on curves defined by a cubic equation. The groups E(Q), E(R), and E(C) are preserved (up to isomorphism) by this equivalence relation, and in every equivalence class there is a curve whose defining equation has the Weierstrass normal form.
Recall that it was also required that the curves we are considering be nonsingular. This means they have no "singular" points where a unique tangent to the curve doesn't exist, so the curve doesn't cross itself or have a "cusp". (This was necessary to be able to define the group structure.) This property of nonsingularity is also preserved by the birational equivalence relation. If the equation is given by y^{2} = f(x), nonsingularity is equivalent to the condition that f(x) have no repeated roots, which in turn is equivalent to the requirement that the discriminant Δ of f(x) be nonzero. When f(x) is in Weierstrass normal form, Δ has an especially simple form:
Δ = a^{3} - 27b^{2}

So we must have Δ ≠ 0. The importance of this condition will appear several times.
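The link between Δ and repeated roots can be checked directly: a repeated root of f is exactly a nonconstant common factor of f and f′. In the sketch below, the singular example a = 3, b = 1 (so Δ = 27 - 27 = 0) corresponds to 4x³ - 3x - 1 = (x - 1)(2x + 1)², which indeed has the repeated root x = -1/2:

```python
# Delta = a^3 - 27*b^2 vanishes exactly when f(x) = 4x^3 - a*x - b has a
# repeated root, i.e. when gcd(f, f') is nonconstant.  Exact arithmetic with
# Fractions; polynomials are coefficient lists, lowest degree first.
from fractions import Fraction

def poly_mod(f, g):
    """Remainder of f divided by g (g nonzero)."""
    f = f[:]
    while len(f) >= len(g) and any(f):
        q = Fraction(f[-1], g[-1])
        shift = len(f) - len(g)
        for i, coef in enumerate(g):
            f[i + shift] -= q * coef
        while f and f[-1] == 0:
            f.pop()
    return f

def has_repeated_root(a, b):
    f  = [Fraction(-b), Fraction(-a), Fraction(0), Fraction(4)]  # 4x^3 - ax - b
    df = [Fraction(-a), Fraction(0), Fraction(12)]               # its derivative
    r = poly_mod(f, df)
    while r:                        # Euclidean algorithm for gcd(f, f')
        f, df = df, r
        r = poly_mod(f, df)
    return len(df) > 1              # nonconstant gcd <=> repeated root

for a, b in [(3, 1), (4, 1), (1, 0), (0, 1)]:
    assert (a**3 - 27 * b**2 == 0) == has_repeated_root(a, b)
```

The discriminant of 4x³ - ax - b works out to 16(a³ - 27b²), which is why the factor of 16 can be dropped and the condition stated simply as a³ - 27b² ≠ 0.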
This new way of obtaining elliptic curves uses what are known as elliptic functions. Here again, the reason for calling these functions "elliptic" has to do with their relation to integrals used to calculate the arc length of an ellipse. These integrals (known, of course, as elliptic integrals) turn out to be exactly the way in which one can obtain an elliptic function corresponding to any elliptic curve.
At first glance, the definition of an elliptic function doesn't appear to involve either elliptic curves or elliptic integrals. Specifically, an elliptic function is defined as a doubly periodic meromorphic function. Recall that a meromorphic function is just a complex function which is defined and analytic on all of C, except for isolated poles. (A pole is a type of singularity -- a point near which values of the function are arbitrarily large.) A (singly) periodic function f(z) has the property that for some ω∈C, f(z+ω) = f(z) for all z. If this holds, then in fact f(z+nω) = f(z) for all z and all integers n. f(z) = sin(z) and f(z) = e^{z} are examples of singly periodic functions, with ω = 2π and ω = 2πi, respectively. Similarly, a function is doubly periodic if there are two distinct ω_{1} and ω_{2} in C such that f(z) is periodic with respect to each of them, or equivalently, if f(z+mω_{1}+nω_{2}) = f(z) for all z and all integers m, n.
Suppose that F(z) is a doubly periodic meromorphic function (i. e. an elliptic function) with periods ω_{1} and ω_{2}. If, in addition, F(z) is not a constant function, there are constraints on what periods are possible -- in particular, ω_{1}/ω_{2} can't be a real number. This is because if F(z) is doubly periodic and nonconstant, it must have at least one pole, by a classic theorem due to Joseph Liouville (1809-82). Without loss of generality, it can be assumed one of its poles is at z=0. Further, since F(z) is meromorphic, it can't have poles arbitrarily close together, by another basic theorem about meromorphic functions. This implies that the ratio ω_{1}/ω_{2} is not a real number, because if it were, we could approximate it by a rational number and hence choose m, n ∈ Z such that mω_{1} + nω_{2} is arbitrarily close to 0. But F(z) has a pole at 0, so it would have two poles arbitrarily close to each other. Another way to say that ω_{1}/ω_{2} isn't real is to say that ω_{1} and ω_{2} are "linearly independent" over R. So the net result is that the two periods of an elliptic function must be linearly independent over R.
Periodicity is a very special property that places significant constraints on the nature of functions that have the property. For example, it can be shown that there are no nonconstant meromorphic functions which are triply periodic. (Because any three distinct complex numbers must by linearly dependent over R. Take such a dependence relation, approximate the real coefficients by rationals, and you can make an argument similar to that of the last paragraph that a triply periodic function would have poles arbitrarily close together.)
Now, if we are given any elliptic function F(z) with periods ω_{1} and ω_{2} we can construct another elliptic function with the same periods and very special properties. In fact, we don't even need to start with an elliptic function F(z). We can simply start with any two complex numbers ω_{1} and ω_{2} such that ω_{1}/ω_{2} isn't real.
If ω_{1}, ω_{2} ∈ C and ω_{1}/ω_{2} ∉ R, we define a lattice as a set of points in C of the form {mω_{1} + nω_{2} | m, n ∈ Z}. We'll use the symbol L or L(ω_{1}, ω_{2}) to designate a lattice. L consists of all sums of integral multiples of ω_{1} and ω_{2}. In the notation of group theory, L can be written as a direct sum: L = Zω_{1}⊕ Zω_{2}. The numbers ω_{1} and ω_{2} that determine a lattice are called a basis of the lattice. In general, the basis isn't unique, but there are ways to add further conditions to specify a basis almost uniquely. When such a basis is chosen, the ratio τ = ω_{1}/ω_{2} is unique for a specific lattice, provided also that τ lies in a region of the upper half of the complex plane such that -1/2 < Re(τ) ≤ 1/2, |τ| ≥ 1, and Re(τ) ≥ 0 if |τ| = 1. This particular region has some importance which will be explained later.
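The "almost unique" choice of basis can be made concrete: given any τ in the upper half-plane, the substitutions τ → τ + 1 (replacing ω_{1} by ω_{1} + ω_{2}) and τ → -1/τ (interchanging the roles of the two basis elements) leave the lattice unchanged, and iterating them moves τ into the region described. A numerical sketch of this reduction:

```python
# Move a lattice parameter tau (Im(tau) > 0) into the standard fundamental
# region |Re(tau)| <= 1/2, |tau| >= 1, using only the lattice-preserving
# substitutions tau -> tau + 1 and tau -> -1/tau.
def reduce_tau(tau):
    while True:
        tau -= round(tau.real)      # translate Re(tau) near [-1/2, 1/2]
        if abs(tau) < 1:
            tau = -1 / tau          # invert; this increases Im(tau)
        else:
            return tau

t = reduce_tau(complex(5.3, 0.2))
assert abs(t) >= 1 and abs(t.real) <= 0.5 and t.imag > 0
print(t)
```

This sketch glosses over the boundary conventions (which edge of the region to keep), but for a generic τ the reduced value lands strictly inside the region and is unique.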
Given all that, for any lattice L(ω_{1}, ω_{2}) (and hence for any nonconstant elliptic function, whose periods determine a lattice), we can define a special function
℘(z) = ℘(z; ω_{1}, ω_{2}) = 1/z^{2} + ∑_{ω∈L-{0}} ((z-ω)^{-2} - ω^{-2})

(The summation is over all elements of the lattice except for 0.) This definition of ℘(z), as well as the notation, is due to Karl Weierstrass (1815-97), who developed much of the theory. The function is called the Weierstrass ℘-function. There are various messy details, but it can be shown that this series converges and defines a meromorphic function. From the definition, it is plausible (and not hard to prove) that ℘(z) is doubly periodic with periods ω_{1} and ω_{2}.
What we've found, then, is that any nonconstant elliptic function (or lattice) determines the special elliptic function ℘(z) and lets us write down a series expansion for it. A little further computation using this explicit expression allows us to deduce one very important property of ℘(z), namely that it satisfies the differential equation:
℘′^{2} = 4℘^{3} - g_{2}℘ - g_{3}

where the coefficients are expressible in terms of the periods. Specifically, if we define
G_{k} = ∑_{ω∈L-{0}} ω^{-k}

then g_{2} = 60G_{4} and g_{3} = 140G_{6}. Further, the Laurent series for ℘(z), obtained by rearranging terms in the defining series, is simply
℘(z) = 1/z^{2} + ∑_{1≤k<∞} (2k+1)G_{2k+2}z^{2k}

Not all elliptic functions are Weierstrass ℘-functions, but all are very closely related to ℘(z). First of all, note that the set of all elliptic functions with specified periods (including constant functions) forms a field, because all sums, products, and reciprocals of functions that have the periods ω_{1} and ω_{2} also have the same periods. (For the moment, assume we are talking only of functions with two specific periods.) ℘-functions have the property that ℘(-z) = ℘(z); this is an immediate consequence of the series for ℘(z), which contains only even powers of z. Such a function is called an "even" function. The set of all even elliptic functions also forms a field. With a bit of work one can show that this field consists of all quotients of polynomials in ℘(z), and so the field is denoted C(℘(z)). ℘′(z) has the property that ℘′(-z) = -℘′(z), which is clear from its series as well, and such a function is said to be "odd". Although odd elliptic functions don't form a field (the square of an odd function isn't odd, for example), they all have the form of ℘′(z) times an even elliptic function. Now, any function at all can be written as the sum of an even function and an odd function. It follows that all elliptic functions are of the form g(℘(z)) + ℘′(z)h(℘(z)), where g(t) and h(t) are quotients of polynomials in the indeterminate t. In other words, all elliptic functions can be expressed rather simply in terms of ℘(z) and ℘′(z).
The net result of all this is that, given a particular lattice of periods, we can explicitly construct an elliptic function ℘(z) as a series. From that and ℘′(z), we can then easily express any elliptic function with the same periods. Moreover, we obtain a differential equation for ℘(z) with coefficients determined explicitly by the periods.
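All of this can be checked numerically. The sketch below uses an arbitrarily chosen lattice basis (ω_{1} = 2, ω_{2} = 1 + 2i), truncates the lattice sums at |m|, |n| ≤ 60, and confirms that the truncated ℘ satisfies its differential equation to good accuracy:

```python
# Numerically verify Weierstrass's differential equation
#   P'(z)^2 = 4 P(z)^3 - g2 P(z) - g3
# by truncating the lattice sums at |m|, |n| <= M.  (Lattice basis chosen
# arbitrarily for illustration; accuracy improves as M grows.)
w1, w2 = 2.0 + 0.0j, 1.0 + 2.0j
M = 60
lattice = [m*w1 + n*w2 for m in range(-M, M + 1) for n in range(-M, M + 1)
           if (m, n) != (0, 0)]

g2 = 60 * sum(w**-4 for w in lattice)    # g2 = 60*G_4
g3 = 140 * sum(w**-6 for w in lattice)   # g3 = 140*G_6

def P(z):
    """Truncated Weierstrass P-function."""
    return 1/z**2 + sum(1/(z - w)**2 - 1/w**2 for w in lattice)

def Pprime(z):
    """Truncated derivative P'(z) = -2 * sum over all lattice points of (z-w)^-3."""
    return -2/z**3 - 2*sum(1/(z - w)**3 for w in lattice)

z = 0.31 + 0.17j                         # a generic test point, not a half-period
lhs = Pprime(z)**2
rhs = 4*P(z)**3 - g2*P(z) - g3
assert abs(lhs - rhs) < 1e-2 * abs(lhs)
print(lhs, rhs)
```

The symmetry ω → -ω of the lattice makes the truncation errors cancel in pairs, which is why even a modest cutoff gives several digits of agreement.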
Significantly, the differential equation satisfied by ℘(z) can be interpreted in another way. What it says is nothing less than that for all z∈C, the pair of values (℘(z),℘′(z)) lies on the elliptic curve E(C) whose defining equation is in Weierstrass normal form:
y^{2} = 4x^{3} - g_{2}x - g_{3}

The obvious question to ask now is whether every point on the curve E(C) is obtained in this way. In order to answer this, we need more facts about ℘(z). Consider the set of points P in the complex plane defined by a parallelogram whose vertices are 0, ω_{1}, ω_{2}, and ω_{1} + ω_{2}. P is known as the fundamental parallelogram of the lattice, with respect to the given basis. (Remember, the basis isn't necessarily unique.) P can be defined explicitly by
P = {aω_{1} + bω_{2} | a,b ∈ R, 0 ≤ a,b < 1}

The fundamental parallelogram plays a very important role, as we shall see. The entire complex plane can be covered with translated copies of P so that every point of C is in one and only one translate. Furthermore, by periodicity, any elliptic function having periods ω_{1} and ω_{2} takes the same values at corresponding points of parallel sides of P. What this means is that if we "identify" opposite sides of P in a topological sense, the elliptic function is well-defined on this new topological space. In topology, the space obtained by identifying opposite sides of a parallelogram is just a torus. In this case, since it's obtained from points of the complex plane, it's called a complex torus.
Going back to ℘(z), we note that it has a pole of order 2 at 0. (That's due to the term 1/z^{2} in the series expansion.) There are other poles at the other vertices of P, but none inside P. An elliptic function that, counting multiplicity, has n poles in its fundamental parallelogram (which includes 0 but not the other vertices) is said to have order n. A fundamental fact about elliptic functions of order n is that as z ranges over points of the fundamental parallelogram, the function takes every complex value exactly n times, not just the special value ∞ (which is by definition the value at poles of the function). For instance, an elliptic function of order n also has n zeros in its fundamental parallelogram.
So we know that, in particular, ℘(z), which has order 2, takes every complex value exactly twice as z ranges over P. It follows from this that every point on an elliptic curve E(C) is of the form (℘(z),℘′(z)) for the case of elliptic curves whose equation has the same coefficients as the differential equation satisfied by ℘(z). In this situation, we say that the elliptic curve is "parameterized" by the two functions ℘(z) and ℘′(z), because every point on the curve is the image (twice) of the mapping from P to E(C) given by z → (℘(z),℘′(z)).
Of course, not every elliptic curve has a defining equation in Weierstrass normal form, not by a long shot. However, we know that every elliptic curve is birationally equivalent to a curve whose equation has the required form. We're still not quite done yet, though. For suppose we have an elliptic curve with defining equation
y^{2} = 4x^{3} - ax - b

We can consider the differential equation
F′(z)^{2} = 4F(z)^{3} - aF(z) - b

But what we don't know yet is whether every such equation (i. e. with arbitrary a,b ∈ Q) has a solution F(z) which is an elliptic function, and is such that a=g_{2} and b=g_{3}, where g_{2} and g_{3} are obtained as above from the periods of the elliptic function.
Suppose for just a moment that we knew this was the case. Then we would know that in every equivalence class of elliptic curves (as explained above) there is one curve which is parameterized in the indicated way by a ℘-function. And there can be only one, because the curve is completely determined as the image of the mapping from P to E(C).
So we have to look at solving the differential equation:
F′(z)^{2} = 4F(z)^{3} - aF(z) - b

Although it looks a little fearsome since it's nonlinear, it isn't really that hard to solve, since it's also of first order. If G(w) is the inverse function of F(z), so that z = G(w), then by the inverse function theorem of elementary calculus,
G′(w) = 1/F′(z) = (4w^{3} - aw - b)^{-1/2}

since w = F(z). Taking the indefinite integral of this yields
z = G(w) = ∫^{w} (4t^{3} - at - b)^{-1/2} dt

We've deliberately solved for the inverse function G(w) to F(z), because the answer is an "elliptic integral" -- the same kind of integral that arises in computing the arc length of an ellipse. This is the thing we've mentioned several times; it is how the term "elliptic" comes into the picture. The function F(z) that we want is the inverse function of this elliptic integral.
At this point, things get rather messy. There are two problems. First, in order to evaluate this integral, it is necessary to specify a path of integration, because we are dealing with complex variables. There are infinitely many ways to get from point A to point B in the complex plane, and the integral may be different along each such path. The second problem is that since the function under the integral sign involves a square root, it is not well-defined, because there are two possible values for any square root (except the square root of 0). Fortunately, it isn't necessary to pick a consistent definition of the square root everywhere. The definition need be consistent only along a particular path.
These problems can be handled, but a great deal of sophisticated machinery had to be developed to accomplish this. In order to do the job rigorously, mathematicians eventually defined precise tools such as "Riemann surfaces", "analytic continuation", "homotopy theory", and "covering spaces". To make a long story short, the final result is that elliptic integrals like G(w) can be precisely and unambiguously defined. The variable of integration is no longer regarded simply as a complex number, but as a point on a Riemann surface where the square roots can be defined as single-valued functions. The paths of integration lie on the Riemann surface. The complex numbers C make up the simplest sort of Riemann surface, and it is possible to extend the notion of analytic and meromorphic functions to more general Riemann surfaces. G(w) turns out to be meromorphic, and its inverse function F(z) is also. Moreover, F(z) satisfies the original differential equation, and -- most importantly -- F(z) is an elliptic function with periods ω_{1} and ω_{2} which can be expressed explicitly as suitable elliptic integrals over closed paths.
The final result is quite beautiful. Given any lattice L ⊆ C (or equivalently two complex numbers ω_{1} and ω_{2} whose ratio is not a real number), we can construct an elliptic function ℘(z) that has ω_{1} and ω_{2} as periods and which satisfies the differential equation
℘′^{2} = 4℘^{3} - g_{2}℘ - g_{3}

where g_{2} and g_{3} can be expressed as simple infinite sums involving the periods. Moreover, ℘(z) determines an elliptic curve as the image of the fundamental parallelogram P of the lattice under the map φ(z) = (℘(z),℘′(z)).
Conversely, if we start with an elliptic curve E(C) that has the defining equation
y^{2} = 4x^{3} - g_{2}x - g_{3}

we can construct an elliptic function as the inverse of an elliptic integral whose integrand involves the square root of the right hand side of the equation above. Moreover, the periods of the elliptic function can be expressed by integrals over suitable closed paths, g_{2} and g_{3} can be expressed in terms of the periods, and the elliptic function satisfies the appropriate differential equation.
The elliptic integral function has a Riemann surface on which it is unambiguously defined, and this Riemann surface is nothing other than the elliptic curve E(C). The elliptic integral makes E(C) into what is called a "double covering" of the complex plane C.
Suppose two abstract things we'll denote by x and y are related to each other in a way we want to describe. Symbolically, we write x R y for this state of relationship. If x and y are ordinary numbers, then x ≤ y is an example of a relation. We define an "equivalence relation" axiomatically by three conditions on R: reflexivity (x R x for every x), symmetry (if x R y then y R x), and transitivity (if x R y and y R z then x R z).
Whenever we talk about a relation R we do so in the context of a specific set of objects S (although R may be applicable to a variety of different sets). In set theoretic terms, R may be thought of as a subset of the "Cartesian product" S × S, though we don't need to go further into that. All it means is that the relation may be thought of as a subset of certain ordered pairs in S × S. Symbolically, R = {(x,y) ∈ S×S | x R y}
The most important thing that an equivalence relation R does is to partition the set S into a number of different subsets called "equivalence classes". If x ∈ S, the equivalence class containing x is simply the set of all y ∈ S such that x R y. Consequently, every x is in some R-equivalence class, even if the only member of the class is x itself. Furthermore, no two equivalence classes have any elements in common, because, by the transitivity property, if two classes had any element in common, all elements of the two classes would be equivalent to each other, so there would be really only one class. Hence every member of S lies in one and only one R-equivalence class. Another way to say this is that R "partitions" S into "disjoint" equivalence classes.
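The partition property can be sketched in a few lines of code. The set and the relation below (the integers 0 through 19 under congruence modulo 5) are invented for the example; the point is only that an equivalence relation splits a set into disjoint classes.

```python
# Sketch: partitioning a set into equivalence classes.
# The relation "congruent modulo 5" is an illustrative choice.

def equivalence_classes(S, related):
    """Group the elements of S into classes under the relation `related`."""
    classes = []
    for x in S:
        for cls in classes:
            if related(x, cls[0]):   # by transitivity, one representative suffices
                cls.append(x)
                break
        else:
            classes.append([x])      # x starts a new class
    return classes

classes = equivalence_classes(range(20), lambda x, y: (x - y) % 5 == 0)

# The classes partition the set: disjoint, and their union is everything.
assert len(classes) == 5
assert sorted(x for cls in classes for x in cls) == list(range(20))
```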
To take care of this problem, and to give an important concrete example of equivalence classes as outlined above, we will define an equivalence relation on the set of all lattices in C, so that the resulting set of equivalence classes of lattices will, eventually, turn out to be in 1-to-1 correspondence with equivalence classes of elliptic curves.
We say that two lattices L and L′ are equivalent if there is a nonzero complex number λ ∈ C such that L′ = λL. What that last equality means is that for any α ∈ L, λα is an element of L′ (so λL ⊆ L′), and, in addition, every element of L′ is of this form. Another term used is that two such equivalent lattices are "homothetic". We shall use the latter term to avoid ambiguity, since there will be other types of "equivalence classes" that we want to talk about. In this terminology, homothetic lattices are said to belong to the same "homothety class".
There is a simple way to think of homothetic lattices. Recall that any complex number can be expressed in the form z = |z|e^{iθ} = |z|e^{i·arg(z)}, where arg(z) is the angle θ between the positive real axis and a line from the origin to z in the complex plane. The lattice λL has a fundamental parallelogram which is just the parallelogram of L after being stretched by a factor of |λ| and rotated by arg(λ). The interior angles of the two parallelograms are the same.
We can use the notion of homothety to simplify the way we represent (classes of) lattices. As described above, a lattice L can be defined by a basis consisting of two R-linearly independent complex numbers ω_{1} and ω_{2}: L = {mω_{1} + nω_{2} | m, n ∈ Z}. Given that, a basis for λL is simply λω_{1} and λω_{2}. So, if we choose λ = 1/ω_{2}, a basis of λL is simply the pair (τ,1), where τ = ω_{1}/ω_{2}. Recall that by switching the order of ω_{1} and ω_{2} if necessary we can assume Im(τ) > 0.
As a matter of notation, we let L_{τ} designate the lattice with basis (τ,1). What we have so far is that for any L there is a homothetic L_{τ}. In fact, every homothety class contains infinitely many lattices of the form L_{τ}. But we will shortly find a way to make the choice of τ unique and to identify all τ that give L_{τ} in the same homothety class.
The complex numbers C have the structure of both an additive group and a 1-dimensional complex analytic manifold (essentially, a "Riemann surface"). A lattice L is an additive subgroup of C, so one has the quotient group C/L. If you aren't familiar with group theory, we can describe "quotient groups" easily using the notion of equivalence classes. We do this by saying that two complex numbers in C are equivalent if their difference is in L. That is, two points z_{1} and z_{2} are equivalent (symbolically, z_{1} ≡ z_{2}) just in case z_{1} - z_{2} ∈ L. We thus get a set of equivalence classes with respect to this relation (or "modulo L".) Call this set C/L. On this set we can define a group structure simply by picking complex numbers that "represent" each equivalence class, defining the group operation on them as normal addition, and then passing to the equivalence class of the sum. This procedure is well defined and makes C/L into a group because L is a subgroup of C. (One can always do this for abelian groups. For nonabelian groups, making quotient groups requires an additional condition on the subgroup.)
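The quotient construction can be sketched numerically. The basis below is an arbitrary illustrative choice; the code reduces each complex number to its representative in the fundamental parallelogram and checks that addition of classes is well defined.

```python
# Sketch of C/L for the lattice L spanned by omega1 and omega2
# (values chosen for illustration only).
from math import floor

omega1, omega2 = 2 + 1j, 1j

def reduce_mod_L(z):
    """Return the representative of z's class in the fundamental parallelogram.

    Write z = s*omega1 + t*omega2 with real s, t (Cramer's rule on the
    underlying 2x2 real system), then drop the integer parts of s and t.
    """
    det = omega1.real * omega2.imag - omega1.imag * omega2.real
    s = (z.real * omega2.imag - z.imag * omega2.real) / det
    t = (omega1.real * z.imag - omega1.imag * z.real) / det
    return (s - floor(s)) * omega1 + (t - floor(t)) * omega2

z1, z2 = 5.3 + 2.7j, -1.25 + 0.4j
# Addition in C/L is well defined: reducing before or after adding
# gives representatives of the same class.
a = reduce_mod_L(reduce_mod_L(z1) + reduce_mod_L(z2))
b = reduce_mod_L(z1 + z2)
assert abs(a - b) < 1e-9
```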
Another way to think of C/L is as the fundamental parallelogram P of L with addition done "modulo L": if z_{1}, z_{2} ∈ P then the sum z_{1} + z_{2} is either also in P, or else corresponds to a unique element of P plus one or both basis elements of L. In either case, C/L inherits its group structure from C: the sum of two equivalence classes is just the class that contains the sum of a representative from each class.
Moreover, C/L can also be given a topological structure, derived from the topology of C, by "identifying" opposite sides of P. This means that corresponding points are considered to be the same, and a topology can be defined consistently that reflects the identification. Think of P as being made of flexible material. When you identify one pair of opposite sides, you get a hollow cylinder. When you identify the ends of the cylinder -- the other two sides of P -- you get a surface that's like the surface of a donut. Such a surface is called a torus. So, topologically, C/L is essentially a torus. Best of all, even the analytic structure is preserved under this identification procedure, so that C/L can be made into a complex analytic manifold -- a Riemann surface. A complex manifold like C/L is called, naturally, a "complex torus". Topologically, a manifold with 2 real dimensions (or 1 complex dimension) is some sort of surface. Surfaces may be classified by the number of "holes" they have, which is an invariant known as the "genus". Since a complex torus is like the surface of a donut, it has a genus of 1.
Now recall that the Weierstrass ℘-function is doubly periodic with its periods being the basis of a lattice L with fundamental parallelogram P. This means that ℘ takes the same values at corresponding points on opposite sides of P, so that ℘ is consistently defined on C/L.
OK, but so what? Here's the most important thing: the map φ: C/L → E(C) defined by φ(z) = (℘(z),℘′(z)) has all the right properties. E(C) already has the structure of a complex analytic manifold, because it is defined as the set of zeros of a cubic polynomial in two variables. (Technically, E(C) is regarded as a subset of the "complex projective space" P_{2}(C), but we have been trying hard to avoid discussing projective spaces, due to the extra level of abstraction involved. It's simpler, though less correct, to think of E(C) as a subset of the product space C×C = C^{2}.) By "right properties", we mean that φ is 1:1, onto, and preserves all of the complex analytic structure.
In essence, the map φ shows that the elliptic curve E(C) is fundamentally the "same" object as C/L. But we saw that C/L has a natural group structure. This suggests that E(C) by rights "must" have a group structure as well. This group structure can be defined by using φ to transfer the group structure of C/L to E(C). Of course, it was already known that E(C) had a group structure, as we discussed early on. The rather amazing thing is that φ gives the exact same group structure. Since φ is defined using the ℘-function, the reason this works is that ℘(z) has a simple "addition formula". That is, there is a simple expression for ℘(z_{1}+z_{2}) in terms of ℘(z) and ℘′(z) at z_{1} and z_{2}. Specifically, if z_{1} and z_{2} are not "equivalent modulo L", i. e. their difference isn't in L, then
℘(z_{1}+z_{2}) = (℘′(z_{1}) - ℘′(z_{2}))^{2} / [4(℘(z_{1}) - ℘(z_{2}))^{2}] - ℘(z_{1}) - ℘(z_{2})

In summary, if you have a lattice L, then you have determined an elliptic curve E(C) that is, in turn, analytically isomorphic to the complex torus C/L. Conversely, if you start with an elliptic curve, you can first get a birationally equivalent curve whose equation is in Weierstrass form, and from that you can get an elliptic function whose periods are the basis of a lattice L, such that the map φ gives an analytic isomorphism from the complex torus C/L to the elliptic curve E(C).
That is, there are very nice correspondences between elliptic curves, complex tori, and lattices. It is a sure sign of interesting mathematics when there are such correspondences between very different sorts of objects. But the story just gets even more interesting.
In particular, if ω′_{1} and ω′_{2} give a different basis for L, then it is possible to write
ω′_{1} = aω_{1} + bω_{2} and ω′_{2} = cω_{1} + dω_{2}

for integers a, b, c, and d. The fact that we are working with lattices is what guarantees that these coefficients are integers.
In linear algebra, this sort of relationship is called a linear transformation, and "matrix notation" is ordinarily used to express it:
( ω′_{1} )   ( aω_{1} + bω_{2} )   ( a b )   ( ω_{1} )
( ω′_{2} ) = ( cω_{1} + dω_{2} ) = ( c d ) × ( ω_{2} )

Here we have represented the basis pairs such as (ω_{1},ω_{2}) as "column vectors", according to the usual conventions for multiplying matrices.
Now the matrix containing the coefficients is of a special sort. Not only are its matrix elements integers, but the matrix is "invertible", because it gives a "nonsingular" linear transformation from one basis to another. That is, the inverse of the matrix also has integer entries. In linear algebra, it is shown that this implies the "determinant" of the matrix -- which is the quantity ad - bc -- must be nonzero. And since the inverse matrix also has integer elements its determinant is an integer which is the reciprocal of the original determinant. The value of the determinant must therefore be ±1. However, by switching the order of basis elements if necessary, it can be assumed that the determinant ad-bc = 1.
2 × 2 matrices of this form, with integer entries and determinant 1, are important for many reasons, some of which we shall soon see. The set of such matrices has, therefore, been given a name: SL_{2}(Z). It is in fact a group under multiplication, called the "special linear group" (of 2 × 2 matrices with entries in Z).
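A small sketch can make the group structure concrete. The two matrices below, usually called S and T, are the standard generators of SL_{2}(Z); representing a matrix as a tuple (a, b, c, d) is just a convenience for the example.

```python
# Sketch: SL2(Z) as integer 2x2 matrices (a, b, c, d), closed under
# multiplication, with an explicit integer inverse when det = 1.

def det(M):
    a, b, c, d = M
    return a * d - b * c

def mul(M, N):
    a, b, c, d = M
    e, f, g, h = N
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

def inverse(M):
    # For det = 1 the inverse is (d, -b, -c, a), again with integer entries.
    a, b, c, d = M
    return (d, -b, -c, a)

S = (0, -1, 1, 0)   # the standard generators of SL2(Z)
T = (1, 1, 0, 1)

assert det(S) == 1 and det(T) == 1
assert det(mul(S, T)) == 1                 # products stay in SL2(Z)
assert mul(S, inverse(S)) == (1, 0, 0, 1)  # the inverse really is an inverse
```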
For any finite set S it is always possible to construct groups that act on S. For instance, consider a set of 5 elements represented by the numbers 1 through 5: S = {1,2,3,4,5}. A "permutation" of S is just a rule that specifies how the elements of S can be reordered ("permuted") in some way. Let σ represent such a rule. Then one example would have σ(1)=1, σ(2)=3, σ(3)=4, σ(4)=2, and σ(5)=5. This is like a very simple kind of encryption that simply substitutes one symbol for another so that two different symbols never get mapped to the same symbol. In the example given, σ could be expressed succinctly by the notation (1)(234)(5), or more simply, (234). The group of all possible permutations on a set is called the "symmetric group". If the set has n elements, S_{n} is the usual notation for the symmetric group on n elements. (It's only a coincidence that S refers both to a set and to the symmetric group.) Subgroups of symmetric groups are called permutation groups. We won't go into this any further, but any finite group can be "represented" in terms of a permutation group on some set.
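The permutation σ from the paragraph above can be written down directly. This sketch just encodes σ as a dictionary and checks two of the facts mentioned: σ never sends two symbols to the same symbol, and the cycle (234) means σ has order 3.

```python
# The permutation sigma from the text: fixes 1 and 5, cycles 2 -> 3 -> 4 -> 2,
# written (234) in cycle notation.
sigma = {1: 1, 2: 3, 3: 4, 4: 2, 5: 5}

def compose(p, q):
    """(p o q)(x) = p(q(x)): composing permutations gives a permutation."""
    return {x: p[q[x]] for x in q}

# No two symbols map to the same symbol:
assert len(set(sigma.values())) == len(sigma)

# The 3-cycle (234) gives sigma order 3: applying it three times is the identity.
identity = {x: x for x in sigma}
assert compose(sigma, compose(sigma, sigma)) == identity
```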
For an example more relevant to our present interests, consider the set S = {(ω_{1},ω_{2}) ∈ C*×C* | Im(ω_{1}/ω_{2}) > 0} of pairs of nonzero complex numbers. Every element of this set is a basis for some lattice L. But this isn't a 1-to-1 relationship, since many elements of S can generate the same lattice -- choice of basis is not unique. We can get a much smaller set by considering equivalence classes of such pairs, where the equivalence classes are defined by the condition that two pairs are related by a linear transformation (whose matrix is) in SL_{2}(Z). Now the members of any particular equivalence class are labeled by elements of SL_{2}(Z).
Let G = SL_{2}(Z). Then there is a 1-to-1 correspondence between equivalence classes of S under the action of G and distinct lattices. Within any particular equivalence class -- a G-orbit -- corresponding to a particular lattice L, each element of the class is a specific basis pair for L, and each is labeled by an element of G. In fact, the labeling is unique, because if M_{1} and M_{2} are two matrices of G that take some basis pair s∈S to the same thing -- so M_{1}s = M_{2}s -- then (M_{1}^{-1}M_{2})s = s. But the only element of G that leaves some basis pair unchanged is the identity element I of G (that is, the diagonal matrix with a=d=1, b=c=0), and so M_{1} = M_{2}. In a case like this, we say G acts "faithfully" on S. The net result is that there is a 1-to-1 correspondence between basis pairs for a particular lattice L and elements of SL_{2}(Z). Since the latter is a pretty large set, it's clear that there are a lot -- infinitely many -- of basis pairs for any lattice L.
To summarize: the equivalence classes of S under the action of G correspond to distinct lattices, and members of each equivalence class correspond to elements of G.
Suppose we have two homothetic lattices L and λL. What can we say about the corresponding complex tori C/L and C/λL? It turns out by rather straightforward algebra that they are isomorphic. The details can be worked out by considering the map λ: C → C given by multiplication by λ. This induces a map of C/L to C/λL, because two numbers z_{1} and z_{2} are equivalent modulo L if and only if λz_{1} and λz_{2} are equivalent modulo λL.
Since we know that there is a correspondence between elliptic curves and complex tori, there is also a correspondence between elliptic curves and homothety classes of lattices. Because of this, we want to understand homothety classes better.
The answer is yes. Let S be the set of lattices of the form L_{τ} = {aτ + b | a,b ∈ Z}, where Im(τ) > 0. We know that every homothety class contains at least one lattice selected from S. In fact (we will find) there are infinitely many elements of S in every homothety class. Yet, of course, not all members of S are homothetic to each other. The interesting question is exactly when L_{τ} and L_{τ′} are homothetic if τ ≠ τ′.
So suppose L_{τ′} = λL_{τ}. Since (τ,1) is a basis of L_{τ}, (λτ,λ) is a basis of λL_{τ}. Hence there are integers a, b, c, d such that
τ′ = aλτ + bλ and 1 = cλτ + dλ

For future reference, note that this implies λ = 1/(cτ + d). From these equations it follows that
τ′ = (aλτ + bλ)/(cλτ + dλ) = (aτ + b)/(cτ + d)

This suggests that for

M = ( a b )
    ( c d ) ∈ SL_{2}(Z)

we define an action of M on an element of S by the rule M(L_{τ}) = L_{τ′}, where τ′ is given by the formula above. Note that this action of SL_{2}(Z) is subtly different from the action we discussed earlier on the set of lattice basis pairs -- yet the close relationship is obvious, for if basis pairs are related such that
ω′_{1} = aω_{1} + bω_{2} and ω′_{2} = cω_{1} + dω_{2}

then the ratios of pair elements

τ = ω_{1}/ω_{2} and τ′ = ω′_{1}/ω′_{2}

satisfy

τ′ = (aω_{1} + bω_{2}) / (cω_{1} + dω_{2}) = (aτ + b)/(cτ + d)

just as before. The net result of these calculations is that the orbits of S under this action of G correspond to homothety classes of lattices. Within each orbit there are many lattices L_{τ}, each one corresponding to a different element of G.
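The relationship between the two actions can be checked numerically in a few lines. The basis and the matrix below are arbitrary illustrative choices; the assertion is exactly the identity τ′ = (aτ + b)/(cτ + d) derived above.

```python
# Numeric sketch: a change of basis by M = (a b; c d) in SL2(Z) changes
# tau = omega1/omega2 by the fractional linear rule. Sample values only.
omega1, omega2 = 1 + 2j, 1 + 0j   # Im(omega1/omega2) = 2 > 0
a, b, c, d = 2, 1, 1, 1           # det = 2*1 - 1*1 = 1, so M is in SL2(Z)

omega1p = a * omega1 + b * omega2
omega2p = c * omega1 + d * omega2

tau = omega1 / omega2
taup = omega1p / omega2p

assert abs(taup - (a * tau + b) / (c * tau + d)) < 1e-12
```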
Almost. We have to make a slight qualification here, because for any M ∈ SL_{2}(Z), it is clear that -M acts on S in exactly the same way as M. (This is not the case in the earlier example of SL_{2}(Z) acting on the set of lattice basis pairs.) In other words, the action of SL_{2}(Z) on S isn't quite "faithful" in the sense defined above.
However, a small change takes care of this problem. We define the group Γ = SL_{2}(Z)/{I,-I}. This is the quotient group SL_{2}(Z) modulo the 2-element subgroup consisting of the identity element I and its negative. What this means is that in Γ no distinction is made between a matrix and its negative. Another name sometimes used for this quotient group is PSL_{2}(Z). (PSL = "projective special linear".)
Γ is the group which is known as the modular group. (The fact that Γ is the Greek equivalent of G attests to the fundamental importance of this group.) For the most part we will continue to describe elements of Γ as if they were matrices in SL_{2}(Z) without being fussy about the difference. The technical advantage of using Γ instead of SL_{2}(Z) is that the group action is faithful, and there is a 1-to-1 correspondence between elements of the group and members of any particular equivalence class.
Just one more observation along these lines, but a crucial one. It is that there is also an action of Γ on the upper half of the complex plane: H = {z∈C | Im(z) > 0}. We've already seen what it is: for any M∈Γ and τ∈H, let M(τ) = (aτ + b)/(cτ + d).
Several things need to be true in order for this to be a valid action of Γ on H. We didn't give the details before in the case of the action on the set of lattices of the form L_{τ}, but, for instance, it needs to be checked that M(τ)∈H whenever τ∈H. However, since one can compute
Im(M(τ)) = det(M) Im(τ)/|cτ + d|^{2}

and det(M) = 1, this follows. We also need to have M_{2}(M_{1}(τ)) = (M_{2}M_{1})(τ), where M_{2}M_{1} is the matrix product, but that is also an easy calculation.
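Both claims are easy to spot-check numerically. The sample point and matrices below are arbitrary choices (the matrices happen to be the standard inversion and translation).

```python
# Numeric sketch of the two facts just quoted, at an arbitrary sample point.

def act(M, z):
    a, b, c, d = M
    return (a * z + b) / (c * z + d)

def mul(M, N):
    a, b, c, d = M
    e, f, g, h = N
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

tau = 0.3 + 1.7j
M1 = (0, -1, 1, 0)   # z -> -1/z
M2 = (1, 1, 0, 1)    # z -> z + 1

# Im(M(tau)) = det(M) * Im(tau) / |c*tau + d|^2, with det(M) = 1:
a, b, c, d = M1
assert abs(act(M1, tau).imag - tau.imag / abs(c * tau + d) ** 2) < 1e-12

# The matrix product matches composition of the transformations:
assert abs(act(M2, act(M1, tau)) - act(mul(M2, M1), tau)) < 1e-12
```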
Functions having the form just given for M(τ) are obviously very simple rational functions, but they turn up quite frequently and have been extensively studied. Hence they have been given various names, such as "linear fractional transformations", "fractional linear transformations", and "Möbius transformations". (A. F. Möbius (1790-1868), who came up with the "Möbius band", was one of the mathematicians who studied such functions.) Möbius transformations also play an important role in some forms of non-Euclidean geometry, for example.
What do the equivalence classes of H under the action of Γ look like? At this point we need to define the subset F⊆H by
F = {z∈H | -1/2 < Re(z) ≤ 1/2; |z| ≥ 1; Re(z) ≥ 0 if |z|=1}

F may be described as a semi-infinite rectangle of width 1 centered on the imaginary axis of H, except that the lower boundary is a portion of the circle |z|=1. F is called the "fundamental domain" for the action of Γ on H. It can be shown, though the proof is a little tedious, that F has the property that no two points of F are Γ-equivalent, yet every point of H is Γ-equivalent to exactly one point of F. In other words, every equivalence class of H under the action of Γ is represented by exactly one point of F. Or in still other words, there is a 1-to-1 correspondence between equivalence classes of H under the action of Γ and points of F. The points of F thus provide unique "labels" for the set of equivalence classes. Within each equivalence class, the points are uniquely labeled by elements of Γ.
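The standard reduction algorithm makes the "every point of H is Γ-equivalent to a point of F" claim constructive: repeatedly translate the real part into the unit-width strip and apply z → -1/z while |z| < 1. This sketch glosses over the boundary normalization (the Re(z) ≥ 0 condition when |z| = 1); the starting point is an arbitrary choice.

```python
# Sketch: reduce a point of H into the fundamental domain F using the
# translations T: z -> z + n and the inversion S: z -> -1/z.
from math import floor

def reduce_to_F(tau, max_steps=1000):
    for _ in range(max_steps):
        # Shift the real part into [-1/2, 1/2)
        tau -= floor(tau.real + 0.5)
        if abs(tau) < 1:
            tau = -1 / tau   # the inversion strictly increases Im(tau) here
        else:
            return tau
    return tau

tau = reduce_to_F(0.77 + 0.02j)   # a point far from F
assert -0.5 - 1e-9 <= tau.real <= 0.5 + 1e-9
assert abs(tau) >= 1 - 1e-9
assert tau.imag > 0
```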
There is a standard notation used for this set of equivalence classes: H/Γ. (This is often written Γ\H, since one usually writes the action of M∈Γ on z∈H as Mz, with M on the left.)
Now, H and F are 1-dimensional complex manifolds as subsets of C. There is a standard way to give a manifold structure to the equivalence classes H/Γ. When this is done, the 1-to-1 correspondence between H/Γ and F is a complex manifold isomorphism. Hence, up to isomorphism, the set of equivalence classes H/Γ is F.
But wait a minute. We already noted that there are 1-to-1 correspondences between homothety classes of lattices, complex tori, equivalence classes of elliptic curves, and equivalence classes of H under the action of Γ. To this list we can now add "points of F". In some sense, which we will explore further, F is a kind of master index to equivalence classes of each of these other things, in that each point of F is a unique label for a whole equivalence class. F is also, in a natural way, a complex manifold with a nice topological structure. Topological spaces of this kind are sometimes called "moduli spaces". They have great theoretical value since in some sense they parameterize a whole class of objects, such as elliptic curves.
We noted before that in every homothety class, we can find many lattices L_{τ} with a basis pair of the form (τ,1) with Im(τ) > 0. The properties of Γ and its fundamental domain F imply that the choice of τ is unique if we require τ∈F. This has interesting consequences.
Recall the differential equation

℘′^{2} = 4℘^{3} - g_{2}℘ - g_{3}

satisfied by ℘(z). The coefficients of this equation are given explicitly by
g_{2} = 60G_{4} = 60∑′_{ω∈L_{τ}} ω^{-4} = 60∑′_{m,n∈Z} (mτ + n)^{-4}

and

g_{3} = 140G_{6} = 140∑′_{ω∈L_{τ}} ω^{-6} = 140∑′_{m,n∈Z} (mτ + n)^{-6}

(The primes on the summation signs are there as a reminder that the lattice point 0 is omitted from the sums.)
Sums of this form are called Eisenstein series, after Ferdinand Eisenstein (1823-52), who was among the first to study their properties. Obviously, these series are functions of τ, and in fact they can be shown to be analytic functions of τ for τ∈H. In fact, that is true more generally, if k ≥ 2 and we set
G_{2k}(τ) = ∑′_{m,n∈Z} (mτ + n)^{-2k}

(We will see shortly why these are usually considered only for sums containing even powers.) These analytic functions have a very interesting property. Suppose L_{τ′} = λL_{τ} is a lattice homothetic to L_{τ}. Then τ′ = M(τ) for some M ∈ Γ, and
G_{2k}(M(τ)) = G_{2k}(τ′) = ∑′_{ω∈L_{τ′}} ω^{-2k} = ∑′_{ω∈L_{τ}} (λω)^{-2k} = λ^{-2k}∑′_{ω∈L_{τ}} ω^{-2k} = λ^{-2k}G_{2k}(τ)

However, from earlier we had λ = 1/(cτ + d), and so

G_{2k}(M(τ)) = (cτ + d)^{2k}G_{2k}(τ)

This property exhibits a type of symmetry of the functions G_{2k}(τ), and it is the defining characteristic of what are called "modular functions". It is hard to exaggerate the importance of such functions in the theory of elliptic curves.
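The transformation law can be sanity-checked numerically (this is evidence, not a proof). The sketch truncates the lattice sum for G_{4} at an arbitrary bound N and tests the law at the sample point τ = 2i under the inversion τ → -1/τ, allowing a loose tolerance for truncation error.

```python
# Numeric check of G_4(M(tau)) = (c*tau + d)^4 * G_4(tau) by truncated sums.
# Truncation bound N and the sample point are arbitrary choices.

def G4(tau, N=80):
    total = 0.0
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if (m, n) != (0, 0):
                total += (m * tau + n) ** (-4)
    return total

tau = 2j
a, b, c, d = 0, -1, 1, 0          # the inversion S: tau -> -1/tau
lhs = G4((a * tau + b) / (c * tau + d))
rhs = (c * tau + d) ** 4 * G4(tau)
assert abs(lhs - rhs) / abs(rhs) < 1e-2
```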
You should be aware that the relevant terminology isn't completely standard. If you look at other references, you will often find slightly different forms of these definitions. That shouldn't present any real problem as long as you are careful to understand what definition is used in any given context.
To begin with, suppose that f(z) is a function that is meromorphic on the upper half plane H and that the following is true for some integer k:
f(M(z)) = f((az+b)/(cz+d)) = (cz+d)^{k}f(z) for all M∈Γ and z∈H

Note that the special transformation T(z) = z + 1, corresponding to a matrix with a=b=d=1, c=0, is an element of Γ. It follows that for such a function, with any k∈Z, f(z+1) = f(T(z)) = f(z), so f(z) is periodic with period 1.
A basic fact about such functions is that they can be expressed in a "Fourier series" expansion like so:
f(z) = ∑_{n∈Z} a_{n}q^{n} for a_{n}∈C, and where q = e^{2πiz}

Such a series is often called a q-expansion.
Next, suppose additionally that the Fourier series has a special form, with only finitely many a_{n} ≠ 0 if n < 0, so that in fact
f(z) = ∑_{N≤n<∞} a_{n}q^{n} for some N∈Z

This condition is sometimes expressed by saying f(z) is "meromorphic at infinity". Given all that, f(z) is defined to be a modular function for Γ "of weight k".
Some variations on this that you might see require k=0 (a modular function of weight 0 in the present terminology) or use some subgroup of Γ instead of the full modular group (resulting in a less restrictive definition).
Next, a modular form is a modular function whose Fourier series has coefficients a_{n}=0 for all negative n. This condition can also be stated as the requirement that f(z) be holomorphic on H (and at infinity). Finally, f(z) is said to be a cusp form if it is a modular form for which a_{0} = 0.
As an example, consider the Eisenstein series, which we observed to be modular functions. It isn't a difficult fact that they have the q-expansions (when k is even):
G_{k}(z) = 2ζ(k)[1 - (2k/B_{k})∑_{1≤n<∞} σ_{k-1}(n)q^{n}]

In this expression, ζ(z) is the famous Riemann zeta function, and σ_{k}(n) is the arithmetic function defined by sums of the k^{th} powers of divisors of n (including 1 and n itself):

σ_{k}(n) = ∑_{d|n} d^{k}

The B_{k} are rational numbers known as the Bernoulli numbers, defined as coefficients of a power series:

x/(e^{x} - 1) = ∑_{0≤k<∞} B_{k}x^{k}/k!

This representation of G_{k}(z) is intriguing, since it is a product of a value of ζ(z) at a positive integer and a q-series whose coefficients are rational numbers. Whether or not this has any deeper meaning, it does show that the Eisenstein series are modular forms, not just modular functions, though they are not cusp forms.
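The pieces above fit together in a way that can be checked numerically. For k = 4, B_{4} = -1/30 (computed below from the standard recurrence, not assumed), so -2k/B_{k} = 240 and the q-expansion reads 2ζ(4)[1 + 240∑σ_{3}(n)qⁿ]. The sketch compares this against the truncated lattice sum at a sample point; truncation bounds are arbitrary.

```python
# Numeric sketch: lattice sum for G_4 versus its q-expansion.
from math import pi, comb
from cmath import exp
from fractions import Fraction

def bernoulli(kmax):
    """B_0..B_kmax via the recurrence sum_{j<=m-1} C(m+1,j) B_j = -(m+1) B_m."""
    B = [Fraction(1)]
    for m in range(1, kmax + 1):
        B.append(-sum(Fraction(comb(m + 1, j)) * B[j] for j in range(m)) / (m + 1))
    return B

def sigma(k, n):
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def G4_lattice(tau, N=80):
    return sum((m * tau + n) ** (-4)
               for m in range(-N, N + 1) for n in range(-N, N + 1)
               if (m, n) != (0, 0))

B4 = bernoulli(4)[4]
assert B4 == Fraction(-1, 30)
coeff = Fraction(-2 * 4) / B4          # -2k/B_k with k = 4
assert coeff == 240

tau = 1.5j
q = exp(2j * pi * tau)
zeta4 = pi ** 4 / 90
qseries = 2 * zeta4 * (1 + float(coeff) * sum(sigma(3, n) * q ** n
                                              for n in range(1, 30)))
assert abs(G4_lattice(tau) - qseries) / abs(qseries) < 1e-3
```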
Modular functions have been studied very extensively in their own right, apart from their relation to elliptic curves, since they have many applications in number theory and other parts of mathematics. We'll mention a few of the simpler facts about them.
As a first step in answering this, we observe that there is a simple sufficient condition for C/L ≅ C/L′. So suppose L and L′ = λL are homothetic. Let E(C) = φ(C/L) and E′(C) = φ(C/λL). Finally suppose the equation of E in Weierstrass form is
y^{2} = 4x^{3} - g_{2}x - g_{3}and the equation of E′ is
y′^{2} = 4x′^{3} - g′_{2}x′ - g′_{3}

Then when one works through all the algebra, it turns out that one equation comes from the other by a simple change of variables:

x′ = x/λ^{2} and y′ = y/λ^{3}

Furthermore, the coefficients of these equations are related as follows:

g_{2} = λ^{4}g′_{2} and g_{3} = λ^{6}g′_{3}

Now suppose ω_{1} and ω_{2} are a basis of L, τ = ω_{1}/ω_{2}, and Im(τ) > 0. Let L′ = λL have basis ω′_{1} and ω′_{2} with τ′ = ω′_{1}/ω′_{2}, and Im(τ′) > 0. To say that L′ = λL means we can write λω_{1} = aω′_{1} + bω′_{2} and λω_{2} = cω′_{1} + dω′_{2} for integers a, b, c, d.
If you know a little linear algebra, you will recognize that the coefficients a, b, c, d correspond to a 2×2 matrix

( a b )
( c d )

with integer entries.
Several simple but important observations follow immediately. First,
τ = ω_{1}/ω_{2} = λω_{1}/λω_{2} = (aω′_{1} + bω′_{2}) / (cω′_{1} + dω′_{2}) = (aτ′ + b) / (cτ′ + d)

This leads to several other observations. If we take λ = 1/ω_{2}, then L′ = λL is a lattice equivalent to L that has the basis (τ,1). Hence there is at least one lattice in every equivalence class that has a basis of the form (τ,1) with Im(τ) > 0. In fact, there are many possible bases of this form in each homothety class, as we shall see presently.
Unless you have already studied this subject thoroughly, things may seem pretty confusing by now. The primary interest here is elliptic curves, but we seem to have drifted off to talking about things like "lattices" and "complex tori", and to showing a few relationships between them. Of course, there is a method to the madness. Our goal now is to be able to talk not just about a single elliptic curve and its properties, but about the set of all elliptic curves. We will see that there is some structure and order to this set which describe the ways different curves are related to each other. The set of all lattices (over C) and the set of all complex tori have a similar structure to them, and the structure within is related.
So let's go back to the set of all lattices. For any particular lattice L, we know there is a basis consisting of two nonzero complex numbers ω_{1} and ω_{2}. We know that the ratio τ = ω_{1}/ω_{2} is not a real number, and the order of the two numbers can be chosen so that we may assume Im(τ) > 0. However, this choice of basis is hardly unique. Therefore, simply taking pairs of complex numbers which have a ratio whose imaginary part is positive doesn't give us a very helpful "label" to associate with the lattice.
SL_{2}(Z) is a very important group. It is (almost) a group called the "modular group", which will play an absolutely fundamental role in the theory of elliptic curves that we are leading up to. SL_{2}(Z) is a subgroup of SL_{2}(C), the group of all 2×2 matrices with entries in C and determinant 1. This group, in turn, is a subgroup of GL_{2}(C), the group of all 2×2 matrices with entries in C and nonzero determinant. (The condition on the determinants of the matrices in these groups is what ensures that the matrices have inverses.) These are just a few simple examples of "Lie groups" -- essentially matrix groups that have related algebraic and topological structures. This theory is interesting, extensive, and deep, but we won't go into it further at this point.
Can we put further conditions on basis pairs to substantially reduce the number of distinct basis pairs for a given lattice L? That could be done, but it turns out not to be the right question to ask. Remember, we earlier looked at homothety classes of lattices -- all lattices of the form λL for some lattice L and λ∈C*. We can describe the set of homothety classes in terms of group actions. Here the underlying set S is the set of lattices, and G = C* is the multiplicative group of nonzero elements of C. The set of homothety classes is just the set of equivalence classes of S under the action of C*. The elements of any particular equivalence class are labeled by elements of C* -- a huge set.
The set of homothety classes of lattices is the right set to look at for our purposes, for several reasons. In the first place, we get the same (up to isomorphism) complex torus C/L for any L in the same homothety class. That's a good thing because of the further correspondence of complex tori to elliptic curves. But the set is nice to work with for another reason as well. If S is now the set of homothety classes, then we can define a new and different action of G = SL_{2}(Z) on this S. But first we have to define an action of SL_{2}(Z) on the upper half plane H = {z∈C | Im(z) > 0}.
It is customary to define a slightly different form of this:
Δ = -16(27b^{2} + 4a^{3}) = -2^{4}(27b^{2} + 4a^{3})
We say that two elliptic curves are isomorphic if they have defining equations which are the same under some change of coordinate system. Since we can always change coordinates to put the equation in the normal form, we only need to work with that form. However, that form still isn't quite unique -- there are different equations in normal form that define isomorphic elliptic curves. In other words, there are coordinate transformations that change the coefficients but preserve the normal form. Such transformations thus lead to isomorphic curves which have different discriminants.
However, it turns out that the quantity
j = 12^{3} 4a^{3} / (4a^{3} + 27b^{2}) = -12^{3} (4a)^{3} / Δ

is invariant no matter what normal form of the equation is used. This is called the j-invariant of the elliptic curve. Two elliptic curves are isomorphic if and only if they have the same j-invariant. (The reason for the constant coefficient 1728 = 12^{3} is that j, being dependent on the lattice periods ω_{1} and ω_{2}, has an explicit formula in terms of them out of which 1728 falls out naturally.)
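To illustrate, here is a small Python sketch (the helper names are ours, not standard) checking that an admissible change of coordinates changes the discriminant but leaves j fixed:

```python
from fractions import Fraction

def disc(a, b):
    # discriminant of y^2 = x^3 + a*x + b
    return -16 * (4*a**3 + 27*b**2)

def j_invariant(a, b):
    # j = 12^3 * 4a^3 / (4a^3 + 27b^2)
    return Fraction(12**3 * 4 * a**3, 4*a**3 + 27*b**2)

# the substitution x = x'/u^2, y = y'/u^3 turns y^2 = x^3 + a*x + b
# into y'^2 = x'^3 + (u^4 a) x' + (u^6 b), an isomorphic curve
a, b, u = -1, 1, 3
a2, b2 = u**4 * a, u**6 * b

assert disc(a, b) != disc(a2, b2)                 # the discriminant changes
assert j_invariant(a, b) == j_invariant(a2, b2)   # but j does not
```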
Although the discriminant of a defining polynomial isn't an invariant of an elliptic curve, it is close. It happens that there is a related quantity called the minimal discriminant that is invariant. If we consider all equations in normal form (with integer coefficients) for the same elliptic curve, we can choose one whose discriminant is as small as possible in absolute value -- equivalently, one divisible by the least possible power of each prime. That discriminant is the minimal discriminant.
The most important fact about the minimal discriminant is that the primes which divide it are precisely the ones at which the curve has bad reduction. In other words, except for those primes, the reduced curve is an elliptic curve over F_{p}.
There is still another invariant of an elliptic curve E, called its conductor, and often denoted simply by N. The exact definition is rather technical, but basically the conductor is, like the minimal discriminant, a product of primes at which the curve has bad reduction. Recall that E has bad reduction when it has a singularity modulo p. The type of singularity determines the power of p that occurs in the conductor. If the singularity is a "node", corresponding to a double root of the polynomial, the curve is said to have "multiplicative reduction" and p occurs to the first power in the conductor. If the singularity is a "cusp", corresponding to a triple root, E is said to have "additive reduction", and p occurs in the conductor with a power of 2 or more.
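For p > 3 this classification can be read off the coefficients, at least for a model that is minimal at p. A hedged Python sketch (the function name and the minimality assumption are ours):

```python
def reduction_type(a, b, p):
    """Classify the reduction mod p of y^2 = x^3 + a*x + b.
    A sketch: assumes p > 3 and that this model is minimal at p."""
    disc = -16 * (4*a**3 + 27*b**2)
    if disc % p != 0:
        return "good"              # the curve is nonsingular mod p
    # disc ≡ 0 (mod p): the cubic has a repeated root mod p.  For p > 3
    # a triple root must be x ≡ 0, which forces a ≡ b ≡ 0 (mod p).
    if a % p == 0 and b % p == 0:
        return "additive"          # cusp (triple root)
    return "multiplicative"        # node (double root)

assert reduction_type(-1, 0, 5) == "good"            # y^2 = x^3 - x
assert reduction_type(-3, 7, 5) == "multiplicative"  # double root at x = 1 mod 5
assert reduction_type(0, 5, 5) == "additive"         # reduces to y^2 = x^3
```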
If the conductor of E is N, then it will turn out that N is the "level" of certain functions called modular forms (not yet defined) with which E is intimately connected.
We might add a few more words about the j-invariant. It is a complex number that characterizes elliptic curves up to isomorphism: two curves are isomorphic if and only if they have the same j-invariant. Not only that, but for every complex value, there actually exists an elliptic curve with a j-invariant equal to that value. So there is a 1-1 correspondence between (isomorphism classes of) elliptic curves and C.
Now, we have already seen that an elliptic curve as a complex torus is essentially determined by the period lattice of the ℘ function that parameterizes the curve. More precisely, two tori are isomorphic if and only if their corresponding lattices are "similar", that is, if and only if one is obtained from the other by a "homothety", i. e. multiplication by a non-zero complex number.
But there is another way to characterize similar lattices. Suppose we have two lattices. Each has a Z-basis of the form {ω_{1}, ω_{2}}. Applying a homothety, we can just consider the period ratios and assume the two bases are {1, τ}, {1, τ′}, with both τ and τ′ in the upper half plane H = {z | Im(z) > 0}. These define homothetic lattices if and only if τ′ = (aτ + b)/(cτ + d) for some matrix in SL_{2}(Z). The latter group is essentially what is known as the "modular group" Γ. So there is a 1-to-1 correspondence between homothety classes of lattices and points of H/Γ.
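This action of SL_{2}(Z) on H can be sketched in a few lines of Python (the function name is ours; S and T are the two standard generators):

```python
def act(m, tau):
    """Apply a matrix (a, b, c, d) in SL_2(Z) to a point tau of the
    upper half plane H by the rule tau -> (a*tau + b)/(c*tau + d)."""
    a, b, c, d = m
    assert a*d - b*c == 1          # the determinant condition for SL_2(Z)
    return (a*tau + b) / (c*tau + d)

tau = 0.3 + 1.7j
S = (0, -1, 1, 0)                  # tau -> -1/tau
T = (1, 1, 0, 1)                   # tau -> tau + 1

# Im((a*tau + b)/(c*tau + d)) = Im(tau)/|c*tau + d|^2, so H is preserved
assert act(S, tau).imag > 0 and act(T, tau).imag > 0
# S^2 = -I, which acts trivially, so applying S twice recovers tau
assert abs(act(S, act(S, tau)) - tau) < 1e-12
```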
In summary, there are 1:1 correspondences between each of the following:

- isomorphism classes of elliptic curves over C
- isomorphism classes of complex tori C/L
- homothety classes of lattices L
- points of H/Γ
Returning to the j-invariant, it gives the 1:1 map between isomorphism classes of elliptic curves and C. But by the above it can also be viewed as a 1:1 map j: H/Γ → C. j is therefore an example of what is called a modular function. We'll see a lot more of modular functions and the modular group. These facts, which have been known for a long time, are the first hints of the deep relationship between elliptic curves and modular functions.
In spite (or perhaps because) of his philosophical idiosyncrasies, Kronecker made fundamental contributions to number theory, as we shall see. In the part of the theory we're going to discuss, Kronecker was fascinated by a rather deep fact about algebraic numbers. In technical terms, this is the fact that every "abelian" algebraic extension of the rational numbers Q is contained in an extension of Q generated by "roots of unity", a so-called "cyclotomic extension".
If f(x) = a_{n}x^{n} + ... + a_{1}x + a_{0}, where the coefficients a_{i} are in Q and a_{n}≠0, then we say that n is the degree of the polynomial. If f(x) has degree n and a_{n}=1, the polynomial is said to be "monic".
Algebraic integers are an important special case of algebraic numbers. By definition, an algebraic integer is a root of a polynomial equation f(x)=0 where all coefficients of f(x) are integers and f(x) is monic. This includes the ordinary integers, in the case that f(x) has degree 1. Algebraic integers have turned out to be the most natural generalization of the ordinary integers Z. Many Diophantine problems, which require solutions to be found in Z (or possibly Q), are best analyzed in terms of algebraic numbers. This was notably the case with Fermat's equation x^{n} + y^{n} = z^{n}. Indeed, such equations could not be dealt with rigorously until the properties of general algebraic numbers were well understood.
Q and C are examples of mathematical systems called fields, as we mentioned early on. Fields have two laws of composition which correspond to ordinary addition and multiplication. A field has a group structure with respect to both laws of composition, except that 0 (alone) lacks a multiplicative inverse. A ring is very similar to a field, except that not all ring elements need have multiplicative inverses. The integers Z make up a very typical example of a ring (a commutative ring, since the operation of multiplication is commutative in Z).
Algebraic numbers can always be regarded as being elements of some field F intermediate between Q and C, i. e. where Q⊆F⊆C. Any time two fields F and K are related such that F⊆K, one says that F is a subfield of K, and K is an extension field of F. Any algebraic number α is contained in some field F that is an extension of Q. There is always a smallest subfield F⊆C that contains α, in the sense that F is contained in any other field that also contains α. (C necessarily contains such a field, since the "fundamental theorem of algebra" says that C contains the roots of all f(x)∈Q[x].) This smallest field is written Q(α). It consists of all quotients f(α)/g(α), where f(x) and g(x) are polynomials in Q[x] and g(α)≠0. Q(α) is said to be obtained by "adjoining" α to Q.
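To make Q(α) concrete, here is a minimal Python sketch (the class is purely illustrative) of the field Q(√2), whose elements are a + b√2 with a, b ∈ Q; the inverse formula works because a^{2} - 2b^{2} ≠ 0 unless a = b = 0, √2 being irrational:

```python
from fractions import Fraction

class QSqrt2:
    """An element a + b*sqrt(2) of the field Q(sqrt(2)) -- illustrating
    that adjoining an algebraic number to Q yields a field."""
    def __init__(self, a, b):
        self.a, self.b = Fraction(a), Fraction(b)
    def __mul__(self, other):
        # (a + b*sqrt(2))(c + d*sqrt(2)) = (ac + 2bd) + (ad + bc)*sqrt(2)
        return QSqrt2(self.a*other.a + 2*self.b*other.b,
                      self.a*other.b + self.b*other.a)
    def inverse(self):
        # 1/(a + b*sqrt(2)) = (a - b*sqrt(2))/(a^2 - 2b^2)
        n = self.a*self.a - 2*self.b*self.b
        return QSqrt2(self.a/n, -self.b/n)
    def __eq__(self, other):
        return self.a == other.a and self.b == other.b

x = QSqrt2(3, 5)
assert x * x.inverse() == QSqrt2(1, 0)   # every nonzero element is invertible
```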
The set of all algebraic integers that are contained in a particular field F also forms a ring, called the ring of integers of F. Such rings are natural generalizations of Z, and make up a large part of the subject matter of algebraic number theory.
For a long time in the history of algebra, mathematicians hoped to be able to express the solutions of polynomial equations f(x)=0 by means of "radicals", that is, using expressions involving only the usual arithmetic operations plus extraction of roots (square roots, cube roots, etc.). Finally in 1824, Niels Henrik Abel (1802-1829) showed that not all equations involving polynomials of fifth degree or higher could be solved by radicals. This negative result was unfortunate. It had been found that all quadratic, cubic, and quartic equations could be solved by radicals. Although the general solutions of such equations could be unwieldy (especially for quartics), an inability to express some solutions using radicals at all was even more inconvenient for practical and theoretical computation alike. The abstract theory of algebraic number fields, however, eventually more than made up for the lack of explicit solvability by radicals -- at least for theoretical purposes.
For both practical and theoretical reasons, mathematicians wanted convenient ways to express solutions of polynomial equations. One alternative which sometimes made up for the lack of expression of solutions by radicals was the use of "roots of unity". These were especially useful, for example, in the case of Fermat's equation. A root of unity, conventionally denoted by the symbol ζ (Greek zeta), is defined to be a solution of an equation having the simple form x^{n} - 1 = 0, for some positive integer n. 1 is always a solution, of course. -1 is a solution of x^{2} - 1 = 0. i = √-1 is a solution of x^{4} - 1 = 0. The fundamental theorem of algebra says that x^{n} - 1 = 0 always has n roots in C. In general, the roots of a polynomial may be repeated and therefore not all distinct (as with the equation (x-1)^{2} = 0, for example), but the roots of x^{n} - 1 = 0 are known to be distinct, and can in fact be expressed in the form ζ = e^{2πik/n}, for integers k, 0≤k<n. ζ is said to be a primitive n^{th} root of unity if n is the smallest positive integer such that ζ^{n} = 1. -1, for instance, is a fourth root of unity, but not a primitive one; i, on the other hand, is a primitive fourth root of unity.
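These facts are easy to check numerically. A small Python sketch using the formula ζ = e^{2πik/n}:

```python
import cmath, math

def nth_roots_of_unity(n):
    # the n distinct solutions of x^n - 1 = 0 in C
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]

roots = nth_roots_of_unity(4)          # 1, i, -1, -i
assert all(abs(z**4 - 1) < 1e-9 for z in roots)

i, minus_one = roots[1], roots[2]
# -1 is a fourth root of unity but not primitive: (-1)^2 is already 1
assert abs(minus_one**2 - 1) < 1e-9
# i is primitive: no smaller positive power gives 1
assert all(abs(i**m - 1) > 1e-9 for m in (1, 2, 3))
```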
What does it mean to raise the number e to any power, especially one involving arbitrary complex numbers? The answer is that it is done rigorously with an infinite series. In fact, the series
e^{z} = ∑_{0≤n<∞} z^{n}/n!

can be shown to converge for any z∈C.
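A Python sketch of this definition, comparing a partial sum of the series with the built-in exponential (the helper name and the cutoff of 60 terms are our choices):

```python
import cmath, math

def exp_series(z, terms=60):
    """Partial sum of e^z = sum over n >= 0 of z^n / n!"""
    total, term = 0 + 0j, 1 + 0j   # term starts at z^0/0! = 1
    for n in range(terms):
        total += term
        term *= z / (n + 1)        # z^n/n! -> z^(n+1)/(n+1)!
    return total

z = 2j * math.pi / 5               # exponent for a primitive 5th root of unity
assert abs(exp_series(z) - cmath.exp(z)) < 1e-12
assert abs(exp_series(z)**5 - 1) < 1e-9
```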
Suppose α is an algebraic number (not necessarily an algebraic integer). The question to be answered -- whether α is expressible in terms of n^{th} roots of unity for some n -- is equivalent to the question of whether α is a member of some field of the form Q(ζ), where ζ is a primitive n^{th} root of unity. Such a field is called a "cyclotomic field". However, this reformulation is somewhat tautological and doesn't really help to answer the basic question.
What Kronecker realized, and partially proved, was that an equivalent condition for α to be expressible in terms of n^{th} roots of unity was this: the "splitting field" F of the "minimal polynomial" f(x) of α should be an "abelian extension" of Q.
What do those new terms mean? The "minimal polynomial" of α is the monic polynomial f(x)∈Q[x] of least degree such that f(α) = 0. The "splitting field" of f(x) is the smallest extension F⊇Q that contains all the roots of f(x) (including α itself). Finally, F is an "abelian extension" of Q if its "Galois group" G(F/Q) is an abelian group.
What is a Galois group? That's a longer story -- Galois theory, which is fascinating and not really all that difficult, but it takes a fair amount of explanation anyhow. We offer a fuller explanation elsewhere. But essentially a Galois group is a group of permutations of the roots of a polynomial f(x) that induce an automorphism of the splitting field of f(x). Although f(x) has as many roots as its degree, not all permutations of those roots necessarily induce an automorphism of the splitting field. The Galois group G(F/Q) thus encodes information about the structure of intermediate fields E such that Q⊆E⊆F.
What the Kronecker-Weber theorem does for us is give a more computable way of determining whether an algebraic number α can be expressed in terms of n^{th} roots of unity. The procedure is roughly: find the minimal polynomial f(x) for α and determine whether the Galois group of the splitting field of f(x) is abelian.
The Kronecker-Weber theorem can also be stated simply as a theorem about fields: Every abelian extension of Q is contained in a cyclotomic field. This form is more convenient for exploring generalizations. In particular, if you take Q and adjoin all n^{th} roots of unity for each n, the resulting field will be an abelian extension of Q which is the maximal abelian extension of Q, because it contains all other abelian extensions.
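A classic illustration: Q(√5) is an abelian extension of Q (its Galois group has order 2), so the Kronecker-Weber theorem says it must lie in some cyclotomic field. Indeed it lies in Q(ζ) for ζ a primitive 5th root of unity, via the Gauss-sum identity √5 = 1 + 2(ζ + ζ^{4}), which we can check numerically in Python:

```python
import cmath, math

zeta = cmath.exp(2j * math.pi / 5)     # a primitive 5th root of unity

# sqrt(5) lies in Q(zeta): sqrt(5) = 1 + 2*(zeta + zeta^4)
val = 1 + 2 * (zeta + zeta**4)
assert abs(val - math.sqrt(5)) < 1e-9  # the value is sqrt(5)
assert abs(val.imag) < 1e-12           # and it is real, as it must be
```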
The generalization involves considering abelian extensions of a field other than Q. Specifically, are there other fields F for which one can say that all abelian extensions of F must be contained in some relatively simple and easily described type of field? The answer embodied in the Jugendtraum is yes -- if the field F is an imaginary quadratic extension of Q -- in which case all abelian extensions of F will be contained in an extension of Q generated by certain values of elliptic functions.
This is where we return to the theory of elliptic curves. We shall need to consider a class of elliptic curves which have a very special property called "complex multiplication". Suppose in this section that E is an elliptic curve that is isomorphic to the complex torus C/L, where L has basis ω_{1} and ω_{2} and τ = ω_{1}/ω_{2}.
For any n∈Z, note that nL⊆L. Unless n=±1, clearly nL≠L. In fact, for any real number t≠±1, tL≠L. Our previous discussion of homothety raises the question of whether there are any lattices L that are homothetic to themselves, with λL=L for some nonreal λ∈C. More generally, under what conditions is λL⊆L if λ is not an integer?
This can be answered with a simple computation. Assume λ∉Z. If λL⊆L, then there are integers a, b, c, d such that
λω_{1} = aω_{1} + bω_{2} and λω_{2} = cω_{1} + dω_{2}

and so
λτ = aτ + b and λ = cτ + d

Multiplying the second equation by τ and subtracting the first equation gives
cτ^{2} + (d-a)τ - b = 0

We know c≠0, for otherwise λ=d, contrary to the assumption λ∉Z. And so τ must be an element of the field Q(√D), where D = (d-a)^{2} + 4bc, by the quadratic formula. τ can't be real, so we must have D<0, and τ must be a quadratic imaginary number.
The equation λ = cτ + d says that λ is also a quadratic imaginary number in Q(√D), and in fact similar manipulations show that λ satisfies
λ^{2} - (a+d)λ + ad - bc = 0

So λ is actually an algebraic integer in Q(√D).
Whenever we have the situation λL⊆L with λ∉Z, we say that the elliptic curve corresponding to C/L has "complex multiplication". In such a case, the isomorphism φ: E → C/L can be used to define an analytic map f: E → E of the curve to itself by the rule f(z) = φ^{-1}(λφ(z)). Such a map is called an "endomorphism" of E. Endomorphism maps are heavily used in the theory of elliptic curves and generalizations.
In this terminology, we've shown that if E has complex multiplication by λ∉Z, then both λ and the period ratio τ lie in a quadratic imaginary field Q(√D), and λ is actually an integer of the field.
What about a converse? Suppose Q(√D) is a quadratic imaginary field. Then for any τ∈Q(√D) but τ∉Q, aτ^{2} + bτ + c = 0 with integers a, b, c, and a≠0. It follows that (aτ)τ = -bτ - c. Let λ = aτ. Then for any lattice L with basis ω_{1} and ω_{2} and τ = ω_{1}/ω_{2},
λω_{1} = -bω_{1} - cω_{2} and λω_{2} = aω_{1}

and so λL⊆L. Hence if E corresponds to C/L, it has complex multiplication by λ.
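A concrete check in Python for the Gaussian lattice L = Z + Zi, where τ = i satisfies τ^{2} + 1 = 0 and λ = i multiplies L into itself:

```python
# lattice basis for L = Z + Z*i, with tau = omega1/omega2 = i
omega1, omega2 = 1j, 1 + 0j
lam = 1j                     # candidate complex multiplication

# lam*omega1 = a*omega1 + b*omega2 and lam*omega2 = c*omega1 + d*omega2:
# here i*i = -1 = 0*omega1 + (-1)*omega2, and i*1 = 1*omega1 + 0*omega2
a, b, c, d = 0, -1, 1, 0
assert lam * omega1 == a * omega1 + b * omega2
assert lam * omega2 == c * omega1 + d * omega2

# lam satisfies the integer equation lam^2 - (a+d)*lam + (a*d - b*c) = 0,
# i.e. lam^2 + 1 = 0, so it is an algebraic integer in Q(sqrt(-1))
assert lam**2 - (a + d) * lam + (a*d - b*c) == 0
```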
The net result is that imaginary quadratic fields and elliptic curves have a "special" kind of relationship. Kronecker noticed this, and believed that he could generalize his earlier theorem about abelian extensions of Q to (a conjecture about) abelian extensions of imaginary quadratic fields. In this generalization he used special values of elliptic functions instead of special values of the exponential function. Kronecker made this conjecture in 1860, but he couldn't fully prove it. Weber completed the proof in 1891.
For quadratic curves (conics) there is a theorem that goes back to Legendre for effectively deciding whether a rational conic has a rational point. This has been given a much more elegant form by Hasse which states that the question can be answered by (relatively easy) tests for solutions in R and modulo p for all primes p -- the "Hasse principle". There are counterexamples to show that this doesn't work for cubic curves.
Mordell's theorem says that the group of rational points E(Q) of an elliptic curve over Q is finitely generated. (Though the group might be trivial, consisting only of the identity, a single "point at infinity".) The group therefore is the product of a finite group E(Q)_{t} (the torsion subgroup, of points having finite order) and 0 or more infinite cyclic groups (copies of Z). The rank is defined as the number of copies of Z, a finite number.
The Nagell-Lutz theorem shows that there is an effective algorithm to compute the rational points of finite order in E(Q), so it can be effectively determined whether such rational points exist. But there may be no nontrivial points of finite order, and so the hard problem, for which a solution is not known, is how to determine effectively whether a curve has nonzero rank.
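The Nagell-Lutz conditions can be sketched in Python (the function name is ours; this is a brute-force illustration, not an optimized or fully rigorous implementation):

```python
def torsion_candidates(a, b, xbound=1000):
    """Integer points (x, y) on y^2 = x^3 + a*x + b satisfying the
    Nagell-Lutz conditions: y = 0, or y^2 divides the discriminant.
    A brute-force sketch; the points returned are only candidates,
    whose orders must still be checked to confirm they are torsion."""
    disc = -16 * (4*a**3 + 27*b**2)
    ys = {0}
    y = 1
    while y * y <= abs(disc):
        if disc % (y * y) == 0:
            ys.update((y, -y))
        y += 1
    pts = []
    for x in range(-xbound, xbound + 1):
        rhs = x**3 + a*x + b
        pts.extend((x, y) for y in ys if y * y == rhs)
    return sorted(pts)

# y^2 = x^3 + 1: candidates (-1, 0), (0, ±1), (2, ±3); together with the
# point at infinity these turn out to form the torsion subgroup Z/6Z
assert torsion_candidates(0, 1) == [(-1, 0), (0, -1), (0, 1), (2, -3), (2, 3)]
```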
Not much is known theoretically about the rank either. The conjecture of Birch and Swinnerton-Dyer is the main idea of interest in that direction. Very little is known about what values are possible for the rank of an elliptic curve; the ranks that have been computed in special cases are all quite small. It is not known which integers can occur as the rank of an elliptic curve, or even whether the rank can be arbitrarily large.
As of 2000, elliptic curves had been discovered whose rank must be at least 24, but none whose rank is provably any larger.
The conjecture has been partially verified. Namely, it has been proven that L(E,1) ≠ 0 implies E has rank 0, so E(Q) is finite. It has also been shown that if L(E,s) has a zero of order 1 at s=1, then the rank of E is 1. From these facts it follows that if E(Q) is finite, then L(E,s) can't have a zero of order 1 at s=1, but for all we know it could have a zero of higher order. Also, if E has rank 1, then L(E,1) = 0, but the order of the zero could be higher than 1.
Although elliptic curves are known that must have rank at least 24, no elliptic curve has yet been found whose L-function has a zero of order higher than 3 at s=1. Curves for which L(E,s) is proven to have zeros of order 1, 2, or 3 are known, and (needless to say?) in these cases the actual rank of the curve is consistent with the conjecture.
The sharpest form of the conjecture actually gives a formula for the value of lim_{s→1} L(E,s)/(s-1)^{r} if the rank of E(Q) is r (i. e., the coefficient of the term involving (s-1)^{r} in the Taylor series expansion of L(E,s)). All the terms in this formula are fairly well understood, except for one, which is thought to be the order of a finite group, the Tate-Shafarevich group Ш(E/Q).
The precise definition of Ш(E/Q) is rather technical. But it may be considered to be a kind of measure of the degree to which the Hasse principle fails to be true. This principle says that the nature of the group of rational points E(Q) should essentially be determined by the nature of the points E(F_{p}) of the curve over the finite fields F_{p} for all primes p.
A few facts are known about Ш(E/Q). For example, it is known to be finite if L(E,s) has a zero of order at most 1 at s=1. No examples are known where Ш(E/Q) is infinite, but neither has it been proven finite in any case where L(E,s) has a zero of order more than 1 at s=1.
Finiteness of Ш(E/Q) -- the Shafarevich-Tate conjecture -- is a necessary but not sufficient condition for the strong form of the Birch and Swinnerton-Dyer conjecture. It is, however, sufficient for a weaker conjecture, known as the parity conjecture. There is a number w_{E} which occurs in the functional equation of L(E,s) and has the value ±1. It follows from that equation that the order of the zero of L(E,s) at s=1 is even or odd according as w_{E} is +1 or -1. The parity conjecture says that the rank of E is even or odd in the same way.
Copyright © 2002 by Charles Daney, All Rights Reserved