A Primer on Modular Forms for Mathematicians

>There are five elementary arithmetical operations: addition, subtraction, multiplication, division, and… modular forms.
>$\qquad$-[Martin Eichler](https://en.wikipedia.org/wiki/Martin_Eichler), apocryphal

## Significance

The notion of modularity is fundamental in modern mathematics, and especially in number theory. For example, it is no accident that the proof of the celebrated Fermat's Last Theorem runs through the Shimura-Taniyama conjecture, known in modern parlance as the [Modularity Theorem](https://mathworld.wolfram.com/Taniyama-ShimuraConjecture.html): [Andrew Wiles](https://www.britannica.com/biography/Andrew-Wiles) proved Fermat's Last Theorem by showing that the $L$-functions of semistable elliptic curves correspond to modular forms. In another striking example, modularity was also key to the recent [Fields Medal-winning proof determining optimal sphere packing in 8 and 24 dimensions](https://www.ams.org/publications/journals/notices/201702/rnoti-p102.pdf) by [Maryna Viazovska](https://people.epfl.ch/maryna.viazovska?lang=en). Even the celebrated [Riemann Hypothesis](https://www.claymath.org/millennium/riemann-hypothesis/), perhaps the best-known open problem in mathematics, can be read as a statement about modular forms. This is because the completed Riemann $\zeta$ function is, up to elementary factors, the Mellin transform of a half-integer weight modular form (the Jacobi theta function, of weight $1/2$).

Ultimately, it seems that modular forms know more than they should. Their coefficients encode the number of solutions to elliptic curve equations modulo primes, determine the bulk of what we know about integer partitions, and count representations of integers by quadratic forms, among many other things. Modular forms provide geometric invariants for quadratic number fields, parameterize isomorphism classes of elliptic curves, and have a great deal to say about representation theory.
What's more, since modular forms generally live in explicit finite-dimensional spaces, finding a modular connection to a problem you're working on gives you access to powerful computational techniques.

## Definition

Recall that the only conformal self-maps of the extended complex plane $\mathbb C\cup\{\infty\}$ are the Möbius (fractional linear) transformations $z\mapsto \frac{az+b}{cz+d}$, where $\left(\begin{smallmatrix}a&b\\c&d\end{smallmatrix}\right)\in\operatorname{GL}_2(\mathbb{C})$, the group of invertible $2\times 2$ complex matrices. Note that if we restrict to $a,b,c,d\in\mathbb{R}$ with $ad-bc>0$, then $\operatorname{sign}\Im \left(\frac{az+b}{cz+d}\right)=\operatorname{sign}\Im (z).$ Hence $\left(\begin{smallmatrix}a&b\\c&d\end{smallmatrix}\right)z:= \frac{az+b}{cz+d}$ defines an orientation-preserving action of $\operatorname{GL}_2^+(\mathbb R)$ on the complex upper half-plane $\mathfrak{H}:=\{z:\Im(z)>0\}$. Moreover, the following powerful theorem ([Stein & Shakarchi, Theorem 8.2.4](https://mathscinet.ams.org/mathscinet-getitem?mr=1976398)) demonstrates the fundamental importance of these fractional linear transformations for $\mathfrak{H}$:

**Theorem** If $f:\mathfrak H\to\mathfrak H$ is a conformal bijection then there exists a $\gamma\in \operatorname{Sl}_2(\mathbb{R})$ such that $f(z) = \gamma z$ for all $z$.

Hence the automorphisms of $\mathfrak H$ are exactly the fractional linear transformations coming from $\operatorname{Sl}_2(\mathbb{R})$.

But $\operatorname{Sl}_2(\mathbb R)$ is too big a group to consider. For starters, the orbit of a point $z\in\mathfrak H$ under $\operatorname{Sl}_2(\mathbb{R})$ has limit points in $\mathfrak H$, so an analytic function which is invariant (in an appropriate sense) under this action will be constant by the identity theorem. Hence, what we really want is a group with a discrete action on $\mathfrak H$. A natural group to consider is $\operatorname{Sl}_2(\mathbb{Z})$.
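
Returning to the sign claim for $\Im$ made earlier in this section: it can be checked with a one-line computation. The following is a sketch, assuming $a,b,c,d\in\mathbb R$ and $cz+d\neq 0$, obtained by multiplying numerator and denominator by $c\bar z+d$:

$$\Im\left(\frac{az+b}{cz+d}\right)=\Im\left(\frac{(az+b)(c\bar z+d)}{|cz+d|^2}\right)=\frac{(ad-bc)\,\Im(z)}{|cz+d|^2}.$$

Only the terms $ad\,z+bc\,\bar z$ of the expanded numerator have an imaginary part, which gives the factor $ad-bc$. In particular, when $ad-bc>0$ the upper half-plane is mapped to itself.
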
Now, $\operatorname{Sl}_2(\mathbb{Z})$ has several nice properties, one of which is that it's generated by only $2$ matrices, $T = \left(\begin{smallmatrix}1&1\\0&1\end{smallmatrix}\right)$ and $S = \left(\begin{smallmatrix}0&1\\-1&0\end{smallmatrix}\right)$. It is also discrete in the subspace topology of $\mathbb{R}^4$, and its orbits in $\mathfrak H$ are discrete, with no limit points in $\mathfrak H$. This leads us to our definition of modular functions.

**Definition** A *modular function*[^2] for a subgroup $\Gamma\leq \operatorname{Sl}_2(\mathbb{Z})$ is a meromorphic function $f:\mathfrak H\to \mathbb C\cup\{\infty\}$ such that $f(\gamma z) = f(z)$ for all $\gamma\in\Gamma$ and all $z\in \mathfrak H$. We write $M_0(\Gamma)$ for the vector space of modular functions with respect to $\Gamma$.

Importantly, there is a fundamental domain for the action of $\operatorname{Sl}_2(\mathbb{Z})$ on $\mathfrak H$: a region containing a representative of every orbit, so that a modular function is completely determined by its values on this region. A standard representation of this fundamental domain is $\{z\in\mathfrak{H}:\Re z\in[-1/2,1/2],|z|\geq 1\}$, but applying any $\gamma\in \operatorname{Sl}_2(\mathbb{Z})$ to this standard representation gives an equivalent copy, which is again a hyperbolic triangle, as in the plot below, produced with [Mathematica](https://resources.wolframcloud.com/FunctionRepository/resources/ModularTessellation/).

![[image.png|350]]

Each copy of the fundamental domain is a representation of the *modular curve* $Y_0(1):=\operatorname{Sl}_2(\mathbb{Z})\backslash\mathfrak H$, the quotient space of $\mathfrak H$ by the left action of $\operatorname{Sl}_2(\mathbb Z)$. This is very nearly a manifold. If we compactify it (in the standard representation, by adding the point at $i\infty$) then we get the compactified modular curve $X_0(1)$.
There are some issues here, as the orbit of $i\infty$ under $\operatorname{Sl}_2(\mathbb{Z})$ is $\mathbb Q\cup\{i\infty\}$, but we will sweep them under the rug for the purpose of our demonstration, except to say that these points are related to what we call *cusps* of the modular curve.

We can now introduce an interesting mathematical fact. If $f$ is a sufficiently smooth function (for now, holomorphic on $\mathfrak H$ will suffice) such that $f\left(\left(\begin{smallmatrix}a&b\\ c&d\end{smallmatrix}\right)z\right) = (cz+d)^2f(z)$ for all $\left(\begin{smallmatrix}a&b\\ c&d\end{smallmatrix}\right)$ in $\operatorname{Sl}_2(\mathbb Z)$, then $f(z)\,dz$ is invariant under the action of $\operatorname{Sl}_2(\mathbb{Z})$, and so descends to a differential form on $Y_0(1)$. This motivates the following definition.

**Definition** $f:\mathfrak H\to\mathbb C$ is a *modular form* of weight $k\in 2\mathbb{Z}$ for the subgroup $\Gamma\leq \operatorname{Sl}_2(\mathbb{Z})$ if:
1. $f\left(\left(\begin{smallmatrix}a&b\\ c&d\end{smallmatrix}\right)z\right) = (cz+d)^kf(z)$ for all $\left(\begin{smallmatrix}a&b\\ c&d\end{smallmatrix}\right)\in\Gamma$,
2. $f$ is holomorphic on $\mathfrak H$, and
3. $f$ remains bounded as $z$ approaches the cusps of $\Gamma$, that is, the equivalence classes of $\mathbb Q\cup\{i\infty\}$ under $\Gamma$.

When $f$ is a modular form of weight $k$ for $\Gamma$ we write $f\in M_{k}(\Gamma)$. If instead $f$ vanishes towards the cusps, then $f$ is a *cusp form* and we write $f\in S_{k}(\Gamma)$. Our previous computation reveals that if $f$ is a modular form of weight $k$, then $f(z)\,(dz)^{k/2}$ is invariant under $\Gamma$, that is, a differential $k/2$-form.
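
The invariance of $f(z)\,dz$ used above can be verified directly. As a sketch, write $\gamma=\left(\begin{smallmatrix}a&b\\ c&d\end{smallmatrix}\right)$ with $ad-bc=1$ and differentiate $\gamma z=\frac{az+b}{cz+d}$:

$$d(\gamma z)=\frac{a(cz+d)-c(az+b)}{(cz+d)^2}\,dz=\frac{ad-bc}{(cz+d)^2}\,dz=(cz+d)^{-2}\,dz.$$

So if $f(\gamma z)=(cz+d)^{k}f(z)$ with $k$ even, then

$$f(\gamma z)\,\bigl(d(\gamma z)\bigr)^{k/2}=(cz+d)^{k}f(z)\,(cz+d)^{-k}\,(dz)^{k/2}=f(z)\,(dz)^{k/2}.$$

The case $k=2$ is the statement that $f(z)\,dz$ is invariant.
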
## Example

Let us consider one of the classic examples of a modular form: the Eisenstein series, defined for $k\in 2\mathbb{Z}$ by $E_k(z) := \sum_{(m,n)\in\mathbb Z^2\setminus\{(0,0)\}} \left(\frac{1}{mz+n}\right)^k.$ A couple of elementary computations show that, for $T$ and $S$ the generators of $\operatorname{Sl}_2(\mathbb{Z})$ above, we have (assuming absolute convergence and interchangeability of sums) $E_k(Tz) = E_k(z+1) = E_k(z)$ and $E_k(S z) = E_k\left(\frac{-1}{z}\right)= z^k E_k(z).$ Hence $E_k\in M_k(\operatorname{Sl}_2(\mathbb Z))$, so long as we can prove the analytic and cusp conditions.

This is actually somewhat interesting. First, note that the above computations do not work for $k\leq 2$,[^1] but the double sums involved do converge absolutely for $k\geq 4$, which gives our analytic condition on $\mathfrak H$. Now, inspecting the transformation for $E_k(Tz)$, we see that $E_k$ is a $1$-periodic function and thus has a Fourier expansion. Traditionally, we write $q = e^{2\pi i z}.$ Hence $E_k$ has a $q$-series expansion. A computation (involving representations of the Bernoulli numbers $B_k$) reveals that $E_k(z) = 2\zeta(k)\left(1-\frac{2k}{B_k}\sum_{n>0}\sigma_{k-1}(n)q^n\right),$ where $\sigma_{j}(n):=\sum_{d\mid n}d^j$ is the $j$-th divisor sum. It is common to divide by $2\zeta(k)$ so that the constant term is $1$. Notice that, in this form, it is easy to see that $\lim_{y\to \infty}E_{k}(x+iy) = 2\zeta(k)$ and, since the only cusp of $\operatorname{Sl}_2(\mathbb{Z})$ is the single equivalence class $\mathbb{Q}\cup\{i\infty\}$, this shows that $E_k$ satisfies the cusp condition and $E_k\in M_k(\operatorname{Sl}_2(\mathbb{Z}))$.

**Remark** This computation demonstrates a few important features of the theory of modular forms.
1. The element $T$ plays a fundamental role. Most of the theory of modular forms takes place on *congruence subgroups,* those containing the kernel $\Gamma(N)$ of reduction modulo $N$ for some $N$. Every such subgroup contains a power of $T$, which guarantees that its modular forms are periodic and so have Fourier coefficients we can study.
2. The cusp condition on congruence subgroups is closely connected to the Fourier coefficients of the modular form, and you can tell if a function satisfies the cusp condition by inspection if you know its Fourier expansion at each cusp.
3. Finally, the Fourier coefficients of modular forms contain rich arithmetic information, as we see from the fact that the Fourier coefficients of Eisenstein series are divisor sums.

[^1]: The case $k=2$ is very important and interesting. The series just fails to converge absolutely, and the resulting "error to modularity" of $E_2$ exactly matches the error to modularity of the derivatives of modular forms. Hence $E_2$ is important for the theory of differential operators on spaces of modular forms, and is dubbed a *quasimodular form.*

[^2]: For an application of modular functions to industry, see [[A Primer on Modular Forms for Non-Mathematicians#Application of Modular Functions]]
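
Returning to the $q$-expansion in the Example: the coefficients of the normalized series $1-\frac{2k}{B_k}\sum_{n>0}\sigma_{k-1}(n)q^n$ are easy to compute exactly. The following is a minimal sketch in plain Python, not an optimized implementation. The helper names `bernoulli`, `sigma`, `eisenstein_coeffs` and `multiply` are illustrative choices, not from any library. It also checks the identity $E_4^2=E_8$ for the normalized series to a few terms. That identity relies on the standard fact, not proved in this note, that $M_8(\operatorname{Sl}_2(\mathbb Z))$ is one-dimensional.

```python
from fractions import Fraction
from math import comb


def bernoulli(m):
    """Return [B_0, ..., B_m] as exact Fractions (convention B_1 = -1/2)."""
    B = [Fraction(0)] * (m + 1)
    B[0] = Fraction(1)
    for n in range(1, m + 1):
        # Recurrence: sum_{j=0}^{n} C(n+1, j) * B_j = 0 for n >= 1.
        B[n] = -sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1)
    return B


def sigma(j, n):
    """Divisor sum sigma_j(n): sum of d**j over positive divisors d of n."""
    return sum(d**j for d in range(1, n + 1) if n % d == 0)


def eisenstein_coeffs(k, terms):
    """First `terms` q-coefficients of the normalized weight-k series."""
    if k < 4 or k % 2:
        raise ValueError("this sketch assumes even k >= 4")
    scale = -Fraction(2 * k) / bernoulli(k)[k]
    return [Fraction(1)] + [scale * sigma(k - 1, n) for n in range(1, terms)]


def multiply(a, b):
    """Product of two truncated q-series, truncated to the shorter length."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[m - i] for i in range(m + 1)) for m in range(n)]


E4 = eisenstein_coeffs(4, 6)
E8 = eisenstein_coeffs(8, 6)
print([int(c) for c in E4[:4]])   # [1, 240, 2160, 6720]
print(multiply(E4, E4) == E8)     # True
```

Exact `Fraction` arithmetic is used so that comparisons such as `multiply(E4, E4) == E8` are not affected by floating-point rounding. Agreement of finitely many coefficients is a numerical check, not a proof.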