Nowadays, we take encryption on the internet for granted, but have you ever wondered how it works? Advances in cryptography since the mid-20th century have relied on an area of mathematics called modular arithmetic to guarantee the security of everything from your WhatsApp conversations to your credit card information.
Though modular arithmetic is rather obscure, we actually use it every day—primarily when we tell time. And if you’re a programmer, you have probably used the % operator before. However, there is much more to modular arithmetic, and the way it seems to turn normal math on its head is very fascinating.
In this post, I’ll cover the very basics of modular arithmetic.
Counting in a circle
Modular arithmetic can be described as “counting in a circle.” It is also called “clock arithmetic,” which gives a hint as to how the concept works. We all know that if you go somewhere at 2 o’clock for 4 hours, it will be 6 o’clock when you return, but if you go somewhere at 11 o’clock for 4 hours, it will be 3 o’clock when you return—not 15 o’clock (assuming you use the 12-hour clock).
This is because the numbers of the clock wrap around:
We can imitate this in Clojure with the cycle function (don’t actually run this, though, because it will go on forever and lock up your REPL):
(cycle (range 1 13)) ;; 13 because range excludes the endpoint
If you are a musician, you can relate this to many areas of music as well: the names of the notes in the scale, such as ABCDEFG—or sa re ga ma pa dha ni, if you like—repeat indefinitely. (That doesn’t mean the notes themselves are the same; more on that later.) There is no such note as “J,” for example.
In fact, it is predominantly only Western culture and music that conceptualizes music (and time) linearly. For example, Indian classical music, flamenco, and Javanese gamelan music are three forms of world music that are well-known for their explicit use of cyclic representations of form and rhythm. But that’s going beyond the scope of this post 😅
Back to numbers. If we stretch the clock cycle out into a number line, you will never encounter a number greater than 12 no matter how far you go along the line. Or rather, once you go beyond 12, you find that the result is 12 less than the “normal” number:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20...
1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8...
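As a quick sketch in Python, the expression n % 12 or 12 reproduces the second row from the first (the or 12 part maps multiples of 12 back to 12 instead of 0, matching the 12-hour clock):

```python
# Wrap the numbers 1..20 around a 12-hour clock face.
# n % 12 is 0 for multiples of 12, so "or 12" turns that 0 into 12.
clock = [n % 12 or 12 for n in range(1, 21)]
print(clock)
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8]
```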
Reduction modulo \(n\)
As the time-telling example shows, addition in modular arithmetic is similar to regular old addition, except with one extra step. If the result is greater than our upper limit (in this case 12), then we can keep subtracting the limit until we get a result that is less than the limit.
To be more precise, this upper limit is called the modulus, and our final answer represents the remainder after dividing the initial result by the modulus. The transformation of our initial result into the final answer is called reduction modulo \(n\).
Ordinary arithmetic: \(11 + 4 = 15\)
Modular arithmetic: \(11 + 4 \equiv 3 \mod 12\)
because 12 goes into 15 one time and leaves a remainder of 3. In other words, the quotient is 1 and the remainder is 3.
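In Python, the built-in divmod returns the quotient and the remainder in a single step:

```python
# Dividing 15 by the modulus 12 gives quotient 1 and remainder 3.
quotient, remainder = divmod(15, 12)
print(quotient, remainder)  # 1 3
```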
Remainders, then, are absolutely central to modular arithmetic. Nowadays, remainders and long division are things that basically nobody bothers with after elementary school, but evidently Gauss was fascinated enough with remainders to use them as the basis for an entire mathematical treatise, the Disquisitiones Arithmeticae. And that’s why we have modular arithmetic and ultimately, now, encrypted WhatsApp messages 😄
But wait—what’s that \(\equiv\) sign? Does that mean two things are “even more” equal? No, \(\equiv\) indicates congruence and is read “is congruent to.” Two numbers are congruent to each other if they have the same remainder after division by a given modulus.
I’ll come back to that in a bit. But first, some code.
When % is not %
operator in programming before, note that the mathematical way of writing modular arithmetic statements is a little different. The math notation focuses on statements of congruence, and treats \(\mod n\) as a prepositional phrase. That is, \(a \equiv b \mod n\) communicates the idea that \(a\) is congruent to \(b\) in the “world” of modulo \(n\).
Programming languages, on the other hand, approach modular arithmetic with an emphasis on the operation (action) of reducing a number modulo another number, where a % n
means reduce a
(in the world of) modulo n
.
Instead of declaring congruence, you would perform a Boolean test for it. In Python, \(a \overset{?}{\equiv} b \mod n\) becomes a % n == b, which returns True or False. (I am including Python here because it follows the convention of using %, like most programming languages.)
In Clojure, the symbol % is reserved for arguments in anonymous functions, so mod is used instead.
>>> 3 % 12 # This returns the remainder after division
3
>>> (11 + 4) % 12
3
>>> 15 // 12 # This means integer division and returns the quotient
1
>>> (11 + 4) % 12 == 3
True
(mod 3 12)
=> 3
(mod (+ 11 4) 12)
=> 3
(quot 15 12) ;; Integer division in Clojure
=> 1
(= 3 (mod (+ 11 4) 12))
=> true
;; Using the arrow to reorder the expressions can let you write them
;; a bit more like you'd write them in imperative languages, hence easier to read
(-> 11 (+ 4) (mod 12))
=> 3
Congruence
In our example above, we saw that not only are 3 and 15 congruent with each other, but by the definition of congruence, so is every number of the form \(12k + 3\), where \(k\) is an integer. Thus, all of the following are congruent: \(-21, -9, 3, 15, 27, 39, 51, \dots\)
We can see that this is true by reducing all of those numbers mod 12:
(map #(mod % 12) [-21 -9 3 15 27 39 51])
=> (3 3 3 3 3 3 3)
The infinite set of integers that share this property is called the congruence class of \(3 \mod 12\). An alternative name is residue class, residue being synonymous with remainder.
(Note that negative integers are perfectly allowed in modular arithmetic, although in practice, most work is done on non-negative integers only. Negative moduli, on the other hand, are not a thing.)
What this means is that \(\mathbb Z\), the integers, can be sorted into different “boxes” based on their remainder \(\mod n\).
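As an illustration (not from the original post), here is one way to sort a small range of integers into those “boxes” in Python, relying on the fact that z % 12 always lands in the range 0..11:

```python
from collections import defaultdict

# Sort the integers -24..24 into "boxes" keyed by their remainder mod 12.
boxes = defaultdict(list)
for z in range(-24, 25):
    boxes[z % 12].append(z)

print(boxes[3])  # [-21, -9, 3, 15]
print(boxes[0])  # [-24, -12, 0, 12, 24]
```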
Returning to the music analogy, all notes belong to a pitch class. For example, C is a pitch class. On the piano, all notes produced by the white key just to the left of the pair of black keys belong to the pitch class C and are therefore a C, despite belonging to different octaves. The actual frequencies that result differ (by multiples of two, to be exact), but functionally, they are treated equivalently. You could say that the lowest C on the piano is congruent to the highest C.
Similarly, for every \(0 \le a < n\), there is a unique congruence class \(\bar{a}_n\), also notated \(\lbrack a \rbrack_n\) or \(\mathbf{a}_n\) (though the latter notation could cause confusion with vectors), which can be defined in set-builder notation as
\[\bar{a}_n = \lbrace z \in \mathbb{Z} : z = kn + a,\ k \in \mathbb{Z} \rbrace\]
That is, the congruence class \(\bar{a}_n\) is the set of integers \(z\) given by \(kn + a\) for all integers \(k\).
While no programming language can actually process infinite sets, Clojure’s lazy sequences at least give you a way to express them. I’m leaving this here merely to show the Clojure translation of this mathematical definition. If you try to print the full result, it will freeze your REPL. You’ve been warned.
(defn congruence-class-posint
  "Returns an infinite sequence of the positive integers belonging to
  the congruence class a mod n."
  [a n]
  (map (fn [k] (-> (* k n) (+ a))) (range)))
(defn congruence-class
  "Returns an infinite sequence of all integers belonging to the congruence class a mod n."
  [a n]
  (mapcat (fn [k]
            (if (zero? k)
              [a] ;; k = 0 yields a only once
              [(-> (* k n) (+ a))
               (-> (* (- k) n) (+ a))])) ;; We need negative k values as well!
          (range)))
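For comparison, a hypothetical Python version of the same definition can use a generator, which is lazy in the same way as Clojure’s sequences (slicing off a finite prefix keeps it from running forever):

```python
from itertools import count, islice

def congruence_class(a, n):
    """Lazily yield the congruence class a mod n, fanning out from a
    in both directions: a, a+n, a-n, a+2n, a-2n, ..."""
    yield a
    for k in count(1):
        yield a + k * n
        yield a - k * n

print(list(islice(congruence_class(3, 12), 7)))
# [3, 15, -9, 27, -21, 39, -33]
```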
\(\mathbb{Z}_n\)
Actually, the whole congruence class notation is just a mathematical formality. In practice, any single member of a given congruence class can stand in for the entire class.
Bearing in mind that \(n \equiv 0 \mod n\), this effectively means that in the “world” of modulo \(n\), we only have to consider the integers from \(0\) to \(n - 1\). (The clock analogy breaks down a little bit here because in real life, we use 1 through 12 rather than 0 through 11, but it is still a good starting point.)
Since this is just an introduction, I’ll skip the bit about residue systems and leave the discussion of ring theory for another post.
We can use the symbol \(\mathbb{Z}_n\) to denote this subset of the integers. In code, it’s equally simple:
(set (range n))
Now, what do we do with these numbers?
Addition, subtraction, and negative numbers
Addition, as we have already seen, is basically the same as in normal arithmetic, except the numbers wrap around. Outside the context of telling time, you might be surprised to get a “sum” that is smaller than the numbers you added together.
The same is true for subtraction: for example, \(3 - 5 \equiv 10 \mod 12\). The “difference” we obtained is greater than the terms we started with.
Also note that negative numbers now denote counting down from \(n\) rather than from \(0\). If you are familiar with Python, this should make intuitive sense to you, since you can access the \(n\)th item from the end of a list by writing my_list[-n].
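Conveniently, Python’s % already behaves this way for negative operands (unlike, say, C or JavaScript, whose % can return negative results):

```python
# In mod 12, -1 "counts down" from 12 and lands on 11.
print(-1 % 12)  # 11
print(-2 % 12)  # 10

# The list-indexing analogy: index -1 is the same item as index len - 1.
letters = ["a", "b", "c"]
print(letters[-1] == letters[len(letters) - 1])  # True
```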
Multiplication and efficiency
Multiplication also works the same as in normal arithmetic, although again, it can be surprising to obtain a product that is smaller than what you started with.
Normally, when dealing with even somewhat large numbers, products can “explode” fairly quickly. That is, it doesn’t take much multiplication to produce really large numbers that slow down computation, even for a computer.
If you are working in a more bare-bones language such as C++ or JavaScript, you might find it necessary to implement a tried-and-true modular multiplication algorithm such as Montgomery multiplication.
However, I have found that higher-level languages like Python and Clojure come “batteries included” in this respect, so I won’t go into detail about the low-level implementation. You might want to write a helper function to make modular multiplication more convenient to write, though.
(defn mod-mul
[a b m]
(-> (*' a b) (mod m)))
;; the ' means avoid integer overflow by using bigintegers if necessary
(mod-mul 78 99 123)
=> 96
Order of operations
Because each integer in \(\mathbb{Z}_n\) is a representative of its entire congruence class, this means that in an expression like \(a + b \mod n\), either \(a\) or \(b\) (or both) can be reduced before adding if it makes the calculation easier.
And indeed, \((a \mod n) + (b \mod n) \equiv a + b \mod n\).
This isn’t really that useful for addition and subtraction, but for iterative processes like multiplication (which is just repeated addition) and exponentiation (which is just repeated multiplication), reducing the result after each iteration can speed up computation dramatically, because computers prefer working with smaller numbers.
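As a sketch of that idea in Python, here is a naive modular exponentiation that reduces after every multiplication, so intermediate values never exceed \(n^2\). (Python’s built-in three-argument pow does the same job far more efficiently via square-and-multiply.)

```python
def mod_pow(base, exp, n):
    """Compute base**exp mod n, reducing after each multiplication
    so intermediate results stay below n**2."""
    result = 1
    for _ in range(exp):
        result = (result * base) % n
    return result

print(mod_pow(7, 128, 13))                     # 3
print(mod_pow(7, 128, 13) == pow(7, 128, 13))  # True
```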
Specifically, in Java (which Clojure runs on), the maximum value for an int is 2147483647 and the maximum value for a long is 9223372036854775807. These are two of the so-called primitive data types for integers, which yield optimal computational performance.
If you want a bigger number, you need to use either Clojure’s bigint type (or the auto-promoting operators +' -' *' inc' dec' to safeguard against overflow) or Java’s BigInteger type. However, the computer treats BigIntegers as objects rather than primitive numbers, so it has to work a bit harder (and use more memory) to manipulate them.
(Side note: To coerce a number to a BigInteger, use the native Clojure function biginteger. The camel-cased BigInteger., with a period, is a Java interop invocation you can use to create an instance of a BigInteger object. This opens the doors to other features unavailable in the Clojure standard library, such as generating a random integer that is larger than the maximum long value. More on this in upcoming posts.)
Going beyond
Of course, there is more to arithmetic than just addition, subtraction, and multiplication. What about the rest of the primitive operations?
- Division: So far, you may notice that I have left out division. That’s because unlike addition, subtraction, and multiplication, there is no “division” in modular arithmetic, at least not in the everyday sense. I will leave that for the next post.
- Exponentiation: I also hinted at modular exponentiation. Since it is merely repeated multiplication, the mechanics of it are not hard to understand. However, it opens the door to a whole host of interesting larger mathematical concepts that are worth at least a couple more posts.
- Roots: On the other hand, “undoing” exponentiation by taking a root \(\mod n\) is much, much more involved than you would expect, given how simple exponentiation is. This works quite differently than roots in regular arithmetic.
- Logarithms: Going one step further, logarithms in modular arithmetic, also called discrete logarithms, are even more different and difficult than their normal counterparts, although the underlying concept remains the same. However, it is precisely this difficulty that ensures the security of modern math-based encryption schemes.
Stay tuned for upcoming posts in this series!