Math E-122: Algebra & Cryptography
Juspreet Singh Sandhu
May-13, 2018
Abstract
Cryptography is a topic that has fascinated humanity for time immemorial.
From the Atbash
1
cipher in ancient times to the Enigma
2
machine in World-
War I/II, Cryptography has been a central tool in intelligence gathering, secret
sharing, and more generally, in Information Privacy. In the modern day and age,
in addition to having numerous such applications, Cryptography is a rich field of
intellectual inquiry with deep connections to Abstract Algebra, Computational
Complexity Theory, Number Theory and Lattice Theory. This survey begins
with a brief (but formal) introduction to Ciphers, followed by an introduction
to Symmetric & Asymmetric Key Cryptography. We then give a detailed expo-
sition of the El-Gamal & RSA protocols, and conclude with formal proofs for
the security guarantees of each system by analyzing the computational hardness
of the underlying problems (Discrete-Log, Inverse Modular Exponentiation) us-
ing traditional results from Group Theory (Euler’s Theorem, Fermat’s Little
Theorem).
1 Ciphers, Symmetric & Asymmetric Key Cryp-
tography
We begin by defining a Cipher:
Def
n
: A Cipher is a 2-tuple E = (E, D) where:
E : K × M C is an efficient Encryption Algorithm.
D : K × C M is an efficient Decryption Algorithm.
D(E(k, m)) = m, k K and m M.
K denotes the key-space, the set from which a key is chosen. M denotes the
message-space, the set from which a message is chosen. Finally, C denotes the
cipher-space, the set into which messages are encrypted.
In the early days of Cryptography, most Encryption and Decryption protocols
relied on shared secrets in the form of (secret) keys. This meant that prior to
1
encrypted communication, both parties needed to agree on the same key. This
lead to the development of Symmetric Key Cryptography, with many protocols
implementing the idea of symmetric key generation. An example of such a pro-
tocol is the famous One-Time Pad
3
.
Def
n
: The One-Time Pad is defined by symmetric encryption and decryption
protocols of XOR-ing bits, given a random key k K. More formally:
OTP = (E = (k m), D = (k c)), where, k, m, c {0, 1}
n
.
It is easy to see that the OTP upholds the invariant mentioned earlier:
D(E(k, m)) = k E(k, m) = k (k m) = (k k) m = 0 m = m.
However, the One-time Pad posed weaknesses such as the lack of ability to
re-use keys without loss of security, and the need to have keys as long as the
message. Additionally, the new key that was generated for each communication
needed to be conveyed securely between both communicating parties, leading
to a cyclical problem of security. All these constraints showed the weaknesses
of Symmetric Key Cryptography. Addressing these weaknesses lead to the new
age of Asymmetric Key Cryptography, where the notion of a secret-key and a
public-key emerged. The idea was to have two parties that wanted to commu-
nicate a secret retain a secret-key before communication and then regenerate
numerous public-keys which could be used to compute shared-secrets. The se-
curity provided by such protocols came from the underlying ”one-way” compu-
tational hardness of the group-theoretic problems that formed the backbone of
the protocols. Put another way: These problems were efficiently solvable by the
communicating parties, but their inverses were hard to solve, which was made
the task for an adversary hoping to reverse-engineer the system. We analyze
two such cryptosystems: El-Gamal
4
and RSA
5
.
2 El-Gamal Cryptosystem
The El-Gamal Cryptosystem is a method to generate public-keys using a Mul-
tiplicative, Cyclic (Abelian) Group of very large (prime) order. The hardness
of extracting the secret key for such a protocol, given an adversary, comes from
the hardness of the Discrete-Log Problem
6
. The Discrete-Log Problem is
known to be in BQP (the class of problems computable in Polynomial Time
by a Quantum Computer), and in NP (the class of problems whose positive
solution is verifiable in Polynomial Time by a Classical Computer) co-NP
(the class of problems whose negative solution is verifiable in Polynomial Time
by a Classical Computer).
Prerequisites
G
p
, where p is a prime. G
p
is cyclic, multiplicative and abelian. Notice
that G
p
=
Z
×
p
.
g G
p
, such that, g = G
p
. Hence, g is a generator of G
p
. Notice that
2
this means ord(g) = ord(G
p
) = p 1. Therefore, g
ord(g)
= g
p1
= 1.
For the sake of computational purposes, we assume that p is large. It
is feasible to generate large primes in poly(|bits|(p)) (polynomial in the
number of bits of p) time as Primality-Testing
7
is known to be in P.
With this setup, we can define the El-Gamal protocol between two parties (say,
Alice & Bob) to generate public keys in a secure manner given the prerequisites.
El-Gamal Protocol (Encryption)
Alice: Chooses x G
p
= Z
×
p
= {1, .., p 1}, where, x =secret-key
Alice
.
Alice: Computes g
x
and sends over (G, p, g
x
, g) to Bob.
Bob: Chooses y G
p
= Z
×
p
= {1, .., p 1}, where, y =secret-key
Bob
.
Bob: Computes g
y
and sends over (G, p, g
y
, g) to Alice.
Alice: Computes (g
y
)
x
= g
yx
= g
xy
(abelian) using secret-key
Alice
.
Bob: Computes (g
x
)
y
= g
xy
using secret-key
Bob
.
Bob: Maps message m m
bob
G
p
, and then constructs c = m
bob
· g
xy
.
Bob: Sends over (c = m
bob
· g
xy
) to Alice.
El-Gamal Protocol (Decryption)
Alice: Decrypts the message using the shared secret = g
xy
and computing
its inverse: g
xy
. Then, m
bob
= c · g
xy
= m
bob
· g
xyxy
= m
bob
· g
0
=
m
bob
· 1 = m
bob
.
The underyling fact here is that the calculation of (g
xy
)
1
is in P (for Alice),
as calculating the modular multiplicative inverse can be accomplished with a
clever use of Euclid’s Divisor Algorithm or Euler’s Theorem. For our analysis,
we will consider the use of the latter.
El-Gamal Protocol (Analysis)
We will show that computing the modular multiplicative inverse is in P, which
allows for efficient decryption. Furthermore, we show that for an adversary
with access to a channel between Alice & Bob, computing the secret keys is
hard (discrete-log problem).
Theorem 1: Decrypting a message in the El-Gamal protocol is in P.
Proof: For decryption to be in P, we need to show that computing (g
xy
)
1
given (x, g
xy
, |G
p
| = p1) is in P, as modular multiplication is computationally
easy.
Notice that g
xy
G
p
. As G
p
=
Z
×
p
, we see that, gcd(g
xy
, p) = 1, because every
k < p is relatively prime to a prime number p. This allows us to use Euler’s
Theorem, which states the following:
3
Eulers’ Th
m
: If gcd(a, m) = 1, a
φ(m)
= 1 (mod) m
Letting a = g
xy
, we can use the Theorem to compute (g
xy
)
1
. Notice that,
in our case, m = p, where p is prime. So, φ(m) = φ(p) = p 1. Therefore,
(g
xy
)
φ(p)
= 1 mod p. This further reduces to (g
xy
)
p1
= 1 mod p. Multiplying
by (g
xy
)
1
= g
xy
on both sides, we have:
(g
xy
)
p1
· g
xy
= g
xy
mod p = (g
xy
)
p1
· (g
xy
)
1
= (g
xy
)
1
mod p =
(g
xy
)
p2
= (g
xy
)
1
mod p.
This gives us our required inverse: (g
xy
)
1
= (g
xy
)
p2
mod p, which can be
computed given (g
xy
, p, G
p
), all of which are terms that Alice has.
To see that this is P, notice that Alice computes g
xy
from (g
x
, y) which requires
at most log(y) multiplications, followed by another log(p 2) multiplications,
which can be computed efficiently as well. Given O(log(p)) = |bits|(p) and
O(log(g
xy
)) = |bits|(g
xy
) <= O(log(p)), this Algorithm is run by repeatedly
computing the power of g
xy
, O(log(p)) times. Now, the multiplication takes
Quadratic-Time with respect to the number of bits: O(log(g
xy
))
2
, and it is re-
peated O(log(p)) times. So, the next complexity is O(log(g
xy
))
2
· O(log(p))
O(log(p))
3
P.
QED
Theorem 2: The El-Gamal protocol is secure, or equivalently, computing x
or y given (G
p
, p, g
x
, g
y
, g) NP.
Proof: The first thing to notice is that merely multiplying g
x
, g
y
does not
yield the shared secret g
xy
, but rather yields g
x
g
y
= g
x+y
, which is not used
in the protocol. The only way to construct g
xy
for the adversary is to extract
x, y from the information, and then exponentiate efficiently (as shown before).
This means that the adversary needs to compute (log(g
x
), log(g
y
)) mod p, and
then exponentiate. However, this reduces to the Discrete-Log problem, which
is known to be in NP. The most efficient algorithm known to compute the
Discrete-Log is the Number-Field Sieve
8
, which runs in time
O(e
(log(p))
3
),
where the tilde hides Poly-logarithmic factors with constants. This guarantees
the security of the encryption and decryption protocol, by guaranteeing the
security of the shared secret (g
xy
), which is used to decrypt the encrypted mes-
sage.
QED
3 RSA Cryptosystem
The RSA Cryptosystem also uses the computational hardness of the underlying
protocol to provide a one-way hard function, where computing the function is in
P in one direction, but in NP in the other direction. The security of the RSA
problem comes from the difficulty of Integer Factorization
9
, which is known
to be in NP. The proof of correctness follows from Fermat’s Little Theorem.
In order to understand the protocol, we must lay the foundations by defining the
Carmichael Tuotient Function, which resembles the Euler Tuotient function in
its multiplicative nature, but differs slightly in using the lcm, thereby becoming
4
more efficient for computational purposes.
Prerequisites
Carmichael Tuotient Function λ(n): Gives the smallest integer k, st, a
k
=
1 mod n, k < n and (k, n) = 1.
We know that every n N can be written as the product of its prime fac-
tors n =
s
i=1
p
r
i
i
, where p
i
prime. The multiplicative nature, there-
fore, of the Carmichael Tuotient Function allows us to see that λ(n) =
lcm(λ(p
r
1
1
), .., λ(p
r
s
s
)).
We notice that λ(n)|φ(n) because the order of Z
×
n
= φ(n), and by La-
grange’s Theorem, the order of any element in the group must divide the
order of the group. It is also critical to note that λ(p
r
i
i
) = C · φ(p
r
i
i
) =
C ·φ(p
i
)
r
i
= C ·p
r
i
1
i
(p
i
1), where C = 1/2 if r > 4 and C = 1 otherwise.
With this setup, we can define the RSA protocol between two parties (say, Alice
& Bob) to generate public keys in a secure manner given the prerequisites.
RSA Protocol (Encryption)
Alice: Chooses p, q Z
n
, st, p, q prime. This is easy to do as, once
again, Primality-Testing is known to be in P.
Alice: Constructs n = p · q. Note: This n is equivalent to the n above,
which is merely introduced for the sake of clarity. The integer n is de-
pendent on p, q and not vice-versa. It is clear that p · q is the prime
factorization of n.
Alice: Efficiently computes secret key
Alice
= λ(n) = lcm(λ(p), λ(q)).
Alice: Chooses e {1, .., n 1}, st, gcd(e, λ(n)) = 1. Once again, this
is easily satisfied if e prime, which can be constructed in P using
Primality-Testing.
Alice: Sends public key
Alice
= e to Bob as: (n, e).
Bob: Maps message m m
bob
Z
n
, and then constructs c = (m
bob
)
e
mod n.
Bob: Sends over (c = (m
bob
)
e
mod n).
RSA Protocol (Decryption)
Alice: Computes the decryption-key: d = e
1
mod λ(n). Equivalently,
we need d · e = 1 mod λ(n). As we saw before for the El-Gamal protocol,
given that (e, λ(n)) = 1, we can use Euler’s Theorem to compute d in
Polynomial time (efficiently).
5
Alice: Computes m
bob
mod n = c
d
mod n = (m
e
)
d
mod n. The correctness
of this statement will be established by Fermat’s Little Theorem.
The underlying goal here is to show that Alice can compute λ(n) efficiently,
while an adversary cannot, and that the decryption protocol is correct. This
analysis will borrow the use of Euler’s Theorem for efficient (in P) construction
of an inverse from our previous analysis of the El-Gamal protocol.
RSA Protocol (Analysis)
We will show that attempting to compute λ(n) is inefficient (for the adversary),
that is, it is in NP, thereby, not allowing the adversary to realize the value of d,
given access to e. We will also show that it is efficient for Alice to compute this
value of λ(n), and that the protocol is correct, which will be seen as a corollary
of Fermat’s Little Theorem.
Theorem 1: Decrypting a message in the RSA protocol is in P .
Proof: To prove this, we merely need to show that the construction of d
P for Alice. We note here that m = λ(n) = λ(p · q) = lcm(λ(p), λ(q)) =
lcm(p 1, q 1) = (p 1) · (q 1) · (1/gcd(p, q)), where, gcd(p, q) can be
efficiently (in P) computed using the Euclidean Divisor Algorithm. Now, bor-
rowing from our previous analysis using Euler’s Theorem, the inverse of e
mod m becomes e
m2
mod m which, on substituting for m, yields: (e
λ(n)2
)
mod λ(n) = (e
(p1)·(q1)·(1/gcd(p,q))2
) mod λ(n) = d mod λ(n), which (again,
by our previous analysis) is in P. This follows because Z
×
λ(n)
is a multiplicative,
abelian group.
Theorem 2: Decrypting a message in the RSA protocol is in-feasible, or equiv-
alently, computing d given (e, n) NP for an adversary.
Proof: The goal of the adversary is to decrypt c = (m
bob
)
e
mod n, given (e, n).
Note that, in order to do this, the adversary needs to compute λ(n), otherwise
the adversary cannot construct d. To do so, the adversary needs to prime-
factorize λ(n) = lcm(
s
i=1
(λ(p
i
))
r
i
. The adversary, however, has no knowledge
of p, q, which are chosen at random. Therefore, the adversary must construct
the Integer Factorization of n. However, it is known to be the case that Integer-
Factorization NP.
Theorem 3: (e, λ(n)) = 1 d · e = 1 mod λ(n) = (m
e
)
d
mod n =
m.
Proof: This proof relies on Fermat’s Little Theorem, which states the fol-
lowing:
Fermat’s Little Theorem: Given p prime, st, p |a, a
p1
= 1 mod p.
Noticing that λ(n) = lcm(p 1, q 1), we see that, λ(n) = a(p 1) and
λ(n) = b(q1) for some a, b Z. As d·e = 1 mod λ(n), we see that λ(n)|(ed1).
Noticing that λ(n) can be expressed as multiples of (p 1) and (q 1), we can
conclude via transitivity that (p 1)|(ed 1) and (q 1)|(ed 1). This allows
6
us to evaluate (m
e
)
d
mod p and (m
e
)
d
mod q. We can then compute (m
e
)
d
mod pq by multiplying the LHS and the RHS. Let, (ed 1) = a
(p 1) and
(ed 1) = b
(q 1) for some a
, b
Z. Then:
(m
e
)
d
mod p = (m
ed1
) · m mod p = m
a
(p1)
· m mod p. By Fermat’s
little theorem, (m
p1
)
a
mod p = 1 mod p. Therefore, m
a
(p1)
· m mod
p = m mod p.
Similarly, (m
e
)
d
mod q = (m
ed1
) · m mod q = m
b
(q1)
· m mod q. Using
Fermat’s Little Theorem again, we see that m
b
(q1)
· m mod q = m mod
q.
Given that m
ed
mod p = m mod p and m
ed
mod q = m mod q, a direct,
modular multiplication yields that m
ed
mod pq = m mod pq = m mod n.
This proves that d is the decryption key to recover the ciphertext and establishes
the correctness of the RSA decryption protocol.
Conclusion and Further-Work
It is evident from the above analysis that the subtle interplay between Algebra
and Complexity gives rise to the desired security properties for the aforemen-
tioned protocols. The centrality of Group-Theoretic constructions to the most
widely used Key-Generation protocols in Asymmetrical-Key Cryptography is
undeniable. Over the years, many more Asymmetrical Key Generation proto-
cols have been developed, with a current trend of shifting more towards Lattice-
Based Cryptography. Exploring the recent developments in this area would
make for an educational survey. Other interesting developments in recent years
have included the study of Homomorphic Encryption and the resilience (or lack
thereof) of Asymmetrical Key Generation protocols to Quantum adversaries.
Acknowledgements
I would like to thank Dr. Martinez immensely for his impeccable teaching, never-
ending guidance and the numerous deep conversations about the intersections
of Pure Math, Theoretical Computer Science and Mathematical Physics. In
particular, his contribution of pushing me to write a survey exploring the inter-
sections between Cryptography & Algebra was the starting point for this work,
and I owe it unanimously to him. I would also like to thank Lily (Jisoo) Jeong
for helping me settle down on a viable list of topics to pursue and for generously
agreeing to provide feedback on the manuscript.
7
Reference
1. Hoskisson, P. (2010). Breakthrough Translation of Avicenna’s Physics
Published. INSIGHTS, Volume 30, No. 1, 1-4.
2. Rijmenants, D. (2004). Technical Details of the Enigma Machine, Cipher
Machines And Cryptology.
3. Jarecki, S. (09/28/2004). Lecture 1: Crypto Overview, Perfect Secrecy,
One-time Pad.
4. Goldwasser, S & Bellare, M. (2008). 10.3.3 El Gamal’s Scheme. Lecture
Notes of Cryptography, 171-172.
5. Goldwasser, S & Bellare, M. (2008). C.8 RSA. Lecture Notes on Cryptog-
raphy, 262-263.
6. Pomerance, C (2009). Discrete Logarithms, Dartmouth College.
7. Schoof, R. (2008). Four Primality Testing Algorithms. Algorithmic Num-
ber Theory, Volume 44, 101-126: MSRI Publications
8. Schirokauer, O. (2008). The impact of the number field sieve on the
discrete logarithm problem in finite fields. Algorithmic Number Theory,
Volume 44, 397-420: MSRI Publications
9. Bimpikis, K & Jaiswal, R. Modern Factoring Algorithms, UC San Diego.
8