ElGamal

Overview

The security of the ElGamal algorithm is based on the difficulty of solving the discrete logarithm problem. Proposed in 1984, it is a public-key (two-key) cryptosystem that can be used for both encryption and digital signatures.

Assume p is a prime of at least 160 decimal digits, p-1 has a large prime factor, g is a generator of Z_p^*, and y \in Z_p^*. Then finding the unique integer x (0\leq x \leq p-2) satisfying g^x \equiv y \bmod p is computationally difficult. We write this x as x=\log_g y.
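To make the problem statement concrete, here is a minimal sketch that solves a discrete logarithm by exhaustive search. The prime 23, the generator 5, and the function name `discrete_log_bruteforce` are toy choices made for illustration only. The loop takes up to p-1 multiplications, which is why this approach is hopeless for a prime of 160 or more digits.

```python
def discrete_log_bruteforce(g, y, p):
    """Return the smallest x in [0, p-2] with g^x = y (mod p), or None."""
    acc = 1  # holds g^x mod p for the current x
    for x in range(p - 1):
        if acc == y % p:
            return x
        acc = (acc * g) % p
    return None


# Toy parameters (assumed for illustration): 5 generates Z_23^*.
p, g = 23, 5
y = pow(g, 13, p)
x = discrete_log_bruteforce(g, y, p)
print(x)
```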

Basic Principles

Here we assume A wants to send a message m to B.

Key Generation

The basic steps are as follows:

  1. Select a sufficiently large prime p such that solving the discrete logarithm problem in Z_p^* is difficult.
  2. Select a generator g of Z_p^*.
  3. Randomly select an integer k, 0\leq k \leq p-2, and compute y \equiv g^k \bmod p.

The private key is {k}, and the public key is {p,g,y}.
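The key generation steps can be sketched as follows. This is a toy illustration, not production code: the prime 2579 (with p-1 = 2 * 1289) is far too small for real use, and the helper names `is_generator` and `keygen` are hypothetical. The sketch assumes the prime factors of p-1 are known, since testing whether g is a generator needs them.

```python
import secrets


def is_generator(g, p, prime_factors):
    """g generates Z_p^* iff g^((p-1)/f) != 1 for every prime factor f of p-1."""
    return all(pow(g, (p - 1) // f, p) != 1 for f in prime_factors)


def keygen(p, prime_factors):
    """Return ((p, g, y), k) for a prime p whose p-1 factors are given."""
    while True:
        g = secrets.randbelow(p - 3) + 2  # candidate in [2, p-2]
        if is_generator(g, p, prime_factors):
            break
    k = secrets.randbelow(p - 1)  # private key in [0, p-2]
    y = pow(g, k, p)
    return (p, g, y), k


# Toy parameters (assumed for illustration): 2579 is prime and 2578 = 2 * 1289.
pub, priv = keygen(2579, [2, 1289])
print(pub)
```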

Encryption

A selects a random number r \in Z_{p-1} and encrypts the plaintext m as E_k(m,r)=(y_1,y_2), where y_1 \equiv g^r \bmod p and y_2 \equiv my^r \bmod p.

Decryption

D_k(y_1,y_2) \equiv y_2(y_1^k)^{-1} \equiv m(g^k)^r(g^{rk})^{-1} \equiv m \bmod p.
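A round trip through the encryption and decryption formulas can be sketched as below. The parameters p=2579, g=2, k=765 and the message 1299 are toy values assumed for illustration. The inverse (y_1^k)^{-1} is computed as y_1^{p-1-k}, which is valid because y_1^{p-1} \equiv 1 \bmod p.

```python
import secrets


def encrypt(pub, m, r=None):
    """ElGamal encryption: returns (y1, y2) = (g^r, m * y^r) mod p."""
    p, g, y = pub
    if r is None:
        r = secrets.randbelow(p - 1)  # fresh r in [0, p-2] for every message
    y1 = pow(g, r, p)
    y2 = (m * pow(y, r, p)) % p
    return y1, y2


def decrypt(p, k, y1, y2):
    """ElGamal decryption: m = y2 * (y1^k)^(-1) mod p."""
    inv = pow(y1, p - 1 - k, p)  # equals (y1^k)^(-1) by Fermat's little theorem
    return (y2 * inv) % p


# Toy parameters (assumed for illustration).
p, g, k = 2579, 2, 765
pub = (p, g, pow(g, k, p))
m = 1299
y1, y2 = encrypt(pub, m)
assert decrypt(p, k, y1, y2) == m
```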

Difficulty

Although y_1 is public, recovering the corresponding r from y_1 \equiv g^r \bmod p requires solving a discrete logarithm, which is assumed to be infeasible.

2015 MMA CTF Alicegame

Here we use the Alicegame challenge from MMA-CTF-2015 as an example. The challenge was quite difficult when the source code was not provided, because it was hard to work out how the ciphertext was derived from a supplied m and r.

Let's briefly analyze the source code. First, the program initially generates pk and sk:

    (pk, sk) = genkey(PBITS)

The genkey function is as follows:

def genkey(k):
    p = getPrime(k)
    g = random.randrange(2, p)
    x = random.randrange(1, p-1)
    h = pow(g, x, p)
    pk = (p, g, h)
    sk = (p, x)
    return (pk, sk)

p is a k-bit prime, g is drawn from [2, p-1], and x is drawn from [1, p-2]. The function also computes h \equiv g^x \bmod p. From this we can tell that the scheme is ElGamal encryption over Z_p^*. Here pk is the public key and sk is the private key.

Next, the program reads m and r ten times and encrypts each pair using the following function:

def encrypt(pk, m, r = None):
    (p, g, h) = pk
    if r is None:
        r = random.randrange(1, p-1)
    c1 = pow(g, r, p)
    c2 = (m * pow(h, r, p)) % p
    return (c1, c2)

The encryption method is indeed ElGamal encryption.

Finally, the program encrypts the flag. At this point, r is randomly generated by the program itself.

Analyzing this, we can control m and r in the ten rounds of the loop, and

c_1 \equiv g^r \bmod p

c_2 \equiv m * h^{r} \bmod p

If we set:

  1. r=1, m=1, then we obtain c_1=g, c_2=h.
  2. r=1, m=-1, then we obtain c_1=g, c_2=p-h. Adding the h from the first query gives the prime p=c_2+h.
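The two queries above can be simulated locally. In this sketch the `oracle` callable stands in for the remote service and `recover_public_key` is a hypothetical helper name. The key (2579, 2, 949) is a toy value assumed for illustration. The trick relies on Python reducing -h modulo p to p-h.

```python
def encrypt(pk, m, r):
    """Same arithmetic as the challenge's encrypt, with r always supplied."""
    p, g, h = pk
    return pow(g, r, p), (m * pow(h, r, p)) % p


def recover_public_key(oracle):
    """Recover (p, g, h) from two chosen (m, r) queries."""
    g, h = oracle(1, 1)            # m=1, r=1  -> (g, h)
    _, p_minus_h = oracle(-1, 1)   # m=-1, r=1 -> (g, p-h)
    return p_minus_h + h, g, h


# Toy key hidden behind the oracle (assumed for illustration).
secret_pk = (2579, 2, 949)
oracle = lambda m, r: encrypt(secret_pk, m, r)
print(recover_public_key(oracle))
```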

What use is knowing the prime p? p is about 201 digits long, which is very large.

However, after generating the prime p, the program performs no validation. As mentioned earlier, p-1 must have a large prime factor. If p-1 has only small prime factors, the discrete logarithm can be attacked, mainly with the baby-step giant-step and Pohlig-Hellman algorithms. Those interested can look into them. Here, Sage's built-in discrete_log function already handles this case.
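For readers without Sage, the idea behind the attack can be sketched in plain Python (3.8+). This is a simplified variant written for illustration, not Sage's implementation: it runs baby-step giant-step directly in each prime-power subgroup and combines the results with the Chinese remainder theorem. The prime 8101 (p-1 = 2^2 * 3^4 * 5^2), the base 6, and the exponent 6689 are toy values, and the function names are hypothetical.

```python
from math import isqrt


def bsgs(g, h, p, n):
    """Baby-step giant-step: find x in [0, n) with g^x = h (mod p).

    n must be a multiple of the order of g modulo p.
    """
    m = isqrt(n) + 1
    baby = {}
    cur = 1
    for j in range(m):                # baby steps: store g^j -> j
        baby.setdefault(cur, j)
        cur = cur * g % p
    step = pow(g, -m, p)              # g^(-m) mod p
    gamma = h % p
    for i in range(m):                # giant steps: h * g^(-i*m)
        if gamma in baby:
            return (i * m + baby[gamma]) % n
        gamma = gamma * step % p
    raise ValueError("no discrete logarithm found")


def pohlig_hellman(g, h, p, factors):
    """Solve g^x = h (mod p) given factors = {q: e} with prod(q^e) == p - 1."""
    n = p - 1
    x, modulus = 0, 1
    for q, e in factors.items():
        qe = q ** e
        g_i = pow(g, n // qe, p)      # project into the subgroup of order q^e
        h_i = pow(h, n // qe, p)
        x_i = bsgs(g_i, h_i, p, qe)   # x mod q^e
        # CRT step: merge (x mod modulus) with (x_i mod qe)
        t = (x_i - x) * pow(modulus, -1, qe) % qe
        x += modulus * t
        modulus *= qe
    return x


# Toy parameters (assumed for illustration): 8100 = 2^2 * 3^4 * 5^2.
p, g = 8101, 6
h = pow(g, 6689, p)
x = pohlig_hellman(g, h, p, {2: 2, 3: 4, 5: 2})
assert pow(g, x, p) == h
```

The cost is dominated by the square root of the largest prime power dividing p-1, which is why a p-1 with a large prime factor defeats the attack.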

The specific code is as follows. Note that this consumes a lot of memory, so don't casually run it on a virtual machine... Also, the interaction part was quite a headache...

import socket
from Crypto.Util.number import *
from sage.all import *


def get_maxfactor(N):
    f = factor(N)
    print 'factor done'
    return f[-1][0]

maxnumber = 1 << 70
i = 0
while 1:
    print 'cycle: ',i
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect(("localhost", 9999))
    sock.recv(17)
    # get g,h by sending m=1, r=1
    sock.recv(512)
    sock.sendall("1\n")
    sock.recv(512)
    sock.sendall("1\n")
    data = sock.recv(1024)
    print data
    if '\n' in data:
        data =data[:data.index('\n')]
    else:
        # receive m=
        sock.recv(1024)
    (g,h) = eval(data)

    # get g and p-h by sending m=-1, r=1, then p = (p-h) + h
    sock.sendall("-1\n")
    sock.recv(512)
    sock.sendall("1\n")
    data = sock.recv(1024)
    print data
    if '\n' in data:
        data = data[:data.index('\n')]
    else:
        # receive m=
        sock.recv(512)
    (g,tmp) = eval(data)
    p = tmp+h
    tmp = get_maxfactor(p-1)
    if tmp<maxnumber:
        print 'may be success'
        # skip the for cycle
        sock.sendall('quit\n');
        data = sock.recv(1024)
        print 'receive data: ',data
        data = data[data.index(":")+1:]
        (c1,c2)=eval(data)
        # generate the group
        g = Mod(g, p)
        h = Mod(h, p)
        c1 = Mod(c1, p)
        c2 = Mod(c2, p)
        x = discrete_log(h, g)
        print "x = ", x
        print "Flag: ", long_to_bytes(long(c2 / ( c1 ** x)))
    sock.sendall('quit\n')
    sock.recv(1024)
    sock.close()
    i += 1

In the end, the computation did not finish on my machine because it ran out of memory. The script also crashes occasionally, so it may need to be run several times.

2018 Code Blue lagalem

The challenge source is as follows:

from Crypto.Util.number import *
from key import FLAG

size = 2048
rand_state = getRandomInteger(size // 2)


def keygen(size):
    q = getPrime(size)
    k = 2
    while True:
        p = q * k + 1
        if isPrime(p):
            break
        k += 1
    g = 2
    while True:
        if pow(g, q, p) == 1:
            break
        g += 1
    A = getRandomInteger(size) % q
    B = getRandomInteger(size) % q
    x = getRandomInteger(size) % q
    h = pow(g, x, p)
    return (g, h, A, B, p, q), (x,)


def rand(A, B, M):
    global rand_state
    rand_state, ret = (A * rand_state + B) % M, rand_state
    return ret


def encrypt(pubkey, m):
    g, h, A, B, p, q = pubkey
    assert 0 < m <= p
    r = rand(A, B, q)
    c1 = pow(g, r, p)
    c2 = (m * pow(h, r, p)) % p
    return (c1, c2)

# pubkey, privkey = keygen(size)

m = bytes_to_long(FLAG)
c1, c2 = encrypt(pubkey, m)
c1_, c2_ = encrypt(pubkey, m)

print pubkey
print(c1, c2)
print(c1_, c2_)

As we can see, the algorithm is an ElGamal encryption that gives two sets of encrypted results for the same plaintext. Its characteristic is that the random number r is generated through a linear congruential generator. Thus we know:

c2 \equiv m * h^{r} \bmod p

c2\_ \equiv m*h^{(Ar+B) \bmod q} \equiv m*h^{Ar+B}\bmod p

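The second congruence holds because g^q \equiv 1 \bmod p, so h has order dividing q and exponents of h can be reduced modulo q. Then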

c2^A*h^B/c2\_ \equiv m^{A-1}\bmod p

where c2, c2_, A, B, h are all known. So we know

m^{A-1} \equiv t \bmod p
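The identity can be checked numerically on toy parameters. In this sketch q=1289 and p=2q+1=2579 mimic the challenge's p=qk+1 structure, and A, B, x, the initial LCG state, and m are arbitrary small values assumed for illustration. It requires Python 3.8+ for the modular inverse via `pow`.

```python
# Toy parameters (assumed for illustration): p = 2*q + 1 with q prime.
q, p = 1289, 2579

# Same search as the challenge's keygen: first g >= 2 with g^q = 1 (mod p).
g = 2
while pow(g, q, p) != 1:
    g += 1

A, B, x, state = 1000, 77, 345, 1234
h = pow(g, x, p)
m = 42


def encrypt(m, r):
    return pow(g, r, p), m * pow(h, r, p) % p


r1 = state                 # first nonce is the current LCG state
r2 = (A * r1 + B) % q      # second nonce is the next LCG state
c1, c2 = encrypt(m, r1)
c1_, c2_ = encrypt(m, r2)

# t = c2^A * h^B / c2_ (mod p) should equal m^(A-1) (mod p)
t = pow(c2, A, p) * pow(h, B, p) * pow(c2_, -1, p) % p
assert t == pow(m, A - 1, p)
```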

Suppose g is a primitive root modulo p (this is not the public g, whose order is only q). Then we can let

g^x \equiv t

g^y \equiv m

Then

g^{y(A-1)}\equiv g^x \bmod p

So

y(A-1) \equiv x \bmod p-1

And thus we know

y(A-1)-k(p-1)=x

Since A and p are known, we can use the Extended Euclidean Algorithm to find s and w such that

s(A-1)+w(p-1)=gcd(A-1,p-1)

Let d=gcd(A-1,p-1). Since m^{p-1} \equiv 1 \bmod p by Fermat's little theorem, we can directly compute

t^s \equiv m^{s(A-1)} \equiv m^d \bmod p

If d=1, then we directly know m.

If d is not 1, we would additionally need to take a d-th root modulo p, which is more troublesome.

In this problem, d happens to be 1, so it can be solved quite easily.
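The d=1 case can be sketched without gmpy2 as follows. The values p=2579, A=1000, and m=42 are toy parameters assumed for illustration, and `egcd` is a hypothetical helper. The Bezout coefficient s may be negative, so it is reduced modulo p-1 before exponentiation, which is valid because t^{p-1} \equiv 1 \bmod p.

```python
def egcd(a, b):
    """Return (d, s, w) with s*a + w*b == d == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    d, s, w = egcd(b, a % b)
    return d, w, s - (a // b) * w


# Toy parameters (assumed for illustration).
p, A, m = 2579, 1000, 42
t = pow(m, A - 1, p)               # what the attacker computes from the ciphertexts

d, s, w = egcd(A - 1, p - 1)
assert d == 1                      # the easy case discussed above
recovered = pow(t, s % (p - 1), p) # t^s = m^(s*(A-1)) = m^d = m (mod p)
assert recovered == m
```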

import gmpy2
data = open('./transcript.txt').read().split('\n')
g, h, A, B, p, q = eval(data[0])

c1, c2 = eval(data[1])
c1_, c2_ = eval(data[2])

tmp = gmpy2.powmod(c2, A, p) * gmpy2.powmod(h, B, p) * gmpy2.invert(c2_, p)
tmp = tmp % p

print 't=', tmp
print 'A=', A
print 'p=', p
gg, x, y = gmpy2.gcdext(A - 1, p - 1)
print gg

m = gmpy2.powmod(tmp, x, p)
print hex(m)[2:].decode('hex')

flag

  2018-CodeBlue-lagalem git:(master)  python exp.py
t= 24200833701856688878756977616650401715079183425722900529883514170904572086655826119242478732147288453761668954561939121426507899982627823151671207325781939341536650446260662452251070281875998376892857074363464032471952373518723746478141532996553854860936891133020681787570469383635252298945995672350873354628222982549233490189069478253457618473798487302495173105238289131448773538891748786125439847903309001198270694350004806890056215413633506973762313723658679532448729713653832387018928329243004507575710557548103815480626921755313420592693751934239155279580621162244859702224854316335659710333994740615748525806865323
A= 22171697832053348372915156043907956018090374461486719823366788630982715459384574553995928805167650346479356982401578161672693725423656918877111472214422442822321625228790031176477006387102261114291881317978365738605597034007565240733234828473235498045060301370063576730214239276663597216959028938702407690674202957249530224200656409763758677312265502252459474165905940522616924153211785956678275565280913390459395819438405830015823251969534345394385537526648860230429494250071276556746938056133344210445379647457181241674557283446678737258648530017213913802458974971453566678233726954727138234790969492546826523537158
p= 36416598149204678746613774367335394418818540686081178949292703167146103769686977098311936910892255381505012076996538695563763728453722792393508239790798417928810924208352785963037070885776153765280985533615624550198273407375650747001758391126814998498088382510133441013074771543464269812056636761840445695357746189203973350947418017496096468209755162029601945293367109584953080901393887040618021500119075628542529750701055865457182596931680189830763274025951607252183893164091069436120579097006203008253591406223666572333518943654621052210438476603030156263623221155480270748529488292790643952121391019941280923396132717
1
CBCTF{183a3ce8ed93df613b002252dfc741b2}
