Le "many-times pad"

ou pourquoi il ne faut pas réutiliser une clef

(Python 3)

L'exercice suivant (extrait d'un cours en ligne de Stanford University) montre comment la réutilisation d'une clef OTP peut permettre de la reconstituer.

Voici 11 textes chiffés avec la même clef. Le but est de déchiffrer le 11ème.

In [1]:
#ciphertext #1:
s1="315c4eeaa8b5f8aaf9174145bf43e1784b8fa00dc71d885a804e5ee9fa40b16349c146fb778cdf2d3aff021dfff5b403b510d0d0455468aeb98622b137dae857553ccd8883a7bc37520e06e515d22c954eba5025b8cc57ee59418ce7dc6bc41556bdb36bbca3e8774301fbcaa3b83b220809560987815f65286764703de0f3d524400a19b159610b11ef3e"

#ciphertext #2:
s2="234c02ecbbfbafa3ed18510abd11fa724fcda2018a1a8342cf064bbde548b12b07df44ba7191d9606ef4081ffde5ad46a5069d9f7f543bedb9c861bf29c7e205132eda9382b0bc2c5c4b45f919cf3a9f1cb74151f6d551f4480c82b2cb24cc5b028aa76eb7b4ab24171ab3cdadb8356f"

#ciphertext #3:
s3="32510ba9a7b2bba9b8005d43a304b5714cc0bb0c8a34884dd91304b8ad40b62b07df44ba6e9d8a2368e51d04e0e7b207b70b9b8261112bacb6c866a232dfe257527dc29398f5f3251a0d47e503c66e935de81230b59b7afb5f41afa8d661cb"

#ciphertext #4:
s4="32510ba9aab2a8a4fd06414fb517b5605cc0aa0dc91a8908c2064ba8ad5ea06a029056f47a8ad3306ef5021eafe1ac01a81197847a5c68a1b78769a37bc8f4575432c198ccb4ef63590256e305cd3a9544ee4160ead45aef520489e7da7d835402bca670bda8eb775200b8dabbba246b130f040d8ec6447e2c767f3d30ed81ea2e4c1404e1315a1010e7229be6636aaa"

#ciphertext #5:
s5="3f561ba9adb4b6ebec54424ba317b564418fac0dd35f8c08d31a1fe9e24fe56808c213f17c81d9607cee021dafe1e001b21ade877a5e68bea88d61b93ac5ee0d562e8e9582f5ef375f0a4ae20ed86e935de81230b59b73fb4302cd95d770c65b40aaa065f2a5e33a5a0bb5dcaba43722130f042f8ec85b7c2070"

#ciphertext #6:
s6="32510bfbacfbb9befd54415da243e1695ecabd58c519cd4bd2061bbde24eb76a19d84aba34d8de287be84d07e7e9a30ee714979c7e1123a8bd9822a33ecaf512472e8e8f8db3f9635c1949e640c621854eba0d79eccf52ff111284b4cc61d11902aebc66f2b2e436434eacc0aba938220b084800c2ca4e693522643573b2c4ce35050b0cf774201f0fe52ac9f26d71b6cf61a711cc229f77ace7aa88a2f19983122b11be87a59c355d25f8e4"

#ciphertext #7:
s7="32510bfbacfbb9befd54415da243e1695ecabd58c519cd4bd90f1fa6ea5ba47b01c909ba7696cf606ef40c04afe1ac0aa8148dd066592ded9f8774b529c7ea125d298e8883f5e9305f4b44f915cb2bd05af51373fd9b4af511039fa2d96f83414aaaf261bda2e97b170fb5cce2a53e675c154c0d9681596934777e2275b381ce2e40582afe67650b13e72287ff2270abcf73bb028932836fbdecfecee0a3b894473c1bbeb6b4913a536ce4f9b13f1efff71ea313c8661dd9a4ce"

#ciphertext #8:
s8="315c4eeaa8b5f8bffd11155ea506b56041c6a00c8a08854dd21a4bbde54ce56801d943ba708b8a3574f40c00fff9e00fa1439fd0654327a3bfc860b92f89ee04132ecb9298f5fd2d5e4b45e40ecc3b9d59e9417df7c95bba410e9aa2ca24c5474da2f276baa3ac325918b2daada43d6712150441c2e04f6565517f317da9d3"

#ciphertext #9:
s9="271946f9bbb2aeadec111841a81abc300ecaa01bd8069d5cc91005e9fe4aad6e04d513e96d99de2569bc5e50eeeca709b50a8a987f4264edb6896fb537d0a716132ddc938fb0f836480e06ed0fcd6e9759f40462f9cf57f4564186a2c1778f1543efa270bda5e933421cbe88a4a52222190f471e9bd15f652b653b7071aec59a2705081ffe72651d08f822c9ed6d76e48b63ab15d0208573a7eef027"

#ciphertext #10:
s10="466d06ece998b7a2fb1d464fed2ced7641ddaa3cc31c9941cf110abbf409ed39598005b3399ccfafb61d0315fca0a314be138a9f32503bedac8067f03adbf3575c3b8edc9ba7f537530541ab0f9f3cd04ff50d66f1d559ba520e89a2cb2a83"

#target ciphertext (decrypt this one):
s11="32510ba9babebbbefd001547a810e67149caee11d945cd7fc81a05e9f85aac650e9052ba6a8cd8257bf14d13e6f0a803b54fde9e77472dbff89d71b57bddef121336cb85ccb8f3315f4b52e301d16e9f52f904"
In [2]:
# Commençons par une petite fonction utilitaire
from codecs import encode, decode

def h2i(xx):
    return decode(xx, 'hex')

# Exemple : 0x31 = 49 = ord('1')
print (s1[:20])
print (h2i(s1[:20]))
315c4eeaa8b5f8aaf917
b'1\\N\xea\xa8\xb5\xf8\xaa\xf9\x17'
In [3]:
# Appliquons la à chaque cryptogramme
ss=[h2i(eval('s'+str(i))) for i in range(1,12)]

# Et calculons les XOR deux à deux des cryptogrammes
bb ={}
for i in range(len(ss)):
    for j in range(i+1,len(ss)):
        aa=zip(ss[i],ss[j])
        bb[i,j]=[a[0]^a[1] for a in aa]
        

# bb[i,j] est donc aussi le XOR des textes clairs i et j

# La première idée est de commencer à repérer les espaces
print (ord(' '))
print (bin(ord(' ')))
32
0b100000
In [4]:
# En effet, le XOR avex 32 = 0x20 = 0b100000 échange les majuscules et les minuscules !
from string import ascii_letters as letters
ll=[ord(' ')^ord(c) for c in  letters]
print (letters)
print (''.join(map(chr,ll)))
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
In [5]:
# On aura donc toujours des lettres aux positions  k des espaces des messages i ou j (on en sait pas lequel)
# Par exemple, pour chacune des positions suivantes, il y a un espace dans le message 2 ou dans le message 4
' '.join([str(k) for k in bb[2,4] if chr(k) in letters])
Out[5]:
'66 84 84 79 89 107 69 81 79 83 67 87 75 83 67 79 82 69 79 67 69 90 83 76 69 67 98'
In [6]:
# On construit un dictionnaire kk de valeurs possibles des octets de la clef
# kk[i] est la liste des valeurs qui peuvent donner un espace aux positions possibles
# dans chaque chaîne

kk ={}

for k in bb:     # (k est un couple)
    for i in range(len(bb[k])):
        if chr(bb[k][i]) in letters: # possibilités pour l'octet i de la clef
            if i in kk: kk[i].extend([ss[k[0]][i]^32, ss[k[1]][i]^32])
            else: kk[i] = [ss[k[0]][i]^32, ss[k[1]][i]^32]

print (kk[0], len(kk[0]))
[17, 102, 3, 102, 18, 102, 18, 102, 31, 102, 18, 102, 18, 102, 17, 102, 7, 102, 102, 18] 20
In [17]:
# On se retrouve avec 2O possibilités, pas toutes différentes
for i in set(kk[0]): print (i, kk[0].count(i))
3 1
102 10
7 1
17 2
18 5
31 1
In [18]:
# On voit que 102 est plus fréquent que les autres. On fait donc l'hypothèse que le premier octet de la clef est 102
hex(102)
Out[18]:
'0x66'
In [7]:
# On automatise ce calcul
def most_common(xx):
    aa = set(xx)
    m = max([xx.count(x) for x in aa])
    return  [x for x in aa if xx.count(x)==m]

sols = dict([(i,most_common(kk[i])) for i in kk])

# Il peut rester plusieurs choix 
print (sols)
{2: [110], 5: [219], 6: [216], 11: [42], 13: [99], 17: [175], 20: [170], 24: [160], 25: [110], 27: [201], 31: [11], 32: [105], 35: [154], 39: [64], 40: [26], 47: [102], 50: [254], 51: [240], 54: [72], 55: [205], 57: [232], 58: [2], 63: [119], 64: [51], 73: [107], 74: [38], 80: [60], 83: [16], 84: [149], 89: [97], 91: [199], 93: [4], 95: [53], 96: [34], 102: [140], 103: [87], 104: [55], 106: [219], 111: [2], 3: [137], 8: [152], 14: [149], 26: [107], 28: [141], 38: [170], 53: [49], 65: [93], 69: [213], 70: [156], 72: [58], 78: [78], 81: [154], 82: [97], 85: [187], 23: [40], 33: [176], 44: [143], 60: [91], 68: [236], 71: [67], 94: [163], 114: [36], 117: [161], 123: [80], 126: [161], 132: [145], 133: [17], 7: [203], 9: [116], 21: [127], 30: [197], 34: [51], 46: [192], 66: [174], 90: [237], 100: [210], 19: [120], 22: [237], 36: [25], 37: [248], 42: [109], 48: [199], 76: [96], 88: [49], 105: [110], 116: [226], 121: [2], 124: [16], 125: [192], 129: [37], 134: [0], 79: [240], 98: [210], 108: [194], 112: [124], 130: [120], 10: [53], 49: [99], 61: [169], 87: [154], 115: [97], 120: [69], 1: [57], 15: [16], 16: [46], 41: [156], 43: [112], 62: [135], 97: [207], 107: [168], 127: [186], 0: [102], 4: [201], 12: [205], 29: [41], 45: [128], 52: [18], 59: [208], 67: [252], 75: [139], 77: [191], 18: [206], 56: [216], 139: [233], 141: [2], 122: [27], 143: [196], 148: [169], 154: [138], 155: [168, 238], 156: [192, 130], 157: [209, 131], 160: [50, 103], 169: [76, 5], 144: [171, 239]}
In [8]:
# Pour voir, on essaie le premier élément de chaque liste s'il y en a plusieurs
for j in range(11):
    t = []
    for i in range(len(ss[j])):
        if i in sols: t.append( chr(ss[j][i]^sols[i][0] ) )
        else: t.append('?')
    print (''.join(t))
We can aactor the number  5 with quantum computers. We can also factor the number 15-w?th a ?og tra?n?d to ba?? t?rhe ??me - Ro?er? Ha????
Euler whuld probably enjoh that now his theorem becomes a corner stone of crypto - Acn?nymou? on Eu?e?'s theo??m
The nicb thing about Keey}oq is now we cryptographers can drive a lot of fancy cars   ?an Bo?eh
The cipoertext produced bh a weak encryption algorithm looks as good as ciphertext po?uced ?y a st?o?g encry??io? llg??itdm - P?il?p Z????r?a?n
You don t want to buy a stt of car keys from a guy who specializes in stealing cars   ?arc R?tenber? ?ommenti?? o? Nli??er
There aue two types of crhptography - that which will keep secrets safe from your liyt?e sis?er, an? ?hat whi?? w?la k??p ecret? s?fe ???? ?o?rd???e?????  b ?? ????????i??
There aue two types of cyatography: one that allows the Government to use brute forch ?o bre?k the ?o?e, and ??e ?hlt ??queres t?e ?ove????n? ?od??? ?????tf r??u???????? ????????????????
We can tee the point whert the chip is unhappy if a wrong bit is sent and consumes mbr? powe? from ?h? enviro??en?   A?? Sdamir
A (privfte-key)  encrypti~n scheme states 3 algorithms, namely a procedure for generlt?ng ke?s, a p?o?edure f?? e?cyp??ng  and ? p?oce???? ?o?  ???y?????z
 The Coicise OxfordDictioary (2006) defines crypto as the art of  writing o r solvdn? code?. 
The secuet message is: Whtn using a stream cipher, never use the key more than once
In [9]:
# Ce n'est pas tout-à-fait bon, mais on voit du texte clair apparaître
# Pour finir, il suffit de deviner le déchiffrage correct d'au moins un des messages, par exemple le premier 
# (google retrouve la citation et son auteur)
q="We can factor the number 15 with quantum computers. We can also factor the number 15 with a dog trained to bark three times -- Robert Harley"

k1=[ord(q[i])^ss[0][i] for i in range(len(ss[0]))]
m10=[k1[i]^ss[10][i] for i in range(len(ss[10]))]
''.join(map(chr,m10))
Out[9]:
'The secret message is: When using a stream cipher, never use the key more than once'
In [ ]:
 
In [ ]: