Vigenère cipher

The Vigenère cipher is a method of encryption invented by Giovan Batista Belaso and described in his 1553 book La cifra del. Sig. Giovan Batista Belaso. It was misattributed to Blaise de Vigenère in the 19th century, and given his name. The cipher is a keyword-based system that uses a series of different Caesar ciphers based on the letters of the keyword. It is a simplified version of the more general polyalphabetic substitution cipher, invented by Alberti circa 1465.
This cipher is well-known because while it is easy to understand and implement, it often appears to beginners to be unbreakable. Consequently, many programmers have implemented obfuscation or encryption schemes in their applications which are essentially Vigenère ciphers, only to have them broken by the first cryptanalyst who comes along.
Description

In a Caesar cipher, each letter of the alphabet is shifted along some number of places; for example, in a Caesar cipher of shift 3, A would become D, B would become E and so on. The Vigenère cipher consists of using several Caeser ciphers in sequence with different shift values.
To encipher, a table of alphabets can be used, termed a tabula recta or a Vigenère square. It consists of the alphabet written out 26 times in different rows, each alphabet shifted cyclically to the right compared to the previous alphabet, corresponding to the 26 possible Caesar ciphers. At different points in the encryption, an encipherer uses a different alphabet from one of the rows. The alphabet used at each point depends on a repeating keyword.
For example, suppose the encipherer wishes to encrypt a plaintext:
- ATTACKATDAWN
The encipherer chooses a keyword and repeats it until it matches the length of the plaintext, for example, the keyword "LEMON":
- LEMONLEMONLE
The first letter of the plaintext, A, is enciphered using the alphabet in row L, which is the first letter of the key. This is done by looking at the letter in row L and column A of the Vigenère square, namely L. Similarly, for the second letter of the plaintext, the second letter of the key is used; the letter at row E and column T is X. The rest of the plaintext is enciphered in a similar fashion:
Plaintext: | ATTACKATDAWN |
Key: | LEMONLEMONLE |
Ciphertext: | LXEOPVEEQNHR |
Decryption is performed by finding the position of the ciphertext letter in a row of the table, and then taking the label of the column in which it appears as the plaintext. For example, in row L, the ciphertext L appears in column A, which taken as the first plaintext letter. The second letter is decrypted by looking up X in row E of the table; it appears in column T, which is taken as the plaintext letter.
Vigenère can also be viewed algebraically. If the letters A–Z are taken to be the numbers 0–25, and addition is performed modulo 26, then Vigenère encryption can be written,
and decryption,
Cryptanalysis
See also: Kasiski examination
The idea behind the Vigenère cipher is like that of all polyalphabetic ciphers — to make frequency analysis more difficult. Frequency analysis is the practice of decrypting a message by counting the frequency of ciphertext letters, and equating it to the letter frequency of normal text. For instance if P occurred most in a ciphertext whose plaintext is in English one could suspect that P corresponded to E, because E is the most frequently used letter in English. Using the Vigenère cipher, E can be enciphered as any of several letters in the alphabet at different points in the message thus defeating simple frequency analysis.
Noted author and mathematician Charles Ludwidge Dodgson (aka Lewis Carroll) called this cipher unbreakable in his 1868 piece "The Alphabet Cipher" in a children's magazine. In 1917, the Vigenère was described as "impossible of translation" in the respected journal Scientific American. Despite this reputation, however, the cipher can be broken.
The weakness relies on the fact that the key is relatively short and constantly repeated: as a result, common words like "the" will likely be encrypted using the same key letters, leading to repeated groups in the ciphertext. If the message is long enough, therefore, the following procedure can be used:
- A cryptanalyst looks for repeated groups of letters and counts the number of letters between the beginning of each repeated group. For instance if the ciphertext was FGXTHJAQWNFGXQ, the distance between FGX's is 10. The analyst repeats this for as many repeated groups as appear in the text.
- The analyst next factors each of these numbers. If any number is repeated in the majority of these factorings, this is probably the length of the keyword. this is because repeated groups can appear by coincidence, but are much more likely to occur when the same letters are encrypted using the same key letters. The key letters are repeated at multiples of the key length, so the distances found in step 1 are likely to be multiples of the key length.
- Once the keyword length is known, the clever observation of Babbage and Kasiski comes into play. If the keyword is N letters long, then every Nth letter must be enciphered using the same letter of the keytext. Grouping every Nth letter together, the analyst has N "messages", each encrypted using a one-alphabet substitution, and each piece can then be solved using frequency analysis.
- Using the solved message, the analyst can quickly determine what the keyword was. Or, in the process of solving the pieces, the analyst might use guesses about the keyword to assist in breaking the message.
- Once the interceptor knows the keyword, he or she can use that knowledge to read future messages, if the key does not change.
Kasiski actually used "superimposition" to solve the Vigenere cipher. He started by finding the key length, as above. Then he took multiple copies of the message and laid them one-above-another, each one shifted left by the length of the key. Kasiski then observed that each column was made up of letters encrypted with a single alphabet. His method was equivalent to the one described above, but is perhaps easier to picture.
Modern attacks on polyalphabetic ciphers are essentially identical to that described above, with the one improvement of coincidence counting. Instead of looking for repeating groups, a modern analyst would take two copies of the message and lay one above another. (Actually, modern analysts use computers, but this description illustrates the principle.) The analyst then shifts the bottom message one letter to the left, then two letters to the left, etc., each time going through the entire message and counting the number of times the same letter appears in the top and bottom message. The number of "coincidences" goes up sharply when the bottom message is shifted by a multiple of the key length, because then the adjacent letters are in the same language using the same alphabet. Having found the key length, cryptanalysis proceeds as described above using frequency analysis.
The cipher of Blaise de Vigenère
Vigenère actually invented a stronger cipher: an autokey cipher. The name "Vigenère cipher" became associated with this polyalphabetic cipher instead. In fact the two ciphers were often confused, and both were sometimes called "le chiffre indéchiffrable", or "the unbreakable cipher". For nearly 300 years this cipher was thought to be unbreakable, but Charles Babbage and Friedrich Kasiski independently found a way to break it the middle of the 19th century. Babbage actually broke the much stronger autokey cipher, while Kasiski is generally credited with the first published solution to fixed-key polyalphabetic ciphers.
External links
- The Vigenère Cipher as discussed on The Beginner's Guide to Cryptography
- Basic Cryptanalysis at H2G2
- Java Vigenere applet with source code (GNU GPL)