Cryptographic Concepts - Symmetric Encryption
Sep 29, 2017
In this and further Cryptographic Concepts series posts, I will be using a few terms that I feel I should define ahead of time.
- Plaintext: Text that is not encrypted; Original message
- Ciphertext: Text after it has been encrypted; Encrypted message
- Key: Data used to encrypt and/or decrypt messages
- Cipher: The method to encrypt/decrypt messages
Symmetric Key Encryption
Symmetric Encryption uses a single key, which is used both to encrypt AND decrypt a message. In the definitions above, I state that a key is used to encrypt "and/or" decrypt messages because there is another encryption method called Asymmetric Encryption, which uses a different key to encrypt messages than it does to decrypt them. Asymmetric Encryption will be covered in the next part of the Cryptographic Concepts series.
Old and Broken Ciphers
The old, broken symmetric key ciphers that this post will discuss are the Substitution Cipher, Caesar Cipher, and Vigenere Cipher. They are generally quite easy to break with a computer and/or statistical analysis, but are still good learning tools (and fun to use for "back of the cereal box" crypto puzzles).
Many older cryptographic methods used a substitution method, where each letter maps to a different letter. All that matters is that no two letters map to the same letter, so that the mapping can be reversed. An example substitution cipher letter map is shown below.
A → D B → L C → W D → U E → M F → J G → V H → Q I → R J → P K → C L → S M → O N → T O → N P → Y Q → K R → H S → F T → X U → Z V → E W → I X → B Y → A Z → G
There isn't really a pattern for the outputs, so the key would be the entire letter map. If you wanted someone to decrypt a message, they'd need the entire letter map to do so. With this particular letter map, the message QMSSN would decrypt to HELLO. You just take each ciphertext letter and find the input letter that maps to it in this letter map.
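As a rough sketch of how the letter map above could be applied in code, here is a small Python example. The names PLAIN, CIPHER, substitution_encrypt, and substitution_decrypt are my own choices for illustration; only the letter map itself comes from this post.

```python
# Letter map from this post: each letter of PLAIN maps to the letter
# in the same position of CIPHER (A -> D, B -> L, C -> W, ...).
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
CIPHER = "DLWUMJVQRPCSOTNYKHFXZEIBAG"

# Translation tables for both directions. The key is the whole map.
ENCRYPT_MAP = str.maketrans(PLAIN, CIPHER)
DECRYPT_MAP = str.maketrans(CIPHER, PLAIN)


def substitution_encrypt(plaintext: str) -> str:
    """Replace every letter with its mapped letter."""
    return plaintext.upper().translate(ENCRYPT_MAP)


def substitution_decrypt(ciphertext: str) -> str:
    """Look each ciphertext letter up in the reversed map."""
    return ciphertext.upper().translate(DECRYPT_MAP)


print(substitution_encrypt("HELLO"))  # QMSSN
print(substitution_decrypt("QMSSN"))  # HELLO
```

Characters that are not in the map (spaces, digits, punctuation) pass through unchanged in this sketch.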
The problem with this cipher is that one letter ALWAYS maps to one other letter, so you can tell when a plaintext message uses the same letter just by looking at the ciphertext. Statistical analysis can be used to decrypt ciphertext, because some letters (for instance, the letter E) are used much more often in plaintext messages than others.
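To give a feel for what the first step of such statistical analysis might look like, here is a minimal Python sketch that counts how often each letter appears in a ciphertext. The function name letter_frequencies is hypothetical, and a real attack would need a much longer ciphertext than this toy example.

```python
from collections import Counter


def letter_frequencies(text: str) -> list:
    """Return (letter, count) pairs, most frequent letter first."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    return Counter(letters).most_common()


# In the ciphertext QMSSN the letter S appears twice, which already
# reveals that the plaintext has a repeated letter in those positions.
print(letter_frequencies("QMSSN")[0])  # ('S', 2)
```

With enough ciphertext, the most frequent ciphertext letters are good guesses for the most frequent plaintext letters, such as E.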
Caesar Cipher (Shift Cipher)
Another form of the substitution cipher is the Caesar Cipher, also sometimes called the shift cipher, where every letter maps to a different letter, but this time the letters are simply shifted ahead rather than being completely different. For example, we can shift the alphabet by 3, making A → D, B → E, C → F and so forth. When the shifted alphabet gets to Z, it just starts over at A again, so in this case, V → Y, W → Z, and X → A. The alphabet with a shift value of 3 is shown below.
A → D B → E C → F D → G E → H F → I G → J H → K I → L J → M K → N L → O M → P N → Q O → R P → S Q → T R → U S → V T → W U → X V → Y W → Z X → A Y → B Z → C
Because we are just shifting every letter by the same amount, the key would simply be 3 rather than the entire letter map. If you want to decrypt a message using a shift value of 3, all you would have to do is shift the letters back by 3. As an example, the ciphertext KHOOR would decrypt to the plaintext HELLO when all of the letters are shifted back by 3.
This example used a key of 3 because Julius Caesar apparently used a key of 3 for sending military related messages. So the plaintext message VENI VIDI VICI (I came, I saw, I conquered) would have created the ciphertext YHQL YLGL YLFL.
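The shifting described above can be sketched in a few lines of Python. The function name caesar_shift is my own, and this version assumes only the 26 letters A to Z are shifted while everything else is left alone.

```python
def caesar_shift(text: str, shift: int) -> str:
    """Shift every letter A-Z by `shift` places, wrapping around at Z."""
    result = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            position = (ord(ch) - ord("A") + shift) % 26
            result.append(chr(position + ord("A")))
        else:
            # Spaces and punctuation are left unchanged.
            result.append(ch)
    return "".join(result)


print(caesar_shift("VENI VIDI VICI", 3))  # YHQL YLGL YLFL
print(caesar_shift("KHOOR", -3))          # HELLO
```

Decryption is just a shift in the opposite direction, which is why the same function handles both by passing a negative shift.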
A major downfall of this cipher is that EVERY letter is shifted by the same key, so you would need at most 25 attempts to decrypt a message (a shift of 0 or 26 leaves the message unchanged), with fewer tries needed if you use statistical analysis of the letters being used. This cipher also has the same problem as the substitution cipher mentioned earlier: if a letter appears more than once in the ciphertext, you know that the plaintext letters in those positions are the same as each other as well.
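To show how small the key space is, here is a sketch of a brute force attack in Python that simply tries every non-zero shift. The names caesar_shift and brute_force are hypothetical helpers written for this example.

```python
def caesar_shift(text: str, shift: int) -> str:
    """Shift every letter A-Z by `shift` places, wrapping around at Z."""
    return "".join(
        chr((ord(ch) - ord("A") + shift) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in text.upper()
    )


def brute_force(ciphertext: str) -> dict:
    """Map every possible key (1 through 25) to its candidate plaintext."""
    return {key: caesar_shift(ciphertext, -key) for key in range(1, 26)}


candidates = brute_force("KHOOR")
for key, guess in candidates.items():
    print(key, guess)  # the line for key 3 reads HELLO
```

A person (or a dictionary check) only has to scan the candidates for the one that reads as real text.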
The Vigenere Cipher is similar to the Caesar Cipher but fixes the problem of letter reuse. To create a key, you need to come up with a word. In this case we will use the word CRYPTO. Now you need to repeat the key over and over again until it's as long as the plaintext you want to encrypt. For this example, we'll use HELLOWORLD as our plaintext, so the key will become CRYPTOCRYP. Because the key repeats over and over again, we only need to give out the key CRYPTO in order for someone to decrypt the ciphertext message. We now assign a shifting value to each letter of the key, so A → 0, B → 1, C → 2, ..., X → 23, Y → 24, Z → 25.
To encrypt a message with this key, we align the plaintext with the key and shift each letter in the plaintext according to the shift value of the letter of the key in that position: H + 2 → J, E + 17 → V, L + 24 → J, L + 15 → A, O + 19 → H, W + 14 → K, O + 2 → Q, R + 17 → I, L + 24 → J, D + 15 → S. So the plaintext HELLOWORLD encrypts to the ciphertext JVJAHKQIJS. To decrypt the ciphertext, you need to shift the letters of the ciphertext backwards by the key.
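The per-letter shifting above can be sketched in Python as follows. The function name vigenere and its decrypt flag are my own choices, and this version assumes the text and key contain only the letters A to Z.

```python
from itertools import cycle


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter of `text` by the matching letter of the repeated key."""
    direction = -1 if decrypt else 1
    result = []
    # cycle() repeats the key for as long as there is text to process.
    for ch, key_ch in zip(text.upper(), cycle(key.upper())):
        shift = ord(key_ch) - ord("A")  # A -> 0, B -> 1, ..., Z -> 25
        position = (ord(ch) - ord("A") + direction * shift) % 26
        result.append(chr(position + ord("A")))
    return "".join(result)


print(vigenere("HELLOWORLD", "CRYPTO"))                # JVJAHKQIJS
print(vigenere("JVJAHKQIJS", "CRYPTO", decrypt=True))  # HELLOWORLD
```

Note how the two Ls in HELLO encrypt to different letters (J and A), which is exactly the improvement over the Caesar Cipher.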
Statistical analysis attacks can work on Vigenere Ciphers, although they are much harder to do. With that said, this cipher is still broken. The more messages that use the same key, the easier it will be to decrypt the ciphertext without needing to be given the key.
Perfect Secrecy Cipher
While the ciphers mentioned before can be broken by statistical analysis or simple brute force on a computer (or even by hand in many cases), there is an encryption cipher that is perfectly secure. To be perfectly secure, no information about the plaintext can be gained by seeing the ciphertext. That cipher is called the One Time Pad.
One Time Pad (OTP)
The One Time Pad is an encryption cipher that uses a randomly generated key ONCE and ONLY ONCE. The randomness must be truly random and the key cannot be used more than once. The key is a random set of letters that is as long as the plaintext message, so for the plaintext message HELLO, we could use a randomly generated key of ZTOIU.
Just like the Vigenere Cipher, we use the key to figure out how many letters to shift each plaintext letter ahead by: H + 25 → G, E + 19 → X, L + 14 → Z, L + 8 → T, O + 20 → I. So the plaintext HELLO would encrypt to the ciphertext GXZTI. To decrypt the ciphertext, you need to shift the letters of the ciphertext backwards by the key.
Because every letter is equally likely to appear at every position of the key, every plaintext of the same length is an equally plausible decryption, so it is impossible to recover the plaintext from the ciphertext without knowing the key. That is of course as long as you NEVER reuse keys and ALWAYS use a truly random source to select letters from.
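Here is a minimal Python sketch of a letter-based One Time Pad. The names generate_pad and otp_shift are hypothetical, and Python's secrets module is only a stand-in here: it draws from the operating system's cryptographically strong random source, which is an assumption on my part and not the same thing as a truly random source.

```python
import secrets
import string


def generate_pad(length: int) -> str:
    """Pick `length` letters at random (stand-in for a truly random source)."""
    return "".join(secrets.choice(string.ascii_uppercase) for _ in range(length))


def otp_shift(text: str, pad: str, decrypt: bool = False) -> str:
    """Shift each letter by the matching pad letter; the pad is never repeated."""
    if len(pad) != len(text):
        raise ValueError("the pad must be exactly as long as the message")
    direction = -1 if decrypt else 1
    return "".join(
        chr((ord(ch) - ord("A") + direction * (ord(p) - ord("A"))) % 26 + ord("A"))
        for ch, p in zip(text.upper(), pad.upper())
    )


print(otp_shift("HELLO", "ZTOIU"))  # GXZTI (the example key from this post)

pad = generate_pad(5)               # a fresh pad for every message
ciphertext = otp_shift("HELLO", pad)
print(otp_shift(ciphertext, pad, decrypt=True))  # HELLO
```

Unlike the Vigenere sketch, the key is refused if it is shorter than the message, because repeating it would break the one-time property.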
We can use One Time Pads for binary data as well. We have a plaintext that is a string of binary digits and a randomly generated string of binary digits that is just as long for the key. Just like with the letters, we can shift the binary digits as well. This process is called "Exclusive Or" (XOR), but works exactly the same as shifting the letters from before, just with binary digits instead. 0 shifted by 1 is 1, 1 shifted by 1 wraps around and becomes 0, 0 shifted by 0 is 0, and 1 shifted by 0 is 1. If you are using a true source of randomness to generate the key, every binary digit in the ciphertext should have a 50% chance of being a 0 and a 50% chance of being a 1, no matter what the plaintext is.
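The same idea for binary data can be sketched in Python using the ^ (XOR) operator on bytes. The function name xor_bytes is my own, and as before the secrets module is only a stand-in for a truly random key source.

```python
import secrets


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR every byte of `data` with the matching byte of `pad`."""
    if len(data) != len(pad):
        raise ValueError("the pad must be exactly as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


# The four single-bit cases described in the text.
print(0 ^ 1, 1 ^ 1, 0 ^ 0, 1 ^ 0)  # 1 0 0 1

message = b"HELLO"
pad = secrets.token_bytes(len(message))  # fresh pad, used once
ciphertext = xor_bytes(message, pad)
print(xor_bytes(ciphertext, pad))  # b'HELLO'
```

XOR is its own inverse, so the same function both encrypts and decrypts.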
I won't be going into as much detail about how modern encryption ciphers work, but keep in mind that, like the previously mentioned ciphers, you can encrypt and decrypt a message using the same key. Symmetric Encryption tends to be one of two types of ciphers, a Stream Cipher or a Block Cipher.
Stream ciphers are encryption ciphers that encrypt or decrypt one unit at a time. All of the previous examples given, the Substitution Cipher, Caesar Cipher, Vigenere Cipher, and the One Time Pad, are stream ciphers, as you encrypt or decrypt the messages one unit at a time. In the case of using only letters, the unit is a single letter, and in the case of using strings of binary digits, a single binary digit is the unit.
Some well known stream ciphers used today are Salsa20 and a variant family of ciphers called ChaCha.
Block ciphers encrypt and decrypt messages in chunks called blocks. These blocks have specific sizes defined by the encryption cipher itself. An older (and broken) block cipher standard is DES, which stands for Data Encryption Standard and used a block size of 64 bits. We now have a much better block cipher standard called AES (Advanced Encryption Standard) which has a block size of 128 bits and a key size of either 128 bits (AES128), 192 bits (AES192), or 256 bits (AES256). Block ciphers are usually only suitable for encrypting plaintexts the size of the block, so we need to use what is called a Block Cipher Mode of Operation.
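To illustrate why a message rarely fits the block size exactly, here is a small Python sketch that only splits data into AES-sized blocks. It performs no encryption at all, and the names BLOCK_SIZE and split_into_blocks are hypothetical.

```python
BLOCK_SIZE = 16  # AES block size: 128 bits = 16 bytes


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Cut `data` into consecutive chunks of at most `block_size` bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


blocks = split_into_blocks(b"A" * 40)
print([len(block) for block in blocks])  # [16, 16, 8]
```

A 40 byte message gives two full blocks and one short block, and deciding how to handle multiple blocks and leftovers like this is part of what a mode of operation is for.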
Block Cipher Mode of Operation
I will not be going into too much detail about modes of operation, but there are quite a few. The Wikipedia page on block cipher modes of operation does a very good job of explaining them. At the moment, it is highly recommended to use GCM (Galois/Counter Mode), which provides confidentiality of data and lets you detect accidental modification or malicious tampering of encrypted data.
This post explains symmetric key cryptography, a form of cryptography that uses a single key to encrypt a message into ciphertext and then decrypt it back into plaintext. People have been using symmetric encryption for quite a while and we still use it today. Modern encryption cipher standards like AES256 with GCM do well at encrypting plaintext in a way that requires the key to decrypt the ciphertext. Without the key, you shouldn't be able to get the plaintext from the ciphertext.
Come back for the next blog post, which will discuss Asymmetric Cryptography, where the keys used to encrypt and decrypt data are different, which allows others to encrypt a message to you that only you can decrypt.