Security Best Practices: Symmetric Encryption with AES in Java and Android
In this article I will introduce you to the Advanced Encryption Standard (AES), common block modes, why you need padding and initialization vectors, and how to protect your data against tampering. Finally, I’ll show you how to easily implement this in Java while avoiding most security issues.
AES: what every software engineer needs to know
AES, also known by its original name Rijndael, was selected by NIST in 2000 to replace the outdated Data Encryption Standard (DES). AES is a block cipher, which means that encryption takes place on fixed-length groups of bits. In the case of AES, the block length is 128 bits. AES supports key lengths of 128, 192 and 256 bits.
Each block goes through multiple rounds of transformation. I will omit the details of the algorithm here, but interested readers can refer to the Wikipedia article on AES. The important point is that the key length determines the number of transformation rounds (10 rounds for a 128-bit key, 12 for a 192-bit key and 14 for a 256-bit key), while the block size stays the same regardless of key length.
Until May 2009, the only successful published attacks against the full AES were side-channel attacks on specific implementations. (source)
Want to encrypt multiple blocks?
AES by itself encrypts only a single 128-bit block. If we want to encrypt a whole message, we need to choose a block mode that combines multiple blocks into a single ciphertext. The simplest block mode is Electronic Codebook, or ECB. It applies the same unaltered key to every block independently.
This is particularly bad because identical plaintext blocks are encrypted to identical ciphertext blocks.
Try it out for yourself
Remember to never choose this mode unless you only encrypt data smaller than 128 bits. Unfortunately, it is still often misused because it doesn’t require you to provide an initialization vector (more on that later), so it seems easier for developers to handle.
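The weakness of ECB can be reproduced in a few lines. The following is a minimal sketch, not part of the original article: the class and method names are made up for illustration, and it encrypts two identical 16-byte blocks to check whether the two ciphertext blocks match.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class, for illustration only. Do not use ECB in real code.
final class EcbLeakDemo {

    /** Encrypts two identical blocks with AES/ECB and reports whether the ciphertext blocks are equal. */
    static boolean identicalBlocksLeak() {
        try {
            byte[] key = new byte[16];
            new SecureRandom().nextBytes(key);

            byte[] plainText = new byte[32];      // exactly two 16-byte blocks
            Arrays.fill(plainText, (byte) 'A');   // both blocks hold the same content

            Cipher ecb = Cipher.getInstance("AES/ECB/NoPadding");
            ecb.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
            byte[] cipherText = ecb.doFinal(plainText);

            byte[] first = Arrays.copyOfRange(cipherText, 0, 16);
            byte[] second = Arrays.copyOfRange(cipherText, 16, 32);
            return Arrays.equals(first, second);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("identical ciphertext blocks: " + identicalBlocksLeak());
    }
}
```

The method returns true regardless of the key, which is exactly the pattern leak described above.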
One case every block mode has to handle: what happens if the last block is shorter than 128 bits? This is where padding comes into play, which fills up the missing bits of the last block. The simplest scheme fills the missing bits with zeros. The choice of padding scheme has practically no security implications in AES.
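As a rough sketch of the zero-fill approach (the class and method names are assumptions for illustration), padding only has to round the length up to the next multiple of 16 bytes:

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only.
final class ZeroPadding {
    static final int BLOCK_SIZE = 16; // AES block size in bytes

    /** Returns a copy of data whose length is rounded up to a multiple of the block size, filled with zeros. */
    static byte[] pad(byte[] data) {
        int remainder = data.length % BLOCK_SIZE;
        if (remainder == 0) {
            return data.clone(); // already block-aligned, nothing to add
        }
        // Arrays.copyOf fills the newly added bytes with zeros
        return Arrays.copyOf(data, data.length + BLOCK_SIZE - remainder);
    }
}
```

One drawback of zero-fill is that the receiver cannot tell padding from plaintext that really ends in zero bytes. In practice you would let the cipher do the padding, for example by requesting "PKCS5Padding" in the Java transformation string.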
Cipher Block Chaining (CBC)
So what is the alternative to ECB? One is CBC, where the current plaintext block is XORed with the previous ciphertext block before it is encrypted. This way, each ciphertext block depends on all the plaintext blocks that precede it. Using the same image as before, the encrypted result looks like random data indistinguishable from noise.
So what do we do with the first block, which has no predecessor? The simplest approach is to use a block filled with a constant (such as zeros), but this produces the same ciphertext every time the same key and plaintext are encrypted. In addition, if you reuse the same key for different plaintexts, it becomes easier to recover the key. A better approach is to use a random initialization vector (IV). This is just a fancy term for random data that is exactly one block (128 bits) in size. Think of it as the salt of the encryption: an IV can be public, should be random and must be used only once. Note, however, that keeping the IV secret would only hinder decryption of the first block, because CBC XORs each block with the previous ciphertext block, not the previous plaintext.
When transmitting or persisting the data, it is common to simply prepend the IV to the actual ciphertext. If you’re interested in how to use AES-CBC correctly, read Part 2 of this series.
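To make the IV handling concrete, here is a minimal sketch of CBC with a random IV prepended to the ciphertext. The class name and message layout are assumptions for illustration, and the result is not authenticated, so this is not a complete scheme on its own.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: unauthenticated AES-CBC, shown only to illustrate the IV.
final class CbcSketch {
    private static final int IV_LENGTH = 16; // one AES block

    /** Returns [16-byte IV | ciphertext]. */
    static byte[] encrypt(byte[] key, byte[] plainText) {
        try {
            byte[] iv = new byte[IV_LENGTH];
            new SecureRandom().nextBytes(iv); // fresh random IV per message
            Cipher cbc = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cbc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            byte[] cipherText = cbc.doFinal(plainText);

            byte[] message = new byte[IV_LENGTH + cipherText.length];
            System.arraycopy(iv, 0, message, 0, IV_LENGTH);
            System.arraycopy(cipherText, 0, message, IV_LENGTH, cipherText.length);
            return message;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the IV from the first 16 bytes and decrypts the rest. */
    static byte[] decrypt(byte[] key, byte[] message) {
        try {
            Cipher cbc = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cbc.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(message, 0, IV_LENGTH));
            return cbc.doFinal(message, IV_LENGTH, message.length - IV_LENGTH);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the IV is random, encrypting the same plaintext twice with the same key yields two different messages.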
Counter Mode (CTR)
Another option is CTR mode. This mode is interesting because it turns the block cipher into a stream cipher, which means no padding is required. In its basic form, all blocks are numbered from 0 to n. For each block, the key is used to encrypt the IV (also called nonce here) combined with the counter value, and the result is XORed with the plaintext block.
Unlike CBC, encryption can be done in parallel, and every block depends on the IV, not just the first one. A big caveat is that an IV must never be reused with the same key: two messages encrypted with the same key and IV share the same keystream, so an attacker can XOR the two ciphertexts to cancel it out and learn the XOR of the two plaintexts.
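The effect of reusing an IV in CTR mode can be shown directly. In the sketch below (class and method names are made up for illustration), two messages are deliberately encrypted with the same key and the same IV; XORing the two ciphertexts then yields the XOR of the two plaintexts, without the attacker ever touching the key.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class. It reuses an IV on purpose to show why you must not.
final class CtrNonceReuseDemo {

    /** Plain AES-CTR encryption; the ciphertext has exactly the length of the plaintext. */
    static byte[] encrypt(byte[] key, byte[] iv, byte[] plainText) {
        try {
            Cipher ctr = Cipher.getInstance("AES/CTR/NoPadding");
            ctr.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return ctr.doFinal(plainText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** XORs two arrays up to the length of the shorter one. */
    static byte[] xor(byte[] a, byte[] b) {
        byte[] out = new byte[Math.min(a.length, b.length)];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) (a[i] ^ b[i]);
        }
        return out;
    }
}
```

With `c1 = encrypt(key, iv, p1)` and `c2 = encrypt(key, iv, p2)`, the arrays `xor(c1, c2)` and `xor(p1, p2)` are equal, because the shared keystream cancels out.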
Can I make sure that no one can modify my message?
Fact: encryption does not automatically protect against data modification. This is actually a very common attack. For a fuller discussion of the issue, read this article.
So what can we do about it? We simply add a Message Authentication Code (MAC) to the encrypted message. A MAC is similar to a digital signature, except that the verifying key and the authenticating key are practically the same. There are variations of this approach; the one recommended by most researchers is called Encrypt-then-MAC. That is, after encryption, a MAC is computed over the ciphertext and appended to it. You would usually use a Hash-based Message Authentication Code (HMAC) as the MAC type.
Now it starts to get complicated. For integrity and authenticity we have to choose a MAC algorithm, choose how the tag is combined with the encryption, compute the MAC and append it. This is also slow, because the entire message must be processed twice. The receiving side has to do the same in reverse: verify the MAC, then decrypt.
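The MAC half of Encrypt-then-MAC could look roughly like the sketch below. It assumes the message has already been encrypted (IV plus ciphertext) and that a separate key is used for the MAC; the class name, method names and layout are illustrative assumptions, not a vetted protocol.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of the "then-MAC" step using HMAC-SHA256.
final class EncryptThenMacSketch {
    private static final int TAG_LENGTH = 32; // HMAC-SHA256 output size in bytes

    private static byte[] hmac(byte[] macKey, byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(macKey, "HmacSHA256"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Appends a tag computed over the already encrypted message (IV + ciphertext). */
    static byte[] appendTag(byte[] macKey, byte[] ivAndCipherText) {
        byte[] tag = hmac(macKey, ivAndCipherText);
        byte[] out = Arrays.copyOf(ivAndCipherText, ivAndCipherText.length + TAG_LENGTH);
        System.arraycopy(tag, 0, out, ivAndCipherText.length, TAG_LENGTH);
        return out;
    }

    /** Verifies the tag before anything is decrypted and returns IV + ciphertext. */
    static byte[] verifyAndStripTag(byte[] macKey, byte[] message) {
        if (message.length < TAG_LENGTH) {
            throw new SecurityException("message too short");
        }
        int bodyLength = message.length - TAG_LENGTH;
        byte[] body = Arrays.copyOfRange(message, 0, bodyLength);
        byte[] tag = Arrays.copyOfRange(message, bodyLength, message.length);
        // MessageDigest.isEqual compares in constant time
        if (!MessageDigest.isEqual(tag, hmac(macKey, body))) {
            throw new SecurityException("MAC verification failed");
        }
        return body;
    }
}
```

Only after `verifyAndStripTag` succeeds would the receiver hand the returned bytes to the decryption routine.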
Authenticated encryption with GCM
Wouldn’t it be nice if there were a mode that handled all the authentication for you? Fortunately, there is something called authenticated encryption, which simultaneously guarantees the confidentiality, integrity and authenticity of data. One of the most popular block modes that supports this is Galois/Counter Mode, or GCM (it is, for example, available as a cipher suite in TLS v1.2).
GCM is based on CTR mode and additionally computes an authentication tag as it encrypts. The tag is then usually appended to the ciphertext. Its size is an important security property, so it should be 128 bits long, the longest tag GCM supports.
It is also possible to authenticate additional information that is not included in the plaintext. This data is called associated data. Why is this useful? Suppose, for example, the encrypted data has a meta property, the creation date, which is used to check whether the content must be reloaded. An attacker could trivially change the creation date, but if it is added as associated data, GCM will also verify this piece of information and detect the change.
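The creation-date scenario can be sketched as follows. The class and method names are assumptions for illustration; the point is only that decryption fails with an AEADBadTagException as soon as the associated data differs from what was used during encryption.

```java
import java.security.GeneralSecurityException;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class for associated data in AES-GCM.
final class GcmAssociatedDataDemo {

    private static Cipher gcm(int mode, byte[] key, byte[] iv, byte[] associatedData)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, iv));
        cipher.updateAAD(associatedData); // authenticated, but not encrypted
        return cipher;
    }

    static byte[] encrypt(byte[] key, byte[] iv, byte[] plainText, byte[] associatedData) {
        try {
            return gcm(Cipher.ENCRYPT_MODE, key, iv, associatedData).doFinal(plainText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if the ciphertext and the given associated data pass tag verification. */
    static boolean verifies(byte[] key, byte[] iv, byte[] cipherText, byte[] associatedData) {
        try {
            gcm(Cipher.DECRYPT_MODE, key, iv, associatedData).doFinal(cipherText);
            return true;
        } catch (AEADBadTagException e) {
            return false; // ciphertext or associated data was modified
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Encrypting with the associated data "created=2018-01-01" and then calling `verifies` with "created=2019-01-01" returns false, even though the ciphertext itself is untouched.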
Heated discussion: what key size to use?
Intuition says: bigger is better. Obviously, brute-forcing a 256-bit random value is harder than brute-forcing a 128-bit one. As far as we currently understand, brute-forcing through all values of a 128-bit key would require an astronomical amount of energy, which is not realistic for anyone in a reasonable amount of time (looking at you, NSA). So the decision is essentially between practically infinite time and an even longer practically infinite time.
AES actually comes in three different key sizes because it was chosen as a standard encryption algorithm for the US federal government, to be used in all areas under federal control, “including the military”. (…). Hence the fine military minds came up with the idea that there should be three “levels of security”, so that the most important secrets could be encrypted with heavyweight methods, while lower-value data could be encrypted with more practical, lightweight algorithms. (…). So NIST decided to formally follow the regulations (by asking for three key sizes), but also to do the smart thing (the lowest level had to be unbreakable with foreseeable technology) (source).
The argument goes as follows: an AES-encrypted message will probably not be broken by brute-forcing the key, but by other, less expensive (currently unknown) attacks. Such attacks would be just as harmful to 128-bit keys as to 256-bit keys, so choosing a larger key size doesn’t help in this case.
So basically a 128-bit key is secure enough for most use cases, with the exception of protection against quantum computers. In addition, 128-bit encryption is faster than 256-bit encryption, and the key schedule for 128-bit keys seems to be better protected against related-key attacks (although this is irrelevant for most real-world uses).
Side note: side-channel attacks
Side-channel attacks are attacks that exploit issues specific to a particular implementation. Encryption schemes by themselves cannot protect against them. Naive AES implementations can be vulnerable to timing and cache attacks, among others.
As a very basic example: a simple algorithm that is prone to timing attacks is an equals() method that compares two secret byte arrays. If equals() returns early, that is, it leaves the loop after the first pair of bytes that don’t match, an attacker can measure how long equals() takes to complete and guess byte by byte until everything matches.
One fix in this case is to use a constant-time equals. Note that writing constant-time code is not always trivial in languages that run on a managed runtime such as the JVM.
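A constant-time comparison can be sketched like this. The class name is made up for illustration, and the JIT compiler is free to transform the loop, so treat it as best effort; in production code the JDK’s own MessageDigest.isEqual is the safer choice.

```java
// Hypothetical helper showing the idea behind a constant-time comparison.
final class ConstantTime {

    /** Compares two arrays without returning early on the first mismatch. */
    static boolean equalsConstantTime(byte[] a, byte[] b) {
        if (a.length != b.length) {
            return false; // assumption: the length itself is not secret
        }
        int diff = 0;
        for (int i = 0; i < a.length; i++) {
            diff |= a[i] ^ b[i]; // accumulate differences instead of branching
        }
        return diff == 0;
    }
}
```

The loop always runs over the full array, so the running time no longer reveals the position of the first mismatching byte.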
Timing and cache attacks against AES are not merely theoretical; they can even be carried out over a network. While protecting against side-channel attacks is mostly a concern for developers who implement cryptographic primitives, it is wise to get a sense of which coding practices may be detrimental to the security of the whole routine. The most general rule is that observable time-related behavior should not depend on secret data.
In addition, you should carefully consider which implementation you choose. For example, Java 8+ with OpenJDK and the default JCA provider should internally use Intel’s AES-NI instruction set, which protects against most timing and cache attacks by being constant-time and implemented in hardware (while still performing well). Android uses its AndroidOpenSSLProvider, which may internally use AES in hardware (ARM TrustZone), depending on the SoC, but I’m not convinced it offers the same protection as its Intel counterpart. And even with hardware support, other attack vectors remain, such as power analysis. Dedicated hardware exists that protects against most of these issues, namely a Hardware Security Module (HSM). Unfortunately, these devices usually cost upwards of several thousand dollars (fun fact: your chip-based credit card is also an HSM).
Implement AES-GCM in Java and Android
Finally, it gets practical. Modern Java has all the tools we need, but the crypto API may not be the most straightforward one. A careful developer may also be unsure which lengths, sizes and defaults to use. Note: unless noted otherwise, everything applies equally to Java and Android.
In our example, we use a randomly generated 128-bit key. Java automatically chooses the correct mode when you pass a 192-bit or 256-bit key. Note, however, that on older JREs 256-bit encryption may require the JCE Unlimited Strength Jurisdiction Policy files to be installed (this is not needed on Android).
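import java.security.SecureRandom;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

SecureRandom secureRandom = new SecureRandom();
byte[] key = new byte[16]; // 16 bytes = 128-bit key
secureRandom.nextBytes(key);
SecretKey secretKey = new SecretKeySpec(key, "AES");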
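If you do want a larger key, you can check at runtime what the installed policy allows. The following is a minimal sketch; the class and method names are made up for illustration, while Cipher.getMaxAllowedKeyLength and KeyGenerator are standard JCA calls.

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical helper for checking the key size limit and generating AES keys.
final class KeySizeCheck {

    /** Maximum AES key length in bits permitted by the installed policy files. */
    static int maxAesKeyBits() {
        try {
            return Cipher.getMaxAllowedKeyLength("AES");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Generates a random AES key of the given size (128, 192 or 256 bits). */
    static SecretKey generateKey(int bits) {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(bits, new SecureRandom());
            return generator.generateKey();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A caller could fall back to 128 bits whenever `maxAesKeyBits()` reports less than 256.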
Then we have to create our initialization vector. For GCM, NIST recommends a 12-byte (not 16!) random byte array because it’s faster and more secure. Be mindful to always use a strong pseudorandom number generator (PRNG) like SecureRandom.
byte[] iv = new byte[12]; //NEVER REUSE THIS IV WITH SAME KEY
secureRandom.nextBytes(iv);
Then initialize your cipher. AES-GCM mode should be available on most modern JREs and on Android v2.3 and later (although it is only fully functional on SDK 21+). If it happens to be unavailable, install a custom crypto provider such as BouncyCastle, but the default provider is usually preferred. We choose an authentication tag of 128 bits.
final Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
GCMParameterSpec parameterSpec = new GCMParameterSpec(128, iv); //128 bit auth tag length
cipher.init(Cipher.ENCRYPT_MODE, secretKey, parameterSpec);
Add optional associated data (such as metadata) if needed:
if (associatedData != null) {
    cipher.updateAAD(associatedData);
}
Encrypt. If you are encrypting big chunks of data, look into CipherInputStream so that the whole content doesn’t need to be loaded into the heap:
byte[] cipherText = cipher.doFinal(plainText);
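For the streaming case, a rough sketch could look like the following. It uses CipherOutputStream, the write-side sibling of the CipherInputStream mentioned above; the class and method names are assumptions for illustration, and the in-memory helper exists only to show how the pieces fit together.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of chunk-wise encryption.
final class StreamingEncryptSketch {

    /** Pumps the input through an already initialized cipher in 8 KiB chunks. */
    static void encrypt(Cipher initializedCipher, InputStream in, OutputStream out) {
        // Closing the CipherOutputStream finishes the encryption (for GCM this
        // writes the tag) and also closes the underlying stream.
        try (CipherOutputStream cipherOut = new CipherOutputStream(out, initializedCipher)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                cipherOut.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Usage example with in-memory streams; real code would use file or network streams. */
    static byte[] encryptInMemory(byte[] key, byte[] iv, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, iv));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            encrypt(cipher, new ByteArrayInputStream(data), out);
            return out.toByteArray();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The output is the ciphertext followed by the 16-byte tag, the same bytes a single doFinal call would produce.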
Now concatenate everything into a single message:
ByteBuffer byteBuffer = ByteBuffer.allocate(4 + iv.length + cipherText.length);
byteBuffer.putInt(iv.length);
byteBuffer.put(iv);
byteBuffer.put(cipherText);
byte[] cipherMessage = byteBuffer.array();
If you need a string representation, encode it with Base64. Android has a standard implementation of this encoding; the JDK only has one from version 8 on (I would avoid Apache Commons Codec if possible, since it is slow and its implementation is messy).
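On the JDK side this is a one-liner with java.util.Base64. The wrapper class below is only an illustrative assumption; on older Android versions you would use android.util.Base64 instead.

```java
import java.util.Base64;

// Hypothetical wrapper around the JDK 8+ Base64 codec.
final class Base64Sketch {

    static String encode(byte[] cipherMessage) {
        return Base64.getEncoder().encodeToString(cipherMessage);
    }

    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }
}
```

For example, the three bytes 1, 2, 3 encode to the string "AQID".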
That’s basically it for encryption. To construct the message, the IV length, the IV, the encrypted data and the authentication tag are appended to a single byte array. (In Java the authentication tag is automatically appended to the ciphertext; there is no way to handle it yourself with the standard crypto API.)
It is best practice to wipe sensitive data such as the encryption key or IV from memory as quickly as possible. Since Java is a language with automatic memory management, there is no guarantee that the following works as intended, but in most cases it should:
Arrays.fill(key,(byte) 0); //overwrite the content of key with zeros
Be careful not to overwrite data that is still used elsewhere.
Now to the decryption part, which works similarly to encryption. First, deconstruct the message:
ByteBuffer byteBuffer = ByteBuffer.wrap(cipherMessage);
int ivLength = byteBuffer.getInt();
if(ivLength < 12 || ivLength >= 16) { // check input parameter
throw new IllegalArgumentException("invalid iv length");
}
byte[] iv = new byte[ivLength];
byteBuffer.get(iv);
byte[] cipherText = new byte[byteBuffer.remaining()];
byteBuffer.get(cipherText);
Validate input parameters such as the IV length carefully, because an attacker could change the length value to something close to 2³¹, which would allocate about 2 GiB of memory and quickly fill up your heap, making a denial-of-service attack trivial.
Initialize the cipher, add the optional associated data and decrypt:
final Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, iv));
if (associatedData != null) {
    cipher.updateAAD(associatedData);
}
byte[] plainText = cipher.doFinal(cipherText);
That’s all. If you want to see a complete example, check out Armadillo, a project of mine hosted on GitHub that uses AES-GCM.
Conclusion
We need three properties to protect our data:
- Confidentiality: The ability to prevent eavesdroppers from discovering plaintext messages or information about plaintext messages.
- Integrity: The ability to prevent an attacker from modifying a message without legitimate users noticing.
- Authenticity: The ability to prove that a message was generated by a particular party, and to prevent the forgery of new messages. This is usually provided via a Message Authentication Code (MAC). Note that authenticity automatically implies integrity.
AES with Galois/Counter Mode (GCM) provides all of these properties, is fairly easy to use and is available in most Java/Android environments. Just keep the following in mind:
- Use a 12-byte initialization vector that is never reused with the same key (generate it with a strong pseudorandom number generator like SecureRandom).
- Use a 128-bit authentication tag length.
- Use a 128-bit key length (you’ll be fine!).
- Pack everything together into a single message.
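Put together, the checklist could be sketched as one small class. This is a condensed illustration under stated assumptions, not the implementation used in Armadillo: the class name, the strict IV length check and the exception types are my own choices, and the message layout follows the one used in this article (IV length, IV, ciphertext with tag).

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical end-to-end sketch of AES-GCM encryption and decryption.
final class AesGcmSketch {
    private static final int IV_LENGTH = 12;        // bytes
    private static final int TAG_LENGTH_BITS = 128;
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Returns [4-byte IV length | IV | ciphertext + tag]. */
    static byte[] encrypt(byte[] key, byte[] plainText, byte[] associatedData) {
        try {
            byte[] iv = new byte[IV_LENGTH];
            RANDOM.nextBytes(iv); // fresh IV for every message
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_LENGTH_BITS, iv));
            if (associatedData != null) {
                cipher.updateAAD(associatedData);
            }
            byte[] cipherText = cipher.doFinal(plainText);
            return ByteBuffer.allocate(4 + iv.length + cipherText.length)
                    .putInt(iv.length).put(iv).put(cipherText).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parses the message, verifies the tag and returns the plaintext. */
    static byte[] decrypt(byte[] key, byte[] cipherMessage, byte[] associatedData) {
        if (cipherMessage.length < 4 + IV_LENGTH + TAG_LENGTH_BITS / 8) {
            throw new IllegalArgumentException("message too short");
        }
        ByteBuffer buffer = ByteBuffer.wrap(cipherMessage);
        int ivLength = buffer.getInt();
        if (ivLength != IV_LENGTH) { // assumption: this sketch only accepts 12-byte IVs
            throw new IllegalArgumentException("invalid iv length");
        }
        byte[] iv = new byte[ivLength];
        buffer.get(iv);
        byte[] cipherText = new byte[buffer.remaining()];
        buffer.get(cipherText);
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_LENGTH_BITS, iv));
            if (associatedData != null) {
                cipher.updateAAD(associatedData);
            }
            return cipher.doFinal(cipherText);
        } catch (AEADBadTagException e) {
            throw new SecurityException("authentication failed", e);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A round trip is then `decrypt(key, encrypt(key, data, aad), aad)`; flipping any bit of the message or passing different associated data makes `decrypt` throw instead of returning corrupted plaintext.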
Further reading:
Security Best Practices: Symmetric Encryption with AES in Java and Android: Part 2: AES-CBC + HMAC.
References:
- patrickfav/armadillo
- patrickfav/bytes-java