Skip to main content

Why is unauthenticated encryption insecure?

 https://cybergibbons.com/reverse-engineering-2/why-is-unauthenticated-encryption-insecure/

Cryptography is a complex subject. There are many subtle issues that can be introduced if you don’t know what you are doing.

There is a common mantra: “don’t roll your own crypto”. This is because both inexperienced and experienced developers frequently build cryptographic systems that are insecure.

However, there has to be a line – when does it start becoming “rolling your own”? Particularly in embedded systems, there are times when custom protocols need to be used, and developers stray into the dangerous area of cryptography.

One of the most common mistakes we have seen is the use of unauthenticated encryption.

What is encryption?

Encryption is encoding a plaintext into a ciphertext using a key, with the goal of keeping the plaintext confidential.

Only someone with the correct key should be able to decrypt the ciphertext and turn it back into plaintext.

Encryption provides confidentiality. It stops someone working out what the message is.

So what’s the issue?

An attacker can modify the ciphertext and cause the plaintext to change. There is no inherent means in encryption to detect this change.

Encryption does not provide authenticity. You cannot check that the message is genuine and has not been tampered with.

What can an attacker do with this?

I’m going to describe one attack against unauthenticated encryption.

Many encryption algorithms only operate on fixed-size blocks of data – they are called block ciphers. To encrypt longer lengths of data, a mode of operation is used to apply the block cipher repeatedly.

One mode of operation is called CBC (Cipher Block Chaining). When encrypting the data, the previous ciphertext block is mixed into the current plaintext block using an operation called “exclusive OR“. This is denoted with the + in a circle in diagrams.

There is also an input called the initialisation vector, or IV. This is a random input to the algorithm, and is intended to ensure that the ciphertext is different, even if the same plaintext is encrypted. This prevents leaking information about the content.

The initialisation vector is transmitted alongside the ciphertext.

Decryption is similar. The previous ciphertext block is exclusive ORed with the output of the block cipher to obtain the plaintext.

Exclusive OR is a deterministic operation. If we look at a single bit, then it operates as follows:

ABOutput
000
011
101
110

I always think of this as “if one input is high, invert the other input, otherwise leave it alone”.

The operation is carried out for each bit in a byte.

A: 0 1 0 1 1 0 0 1 (0x59)
B: 1 1 1 1 0 0 0 0 (0xF0)
O: 1 0 1 0 1 0 0 1 (0xA9)

What this means is that modifying one of the inputs to exclusive OR results in a predictable change to the output. And the operation can be easily reversed.

A: 0123456789ABCDEF
B: FFFF00FFF00F0FF0
O: FEDC459879A4C21F

If we now exclusive OR the output with one of the inputs:

A: FEDC459879A4C21F
B: FFFF00FFF00F0FF0
O: 0123456789ABCDEF

Hopefully that explains exclusive OR.

Let’s look back to how CBC uses this in decryption. In the first block, the IV is exclusive ORed with the output of the block cipher. The IV is transmitted alongside the ciphertext and an attacker can modify both at at will.

We can encrypt the string “A dog’s breakfast” using a key and the initialisation vector of all 0x00 (here on CyberChef).

Key: 0123456789ABCDEF0123456789ABCDEF
IV:  0000000000000000000000000000000
Plaintext: A dog's breakfast
Ciphertext: c7b1d96f0f520f33faaccfdc107f718aafe8892c3a29c76b0732a760a0f54f50

Of course, this can be decrypted (here on CyberChef).

If I change just one byte in the ciphertext, the entire message is corrupted (here on Cyberchef). There’s no way for me to predictably modify this plaintext by changing the ciphertext.

Key: 0123456789ABCDEF0123456789ABCDEF
IV:  0000000000000000000000000000000
Ciphertext: c7b2d96f0f520f33faaccfdc107f718aafe8892c3a29c76b0732a760a0f54f50
Plaintext: .L...Q½êU...ì7Ò.t

But the attacker also has control over the IV. Let’s set the first byte of the IV to 0xFF (here on CyberChef). Only the first byte of the plaintext has changed!

Key: 0123456789ABCDEF0123456789ABCDEF
IV:  FF00000000000000000000000000000
Ciphertext: c7b1d96f0f520f33faaccfdc107f718aafe8892c3a29c76b0732a760a0f54f50
Plaintext: ¾ dog's breakfast

And it has changed predictably. The capital A (ASCII 0x41) has been exclusive ORed with 0xFF to become 0xBE (which decodes as ¾ although it’s above the normal ASCII range).

A: 0 1 0 0 0 0 0 1 (0x41)
B: 1 1 1 1 1 1 1 1 (0xFF)
O: 1 0 1 1 1 1 1 0 (0xBE)

This is a very high level of control! The attacker can now modify the plaintext without detection. Let’s try and significantly change the meaning of it.

The original message contained “A dog’s breakfast”. Can we change this canine feast into a feline one?

We exclusive OR the original plaintext with the desired one (here on CyberChef). Notice how the output only has value for the characters we have changed.

Original: A. .d.o.g.'.s. .b.r.e.a.k.f.a.s.t.
Original: 4120646f67277320627265616b66617374
Desired:  A. .c.a.t.'.s. .b.r.e.a.k.f.a.s.t.
Desired:  4120636174277320627265616b66617374
Output:   0000070e13000000000000000000000000

Pop that output in as the IV to the decryption, and we’ve successfully changed the message (here on CyberChef). All of this without even knowing the key.

Key: 0123456789ABCDEF0123456789ABCDEF
IV:  0000070e130000000000000000000000
Ciphertext: c7b1d96f0f520f33faaccfdc107f718aafe8892c3a29c76b0732a760a0f54f50
Plaintext: A cat's breakfast

Of course, the attacker needs to have knowledge of the plaintext to make use of this attack. However, it’s extremely common for some or all of the message to be known. For example, when we visit most websites, the first part of the response will be “HTTP/1.1 200 OK”. If this was only protected by CBC encryption, we could change that to “HTTP/1.1 404 No”, changing the behaviour of the browser (here on CyberChef).

This doesn’t just impact the first block of data either. After the first block, instead of the IV, the previous ciphertext block is used in the exclusive OR operation. The attacker can modify the ciphertext and end up controlling the plaintext.

This comes at a cost though – the previous plaintext block will be totally corrupted as a result.

To illustrate this, we can encrypt a longer block of text (here on CyberChef).

Let’s change “baud” to “cats”. We need to locate the correct place in the ciphertext. AES (the encryption algorithm we are using) works in 16 byte blocks. The word “baud” is 85 characters in, so in the 6th block. We therefore want to modify the 5th block of ciphertext.

The exclusive OR is a bit more complex than last time – we now need to exclusive OR the ciphertext, the original text, and the desired text (here on CyberChef). But change those 4 bytes, and we change the word “baud” to “cats”.

The only issue is, as expected, the previous block has been entirely corrupted. Whilst in this case, it’s made part of the message nonsensical, it frequently has no impact when carrying out attacks.

But there are worse problems?

The above issue allows an attacker to modify the plaintext without detection. This would be an issue in certain situations, such as lock/unlock messages to a door.

But not authenticating your encryption can lead to worse issues. A type of attack called padding oracle attacks can let an attacker obtain the encryption key by sending a large number of specially crafted packets.

Block ciphers only operated on fixed blocks. If the data is shorter than a block, it must be padded. There are a number of ways of doing this, such as appending the number of padding bytes (e.g. 0x02 0x02 or 0x05 0x05 0x05 0x05 0x05). The process of decryption may check this padding is correct or not, and respond differently in each case.

An attacker can exploit these differential responses to leak the key. This can break both the confidentiality and integrity of messages.

What’s the solution to this?

Encryption should always be authenticated. There are two common solutions to this:

  • Add a Message Authentication Code (MAC). This is a keyed cryptographic checksum that provides authenticity and integrity.
  • Use an authenticated mode of operation such as GCM.

Even with this advice, there are many pitfalls. Applying the authentication and encryption in the wrong order can lead to weaknesses; this is so common that it has been deemed the Cryptographic Doom Principle.

Generally, developers shouldn’t be working with cryptography at this level unless they are suitably skilled. That’s easy to say, harder to put into action. There is a big movement to make use of secure-by-default cryptographic libraries and APIs that provide developers with useful functions without giving them so much rope they can hang themselves.

There are scant few reasons for not authenticating encryption.

 

Comments

Popular posts from this blog

The Difference Between LEGO MINDSTORMS EV3 Home Edition (#31313) and LEGO MINDSTORMS Education EV3 (#45544)

http://robotsquare.com/2013/11/25/difference-between-ev3-home-edition-and-education-ev3/ This article covers the difference between the LEGO MINDSTORMS EV3 Home Edition and LEGO MINDSTORMS Education EV3 products. Other articles in the ‘difference between’ series: * The difference and compatibility between EV3 and NXT ( link ) * The difference between NXT Home Edition and NXT Education products ( link ) One robotics platform, two targets The LEGO MINDSTORMS EV3 robotics platform has been developed for two different target audiences. We have home users (children and hobbyists) and educational users (students and teachers). LEGO has designed a base set for each group, as well as several add on sets. There isn’t a clear line between home users and educational users, though. It’s fine to use the Education set at home, and it’s fine to use the Home Edition set at school. This article aims to clarify the differences between the two product lines so you can decide which...

Let’s ban PowerPoint in lectures – it makes students more stupid and professors more boring

https://theconversation.com/lets-ban-powerpoint-in-lectures-it-makes-students-more-stupid-and-professors-more-boring-36183 Reading bullet points off a screen doesn't teach anyone anything. Author Bent Meier Sørensen Professor in Philosophy and Business at Copenhagen Business School Disclosure Statement Bent Meier Sørensen does not work for, consult to, own shares in or receive funding from any company or organisation that would benefit from this article, and has no relevant affiliations. The Conversation is funded by CSIRO, Melbourne, Monash, RMIT, UTS, UWA, ACU, ANU, ASB, Baker IDI, Canberra, CDU, Curtin, Deakin, ECU, Flinders, Griffith, the Harry Perkins Institute, JCU, La Trobe, Massey, Murdoch, Newcastle, UQ, QUT, SAHMRI, Swinburne, Sydney, UNDA, UNE, UniSA, UNSW, USC, USQ, UTAS, UWS, VU and Wollongong. ...

Logic Analyzer with STM32 Boards

https://sysprogs.com/w/how-we-turned-8-popular-stm32-boards-into-powerful-logic-analyzers/ How We Turned 8 Popular STM32 Boards into Powerful Logic Analyzers March 23, 2017 Ivan Shcherbakov The idea of making a “soft logic analyzer” that will run on top of popular prototyping boards has been crossing my mind since we first got acquainted with the STM32 Discovery and Nucleo boards. The STM32 GPIO is blazingly fast and the built-in DMA controller looks powerful enough to handle high bandwidths. So having that in mind, we spent several months perfecting both software and firmware side and here is what we got in the end. Capturing the signals The main challenge when using a microcontroller like STM32 as a core of a logic analyzer is dealing with sampling irregularities. Unlike FPGA-based analyzers, the microcontroller has to share the same resources to load instructions from memory, read/write th...