great article for more details.
After some trial and error, and based on this assumption, I came up with the following power leakage model for the first decryption round.
Where:
This may sound complex, but it simply reflects the use of a pre-computed lookup table, as defined in the previously linked article.
This model does work, but a simple “trick” does make it perform even better. The idea, borrowed from Tian et al. (2014), is to introduce some kind of non-linearity to the model.
Hence, let’s define:
Where is a value that can be determined empirically. In the case of the ESP32-C3, gave me satisfying results.
This model is similar to the one used for the first round. Of course, the results of this first round must be computed, so knowledge of is assumed.
Where:
Capturing power consumption traces for the ESP32-C3 doesn’t differ much from doing it for the ESP32.
After configuring the target ESP32-C3 to enable flash encryption, more than 800k measurement cycles were performed.
For some reason, I’ve found the ESP32-C3 to boot slower than the ESP32. Gathering these samples took around five days with the ESP CPA Board.
The analysis process doesn’t change much either. As explained above, the power consumption models need to be updated. However, the pre-processing and CPA methods are roughly identical.
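As a reminder of what the CPA step computes, here is a minimal sketch on synthetic data. It is not the code used for the measurements in this article: the leakage model (Hamming weight of the output of a random 8-bit lookup table), the secret byte, and the trace layout are all assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
HW = np.array([bin(x).count("1") for x in range(256)])
# Hypothetical 8-bit lookup table (a random permutation), NOT the chip's table.
TABLE = rng.permutation(256)

def cpa_correlations(traces, hypotheses):
    """Pearson correlation between every key guess and every trace sample.

    traces:     (n_traces, n_samples) measured power samples
    hypotheses: (n_traces, n_guesses) modeled leakage for each key guess
    returns:    (n_guesses, n_samples) correlation matrix
    """
    t = traces - traces.mean(axis=0)
    h = hypotheses - hypotheses.mean(axis=0)
    num = h.T @ t
    den = (np.sqrt((h ** 2).sum(axis=0))[:, None]
           * np.sqrt((t ** 2).sum(axis=0))[None, :])
    return num / np.maximum(den, 1e-12)

def key_rank(corr, true_guess):
    """Rank of the true guess (0 = best), ordered by peak absolute correlation."""
    peaks = np.abs(corr).max(axis=1)
    order = np.argsort(-peaks)
    return int(np.where(order == true_guess)[0][0])

# Synthetic campaign: Gaussian noise everywhere, modeled leakage at sample 20.
n_traces, n_samples, secret = 2000, 50, 0x3C
data = rng.integers(0, 256, n_traces)
traces = rng.normal(0.0, 1.0, (n_traces, n_samples))
traces[:, 20] += HW[TABLE[data ^ secret]]

guesses = np.arange(256)
hypotheses = HW[TABLE[data[:, None] ^ guesses[None, :]]]
corr = cpa_correlations(traces, hypotheses)
rank = key_rank(corr, secret)
print("rank of the correct guess:", rank)
```

A rank of 0 means the correct guess has the highest correlation peak. The key rank graphs track this value as the number of traces grows.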
The key ranks obtained for the first round are provided below.
All bytes except 0, 4, and 6 can be successfully recovered.
This considerably reduces the search space already, and the remaining three bytes, which still achieved a low ranking, could be brute-forced.
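As a rough worked estimate (simple arithmetic, not a measured figure), exhaustively searching three unknown key bytes costs at most

$$256^3 = 2^{24} \approx 1.7 \times 10^{7}$$

candidates. If the correct value of each of these bytes is assumed to lie within the top $r$ ranked guesses, the search shrinks to $r^3$ candidates. For example, $r = 4$ gives $4^3 = 64$.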
Also, it’s reasonably clear from examining the evolution of all correlations which bytes have been correctly guessed, and which ones might require a bit of trial and error.
For instance, while the result is obvious for byte 8, we have two potential candidates for byte 0.
On the other hand, for the second round, all 16 bytes of the key can directly be recovered.
Given these results, the successful recovery of the key and all the tweaks associated with the attacked data block seems possible.
Detailed results for this attack, as well as supplementary information, are available on the ESP32-C3 Detailed Results page.
As implied by Espressif’s advisory, the ESP32-C3 is also vulnerable to side-channel attacks. However, the use of the XTS encryption mode does indeed make things more difficult. It seems a new attack and its associated measurements are required to crack each 128-byte block of the flash data.
That being said, my next article will demonstrate that having control over the first 128 bytes of the flash data is sufficient to execute arbitrary code, even when the Secure Boot feature of the system is enabled. This code could be used to extract the rest of the content.
ESP32-C6
In its advisory, Espressif indicates:
We will incorporate hardware countermeasures in our future chips to address these vulnerabilities.
The ESP32-C6 is one of these “future” chips, and its datasheet indicates the following feature:
configurable Anti-DPA
Where DPA is a side-channel attack similar to the CPA method we have used so far.
I was curious to test my attacks against this new component, expecting to fail miserably.
The ESP32-C6 utilizes the same encryption method as implemented in the ESP32-C3. Nothing has changed: XTS-AES is still used.
However, according to the technical reference manual, several countermeasures against side-channel attacks have been put into place.
ESP32-C6 has a dual protection mechanism against Differential Power Analysis (DPA) attacks at the hardware level.
- First, a mask mechanism is introduced in the symmetric encryption operation process, which interferes with the power consumption trajectory by masking the real data in the operation process. This security mechanism cannot be turned off.
- Second, the clock selected for the operation will change dynamically in real time, blurring the power consumption trajectory during the operation. For this security mechanism, ESP32-C6 provides 4 security levels for users to choose to adapt to different applications.
The first countermeasure is called the masking countermeasure. The idea is to avoid directly manipulating “sensitive” data. Therefore, the values we relied upon in our previous attacks (such as the output of the lookup table) probably aren’t directly handled by the chip. Instead, they could be XOR-ed with random values, called masks. Carrying out side-channel attacks against such countermeasures, while possible, seems significantly more complicated.
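To illustrate why Boolean masking is expected to defeat a first-order attack, here is a small simulation. It assumes an idealized Hamming-weight leakage with Gaussian noise and a fresh uniform 8-bit mask per operation. None of these details are known for the ESP32-C6, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
HW = np.array([bin(x).count("1") for x in range(256)])

n = 20000
sensitive = rng.integers(0, 256, n)   # value an attacker would like to model
masks = rng.integers(0, 256, n)       # fresh random mask for each operation
masked = sensitive ^ masks            # what a masked design would manipulate

# Assumed leakage: Hamming weight of the manipulated value plus noise.
noise = rng.normal(0.0, 1.0, n)
leak_unprotected = HW[sensitive] + noise
leak_masked = HW[masked] + noise

# First-order model used by the attacker: Hamming weight of the sensitive value.
model = HW[sensitive]
corr_unprotected = np.corrcoef(model, leak_unprotected)[0, 1]
corr_masked = np.corrcoef(model, leak_masked)[0, 1]
print(f"unprotected: {corr_unprotected:.3f}, masked: {corr_masked:.3f}")
```

Under these assumptions, the correlation with the unprotected leakage is high, while the correlation with the masked leakage is statistically indistinguishable from zero.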
The second optional method detailed by Espressif is a hiding countermeasure. The idea is to randomize the clock of the crypto-core of the ESP32-C6. The following table is shared in the manual.
Security Level Name | Configuration Fuse Value | Crypto clock frequency |
---|---|---|
SEC_DPA_OFF | 0 | |
SEC_DPA_LOW | 1 | |
SEC_DPA_MIDDLE | 2 | |
SEC_DPA_HIGH | 3 | |
Hence, at a given timestamp, power traces won’t necessarily reflect the same operation. This can significantly diminish the effectiveness of a CPA attack, unless a method to re-synchronize traces can be developed.
To assess the effectiveness of these countermeasures, several tests have been performed, starting from the security level SEC_DPA_OFF and going all the way up to SEC_DPA_HIGH.
SEC_DPA_OFF
First, the target ESP32-C6 was configured with the flash encryption feature enabled, but with its security level set to SEC_DPA_OFF.
In this configuration:
My goal was to mount a similar attack to the one I used against the ESP32-C3, and recover enough traces to decrypt a 128-byte block of data (i.e., I wanted to obtain the key along with the associated tweaks).
This might be a bit disappointing, but I didn’t have to do much to bypass the new masking countermeasure.
In a somewhat naive manner, I initially used the same power consumption models as for the ESP32-C3, and surprisingly obtained decent results!
Therefore, the models are once again, for the first round:
And for the second:
I managed to improve the results by tweaking the value of the constant. While worked well for the ESP32-C3, suited the ESP32-C6 better.
As usual, the following ranking graphs for the first two rounds can be obtained.
Detailed results for this attack, as well as supplementary information, are available on the ESP32-C6 Detailed Results page.
I’m a bit puzzled by this initial success, and I’m not exactly sure why the countermeasure seems so ineffective. I suspect the random masks could be manipulated at the same time as the masked values. Such behavior is sometimes discussed in the literature (keyword: Zero-Offset 2DPA; see, for instance, Waddle and Wagner (2004)).
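The following sketch illustrates the kind of effect described in that literature. It assumes, purely as a hypothesis, that an 8-bit mask and the corresponding masked value leak their Hamming weights at the same instant. It then enumerates every possible mask to see what remains observable. Whether the ESP32-C6 actually behaves this way is not established here.

```python
import numpy as np

HW = np.array([bin(x).count("1") for x in range(256)])
MASKS = np.arange(256)  # every possible 8-bit mask, assumed equally likely

def simultaneous_leakage_stats(value):
    """Mean and variance of HW(mask) + HW(value ^ mask) over all masks."""
    leak = HW[MASKS] + HW[MASKS ^ value]
    return float(leak.mean()), float(leak.var())

for value in (0x00, 0x0F, 0xFF):
    mean, var = simultaneous_leakage_stats(value)
    print(f"value={value:#04x} HW={HW[value]} mean={mean} variance={var}")
```

In this idealized setting, the mean of the combined leakage is the same for every value, but its variance equals 8 minus the Hamming weight of the unmasked value. Any processing that is sensitive to the spread of the samples (squaring centered traces, or a non-linear model) can therefore still pick up information about the sensitive data.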
Given the success I achieved in “bypassing” the masking countermeasure, I began studying the clock randomization.
Second, the clock selected for the operation will change dynamically in real time, blurring the power consumption trajectory during the operation. For this security mechanism, ESP32-C6 provides 4 security levels for users to choose to adapt to different applications.
Security Level Name | Configuration Fuse Value | Crypto clock frequency |
---|---|---|
SEC_DPA_OFF | 0 | |
SEC_DPA_LOW | 1 | |
SEC_DPA_MIDDLE | 2 | |
SEC_DPA_HIGH | 3 | |
According to the reference manual, it appears that the countermeasure causes the clock to randomly vary within the specified frequency ranges. Achieving this behavior with digital logic alone seems complex to me, unless a PLL circuit is employed.
Instead, I hypothesize that the clock of the crypto-core is simply randomly gated. The probability of this clock being disabled likely depends on the security level; the higher the level, the more often the clock is disabled. The resulting average frequencies could fall within the indicated ranges.
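To make this hypothesis concrete, here is a small simulation of a randomly gated clock. The system clock frequency and the gating probabilities are hypothetical values chosen for illustration and are not taken from the reference manual. The function names are also mine.

```python
import numpy as np

def gated_clock(n_ticks, gate_probability, rng):
    """Boolean array: True where the crypto-core receives a clock edge."""
    return rng.random(n_ticks) >= gate_probability

def tick_of_cycle(enabled, cycle):
    """System tick at which the given crypto-core cycle (0-based) executes."""
    return int(np.flatnonzero(enabled)[cycle])

rng = np.random.default_rng(0)
SYSTEM_CLOCK_MHZ = 160.0  # hypothetical system clock frequency

for p in (0.0, 0.25, 0.5):  # hypothetical gating probabilities
    enabled = gated_clock(100_000, p, rng)
    average_mhz = SYSTEM_CLOCK_MHZ * enabled.mean()
    # Where does the 10th crypto cycle land in five independent executions?
    positions = [tick_of_cycle(gated_clock(200, p, rng), 9) for _ in range(5)]
    print(f"p={p}: average {average_mhz:.1f} MHz, 10th cycle at ticks {positions}")
```

With no gating, a given crypto cycle always lands on the same system tick. As the gating probability grows, the average frequency drops and the same cycle lands on a different tick from one execution to the next, which is exactly what desynchronizes the traces.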
If, during a particular system clock tick, a difference between the crypto-core being clocked and the crypto-core being gated can be observed, an attacker could accurately predict when leakages are likely to occur.
The following graph depicts, at a given timestamp, the distributions of the measured samples under two configurations:
- SEC_DPA_OFF, the clock randomization isn’t enabled.
- SEC_DPA_LOW, the clock randomization is enabled.

A clear difference can be observed between the two distributions. It seems reasonable to assume that, for the SEC_DPA_LOW security level, all samples below 120 units of current (i.e., when the system consumes less current than this threshold) correspond to instances when the crypto-core clock was gated.
With this understanding, it becomes possible to classify the traces into distinct groups, each corresponding to a similar pattern of the crypto-core clock.
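A minimal sketch of this classification step could look as follows. The 120-unit threshold comes from the observation above. Everything else is an assumption for illustration: the function names, the idea of inspecting one sample per system clock tick (`tick_samples`), and the toy traces.

```python
from collections import defaultdict

import numpy as np

GATED_THRESHOLD = 120  # samples below this value are assumed to be gated ticks

def gating_pattern(trace, tick_samples, threshold=GATED_THRESHOLD):
    """Tuple of booleans, True where the crypto-core clock looks gated."""
    return tuple(bool(trace[i] < threshold) for i in tick_samples)

def group_by_pattern(traces, tick_samples, threshold=GATED_THRESHOLD):
    """Map each observed gating pattern to the indices of the matching traces."""
    groups = defaultdict(list)
    for index, trace in enumerate(traces):
        groups[gating_pattern(trace, tick_samples, threshold)].append(index)
    return dict(groups)

# Toy example: four traces, one inspected sample per tick, three ticks.
toy_traces = np.array([
    [150, 90, 160],
    [155, 95, 150],
    [100, 140, 150],
    [152, 91, 158],
])
groups = group_by_pattern(toy_traces, tick_samples=[0, 1, 2])
for pattern, members in groups.items():
    print(pattern, members)
```

Traces 0, 1, and 3 share the same pattern (second tick gated) and end up in one group, while trace 2 (first tick gated) forms its own group.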
Conducting a CPA attack for each of these groups is likely to yield strong results, as all traces within a group are synchronized.
The correlation values obtained from each attack can be combined (multiplied) to derive the final correlation results.
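Here is one possible way to sketch this combination step on synthetic data. How exactly the correlations are multiplied is not detailed above, so this example assumes that the peak absolute correlation of each key guess is multiplied across groups. The leakage model, the lookup table, and the group labels are likewise hypothetical.

```python
import numpy as np

def peak_abs_correlation(traces, hypotheses):
    """Peak |Pearson correlation| over all samples, for each key guess."""
    t = traces - traces.mean(axis=0)
    h = hypotheses - hypotheses.mean(axis=0)
    den = (np.sqrt((h ** 2).sum(axis=0))[:, None]
           * np.sqrt((t ** 2).sum(axis=0))[None, :])
    corr = (h.T @ t) / np.maximum(den, 1e-12)
    return np.abs(corr).max(axis=1)

def combined_scores(traces, hypotheses, group_ids, min_traces=50):
    """Run one CPA per group and multiply the per-guess scores together."""
    scores = np.ones(hypotheses.shape[1])
    for group in np.unique(group_ids):
        selected = group_ids == group
        if selected.sum() >= min_traces:  # skip groups too small to be useful
            scores *= peak_abs_correlation(traces[selected], hypotheses[selected])
    return scores

# Synthetic data: the leakage appears at a different sample in each group.
rng = np.random.default_rng(2)
HW = np.array([bin(x).count("1") for x in range(256)])
TABLE = rng.permutation(256)  # hypothetical lookup table
n_traces, n_samples, secret = 4000, 40, 0xA7
data = rng.integers(0, 256, n_traces)
group_ids = rng.integers(0, 2, n_traces)
leak_sample = np.where(group_ids == 0, 10, 25)
traces = rng.normal(0.0, 1.0, (n_traces, n_samples))
traces[np.arange(n_traces), leak_sample] += HW[TABLE[data ^ secret]]

hypotheses = HW[TABLE[data[:, None] ^ np.arange(256)[None, :]]]
scores = combined_scores(traces, hypotheses, group_ids)
best_guess = int(scores.argmax())
print(f"best guess: {best_guess:#04x}")
```

Multiplying the scores rewards guesses that correlate well in every group, which tends to suppress spurious peaks that only show up in one of them.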
SEC_DPA_LOW Results
The target has been configured with the SEC_DPA_LOW security level, and once again, around 600k measurement cycles have been performed. For both of the targeted rounds, all 16 bytes can successfully be recovered.
Detailed results for this attack, as well as supplementary information, are available on the ESP32-C6 (SEC_DPA_LOW) Detailed Results page.
SEC_DPA_HIGH Results
Configuring the system with SEC_DPA_HIGH yields very similar results, and attacking the first two rounds enables the recovery of all bytes.
Detailed results for this attack, as well as supplementary information, are available on the ESP32-C6 (SEC_DPA_HIGH) Detailed Results page.
The countermeasures implemented to protect the ESP32-C6 against side-channel attacks don’t appear to be effective. The masking countermeasure doesn’t seem to have much impact, while the hiding countermeasure can be undermined by guessing the behavior of the crypto-clock.
Considering the above, I would say the ESP32-C6 does not look much more secure than the ESP32-C3.
With that said, the XTS mode of encryption once again forces the attacker to perform a new attack for every 128-byte block.
However, my next article will demonstrate that having control over a limited amount of flash data is sufficient to execute arbitrary code and extract the remaining content, even when the Secure Boot feature of the system is enabled.
Here, the publicly known side-channel attack against the ESP32 has been replicated using only inexpensive and custom hardware.
As suspected, the ESP32-C3 is also affected by a similar side-channel attack. However, the use of the XTS mode of encryption makes things more difficult for the attacker, and a given attack can only be employed to decrypt a specific 128-byte block at a time.
The ESP32-C6 supposedly embeds side-channel countermeasures in the form of:
- a masking mechanism, which cannot be turned off;
- a configurable clock randomization mechanism.
These countermeasures don’t appear to be effective, and the ESP32-C6 can be exploited similarly to the ESP32-C3, while still benefiting from the added security of the XTS mode of encryption.
However, the next article in this series will demonstrate that decrypting only a few of the XTS 128-byte blocks can be sufficient to extract the cleartext content of the entire external flash.
Espressif has been contacted to disclose the attack detailed in this article, along with the method disclosed in the next one.
The process outlined in the Espressif Security Incident Response Document has been followed.
Following a discussion with Espressif, a more thorough assessment of the leakages originating from the ESP32-C3 and ESP32-C6 has been performed.
The goal was to understand the actual impact of the masking countermeasure implemented in the ESP32-C6.
The results are available below:
Karim M. Abdellatif, Olivier Hériveaux, and Adrian Thillard. Unlimited results: breaking firmware encryption of ESP32-V3. Cryptology ePrint Archive, Paper 2023/090, 2023. URL: https://eprint.iacr.org/2023/090.
Chao Luo, Yunsi Fei, and A. Adam Ding. Side-channel power analysis of XTS-AES. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017, 1330–1335. 2017. doi:10.23919/DATE.2017.7927199.
Qizhi Tian, Maire O'Neill, and Neil Hanley. Can leakage models be more efficient? Non-linear models in side channel attacks. In 2014 IEEE International Workshop on Information Forensics and Security (WIFS), 215–220. 2014. doi:10.1109/WIFS.2014.7084330.
Jason Waddle and David Wagner. Towards efficient second-order power analysis. In Cryptographic Hardware and Embedded Systems - CHES 2004: 6th International Workshop Cambridge, MA, USA, August 11-13, 2004. Proceedings, volume 3156 of Lecture Notes in Computer Science, 1–15. Springer, 2004. URL: https://iacr.org/archive/ches2004/31560001/31560001.pdf, doi:10.1007/978-3-540-28632-5_1.