Cache-based Targeted Deanonymization Attacks

On the Internet, the casual person surfing a website has a reasonable expectation that their identity remains private. We reveal new cache-based targeted deanonymization attacks that threaten user anonymity: an attacker who has complete or partial control over a website can learn whether a specific target (i.e., a unique individual) is browsing the website. The attacker knows this target only through a public identifier, such as an email address or a Twitter handle.

The attacks leverage the sharing/blocking functionality provided by resource-sharing services such as YouTube, Google Drive, Dropbox, or Twitter. The target user is assumed to be logged into such a sharing service. The attacks exploit the CPU cache side channel on the target’s device, and can bypass isolation mechanisms and various defenses deployed by browser vendors or resource-sharing services.

We evaluated the attacks on multiple hardware microarchitectures, multiple operating systems, and multiple browser versions, including the highly secure Tor Browser, and demonstrated practical targeted deanonymization attacks on major sites, including Google, Twitter, LinkedIn, TikTok, Facebook, Instagram, and Reddit. The attack runs in less than 3 seconds in most cases and can be scaled to target a large number of users.

Research Paper

A paper describing the attacks will appear in the 31st USENIX Security Symposium (Boston, 10–12 August, 2022) with the following title:

Targeted Deanonymization via the Cache Side Channel: Attacks and Defenses (BibTeX entry for citations)

A preprint of the paper is available here. The paper is the result of a collaboration between a group of researchers at the New Jersey Institute of Technology: Mojtaba Zaheri, Yossi Oren, and Reza Curtmola.

Questions and Answers

Why is this relevant and should I be worried?
What is the impact of these targeted deanonymization attacks?
Who is vulnerable?
Can you provide an overview of the attack?
Is this a new attack?
How can I tell if I am being targeted?
What is a CPU side channel attack?
Did you release the source code for the attack?
Is there a defense against these attacks?
When did you disclose the attacks?
Do the affected browser vendors plan to mitigate the attacks?
Do you have any other recommendations to limit the attack’s effectiveness?
Where else can I learn about this attack?

Why is this relevant and should I be worried?

The impact of the attack differs across categories of Internet users. If you are an average user, you may not perceive this as a privacy threat (although this is highly dependent on each user’s particular circumstances). However, if you belong to certain categories of users, you may be significantly impacted. Individuals who organize or participate in political protests, work as journalists reporting on inconvenient topics, network with fellow members of their minority group, or even purchase embarrassing or potentially incriminating personal items may risk their life and liberty if their identity becomes known to malicious actors.

What is the impact of these targeted deanonymization attacks?

When you visit a random website, a malicious actor who has complete or partial control over the website is able to learn that it is you in particular who is currently browsing the website. For this, the malicious actor only needs to know a public identifier associated with you, such as your email address or your Twitter handle. The attack also requires that you are logged into one of many resource-sharing services, such as YouTube (or any of the Google properties such as Google Drive, Gmail, Google Photos, etc.), Facebook, Twitter, etc. Many users routinely stay logged into such services.

Knowing the precise identity of the person who is currently visiting a website can be the starting point for a range of nefarious targeted activities that can be executed by the operator of that website (for example, surveillance-for-hire activities).

Motivating example: Consider a state-sponsored adversary who has purchased, at great expense, a zero-day exploit, which it wishes to install on the computer of a journalist with a well-known Twitter handle. The adversary has also compelled a local website to include code that can install this exploit. If this exploit were to be installed on many devices, however, this would increase the risk of the exploit being detected by white-hat security researchers. Therefore, the state adversary wishes to first verify, using the well-known Twitter handle, that the user currently connecting to the website is the target journalist, and only then to deploy its exploit.

Motivating example: Consider the case where a law enforcement agency has covertly taken control of an underground extremist forum. The agency wishes to identify the users of this forum, but these users connect to the forum under pseudonyms. The agency, however, has also gathered a list of Facebook accounts that are suspected to belong to users of this forum. The law enforcement agency would like to cross-reference the pseudonyms with this list of potential suspects.

Who is vulnerable?

Our work reveals that the attack surface for targeted deanonymization attacks is drastically larger than previously considered. We experimentally show that the attack can be executed on a diverse set of targets including desktop and mobile systems with multiple CPU microarchitectures (Intel, Apple M1, Qualcomm CPUs), multiple operating systems (Windows, macOS, Android), multiple browsers (Chrome, Safari, Firefox, Tor Browser), and multiple highly popular resource-sharing services (Google, Twitter, LinkedIn, TikTok, Facebook, Instagram, Reddit). Considering the combined user base of these services, we conclude that a large majority of Internet users are vulnerable.

Can you provide an overview of the attack?

The attack builds on the leaky resource attack, but it uses CPU cache side channels instead of cross-site leaks. As a result, the range of attack scenarios (i.e., affected browsers and resource-sharing services) is considerably larger.

The attack consists of two phases, setup and execution. In the setup phase, the attacker uploads a resource to a resource-sharing service, and then binds it to the victim’s identity. There are two approaches to perform this binding. In the sharing-based approach, the attacker privately shares the resource with the target (e.g., by using the victim’s email address or user ID with the service). In the blocking-based approach, the attacker makes the resource public, and then blocks the target from viewing any resources owned by the attacker. Next, the attacker embeds this resource into an attacker-controlled webpage.

In the execution phase, the attacker causes the target to visit the attack page (steps 1 and 2). As the target’s browser renders the page, it makes a cross-site request for the embedded resource to the sharing service (steps 3 and 4), passing the user’s authentication cookies. The response of the sharing website to this request depends on the target’s identity. With the sharing-based approach, the response to this cross-site request contains the shared resource if the user is the target, and an error otherwise (step 5). With the blocking-based approach, the opposite happens – the response contains an error for the blocked target, and the shared resource for other users.

The cache-based targeted deanonymization attack (sharing-based approach).
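The identity-dependent response described above can be summarized with a small model. This is only a sketch of the access-control decision, not code from the paper or from any real sharing service: the SharedResource class, the service_response function, and the example.com identifiers are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class SharedResource:
    """A resource uploaded to a hypothetical sharing service."""
    public: bool = False
    shared_with: Set[str] = field(default_factory=set)  # sharing-based binding
    blocked: Set[str] = field(default_factory=set)      # blocking-based binding


def service_response(resource: SharedResource, logged_in_user: Optional[str]) -> str:
    """Return "resource" if the service would serve the content, else "error".

    logged_in_user stands for the identity carried by the authentication
    cookies attached to the cross-site request (None if not logged in).
    """
    if logged_in_user is not None and logged_in_user in resource.blocked:
        return "error"
    if resource.public:
        return "resource"
    if logged_in_user is not None and logged_in_user in resource.shared_with:
        return "resource"
    return "error"


# Sharing-based setup: private resource shared only with the target.
sharing = SharedResource(shared_with={"target@example.com"})
# Blocking-based setup: public resource, with the target blocked.
blocking = SharedResource(public=True, blocked={"target@example.com"})

for user in ("target@example.com", "someone-else@example.com"):
    print(user, service_response(sharing, user), service_response(blocking, user))
```

In both setups the target and every other user receive different responses, which is the one bit of information the rest of the attack tries to observe.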

In the final step of the attack, the attacker needs to discover whether the shared resource was loaded. The Same-Origin policy prevents the attacker from directly reading out the cross-origin response. However, by using a browser-based CPU cache side channel, the attacker is able to learn whether the resource has been loaded. Specifically, the attacker page uses JavaScript to measure the contention on the CPU cache (step 6); depending on whether the resource is being loaded, this contention exhibits distinct patterns.

The following figure illustrates the concept of our attack. In the figure, we see that the cache side channel traces for targeted and non-targeted users start to diverge around the 200 ms point. An attacker observing the side-channel trace can quickly and effectively tell apart target and non-target users through the cache side channel, without relying on any cross-site leaks.

A Proof-of-Concept Attack. The side-channel trace shows significant differences between the target and non-target state after less than 1 second.
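To make the idea of diverging traces concrete, the following toy example locates the point where two traces start to differ. It is a sketch on synthetic numbers, not the classifier used in the paper: the divergence_index function, the 10 ms sampling interval, the threshold, and the trace values are all assumptions chosen for illustration.

```python
from typing import List, Optional


def divergence_index(a: List[float], b: List[float],
                     threshold: float, window: int) -> Optional[int]:
    """Return the first index at which |a - b| exceeds `threshold`
    for `window` consecutive samples, or None if that never happens."""
    run = 0
    for i, (x, y) in enumerate(zip(a, b)):
        run = run + 1 if abs(x - y) > threshold else 0
        if run == window:
            return i - window + 1
    return None


SAMPLE_MS = 10  # assumed sampling interval of the synthetic traces

# Synthetic traces: identical at first, then the "target" trace changes
# level from sample 20 onward (i.e., after 200 ms in this toy setup).
non_target = [100.0] * 100
target = [100.0] * 20 + [70.0] * 80

idx = divergence_index(target, non_target, threshold=15.0, window=5)
print(None if idx is None else idx * SAMPLE_MS)
```

Requiring several consecutive samples above the threshold is one simple way to avoid reacting to a single noisy measurement; real traces are far noisier than this synthetic pair.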

For more details, please refer to our technical paper.

Is this a new attack?

Yes. Unlike previous targeted deanonymization attacks, we do not assume the existence of a cross-site leak. Instead, our approach uses the CPU cache side channel. In addition, we introduce several novel techniques to significantly increase the potential impact of these attacks: we increase the attack’s target population by applying it to highly popular services which have no currently exploitable cross-site leaks, including Gmail, Twitter and Facebook; we also successfully execute the attack on browsers that have a strict policy of not allowing cookies to be attached to cross-site requests, including Safari and Tor Browser.

Targeted deanonymization attacks can be leveraged to uniquely identify a target. Other known types of de-anonymization techniques, such as third-party tracking (e.g., tracking pixels or tracking IPs) or social media fingerprinting, are more coarse-grained and do not provide this level of accuracy.

How can I tell if I am being targeted?

Most of the techniques used in a cache-based targeted deanonymization attack are quite innocuous. For example, loading a media resource happens frequently and does not usually raise suspicion. For some combinations of browsers and sharing services, the attack needs to open a new browser tab or a new browser window, but in those cases the attack takes specific steps to put the new tab or window in the background, making the attack less noticeable to the user. The actual measuring of the CPU cache is stealthy, and users have no indication that they are being targeted.

What is a CPU side channel attack?

A side channel is a mechanism to learn (usually private) information indirectly, by analyzing the use of a shared resource. Side-channel attacks are attacks that analyze the physical implementation artifacts of a system in order to gain an insight into its secret internal state. In our setting, we use microarchitectural cache attacks, which allow a spy process to observe the memory access patterns of a victim process over time, and use these access patterns to discover secrets about the victim.
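The principle can be illustrated with a simulated cache. The sketch below is a generic textbook-style illustration of how a shared cache leaks activity; it is not the measurement technique used in the paper and it does not touch real hardware. The ToyCacheSet class, the probe_misses function, and the LRU replacement policy are assumptions made for this example.

```python
from collections import OrderedDict


class ToyCacheSet:
    """One simulated cache set with LRU replacement, holding `ways` lines."""

    def __init__(self, ways: int) -> None:
        self.ways = ways
        self.lines: "OrderedDict[str, None]" = OrderedDict()

    def access(self, addr: str) -> bool:
        """Access `addr`; return True on a hit and False on a miss."""
        hit = addr in self.lines
        if hit:
            self.lines.move_to_end(addr)        # mark as most recently used
        else:
            if len(self.lines) >= self.ways:
                self.lines.popitem(last=False)  # evict least recently used
            self.lines[addr] = None
        return hit


def probe_misses(victim_active: bool, ways: int = 4) -> int:
    """Count the spy's misses after the victim did (or did not) run."""
    cache = ToyCacheSet(ways)
    spy_addrs = [f"spy{i}" for i in range(ways)]
    for addr in spy_addrs:          # spy fills the set with its own data
        cache.access(addr)
    if victim_active:               # victim touches memory in the same set
        cache.access("victim0")
    # Spy re-accesses its data; a miss means something else used the cache.
    return sum(0 if cache.access(addr) else 1 for addr in spy_addrs)


print("victim idle:  ", probe_misses(False))
print("victim active:", probe_misses(True))
```

The spy never reads the victim's data. It only observes that its own accesses became misses, which on real hardware shows up as slower memory accesses.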

Did you release the source code for the attack?

To help the research community understand the attacks we discovered and design ways to defend against them, we have released the source code and other technical artifacts in a public repository, which is available on GitHub.

Is there a defense against these attacks?

Unfortunately, we are not aware of a counter-measure that provides 100% protection while also preserving usability and efficiency. We provide several options to mitigate the attack:

Leakuidator+ is a browser extension which can successfully block the attack. It is available on the Chrome Web Store (for the Chrome browser) and on the Firefox Add-ons website (for the Firefox browser). The extension was shown to incur a small performance overhead. Still, there are certain browsing scenarios in which the user experience may be negatively affected by the use of the extension. In addition, the extension is not available for the Apple Safari and Chrome Android browsers.
A new potential defense strategy against some of the attack variants emerged from our discussions with the affected services and the browser vendors. We initiated a proposal to extend the W3C standard for fetch metadata HTTP request headers.
In our technical article, we provide an extensive set of recommendations for sharing service owners, browser vendors, and Internet end users.
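As background for the fetch metadata option, the sketch below shows how a sharing service could use the existing Sec-Fetch-Site, Sec-Fetch-Mode, and Sec-Fetch-Dest request headers to refuse cross-site embedding of private resources. It illustrates the headers as they exist today, not the extension we proposed, and it only addresses some attack variants (for instance, it deliberately lets top-level navigations through). The allow_request function and its policy choices are assumptions made for this example.

```python
from typing import Mapping

# Sec-Fetch-Site values that do not indicate a cross-site request.
SAFE_SITES = {"same-origin", "same-site", "none"}


def allow_request(headers: Mapping[str, str], resource_is_private: bool) -> bool:
    """Decide whether to serve a resource, based on fetch metadata headers."""
    # Header names are case-insensitive, so normalize before lookup.
    h = {k.lower(): v.lower() for k, v in headers.items()}
    site = h.get("sec-fetch-site")
    if site is None:
        # Assumption: browsers that do not send the header are let through.
        return True
    if site in SAFE_SITES or not resource_is_private:
        return True
    # Cross-site request for a private resource: allow only a top-level
    # navigation (e.g., the user following a link), not an embedded load.
    return h.get("sec-fetch-mode") == "navigate" and h.get("sec-fetch-dest") == "document"


embedded = {"Sec-Fetch-Site": "cross-site", "Sec-Fetch-Mode": "no-cors", "Sec-Fetch-Dest": "image"}
clicked = {"Sec-Fetch-Site": "cross-site", "Sec-Fetch-Mode": "navigate", "Sec-Fetch-Dest": "document"}
print(allow_request(embedded, resource_is_private=True))
print(allow_request(clicked, resource_is_private=True))
```

A policy like this trades some functionality (legitimate cross-site embedding of private content stops working) for a smaller attack surface, which is one reason no single countermeasure is free of usability costs.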

When did you disclose the attacks?

We disclosed our findings, together with proof-of-concept code, to the affected browser vendors (Google Chrome, Mozilla Firefox, Apple Safari and WebKit) and sharing services (Google, Twitter, LinkedIn, TikTok, Facebook, Instagram and Reddit) in January 2022. At the time of writing, the following reports were still classified:

https://bugs.chromium.org/p/chromium/issues/detail?id=1285604 (Chromium)

http://bugzilla.mozilla.org/show_bug.cgi?id=1749129 (Firefox)

Do the affected browser vendors plan to mitigate the attacks?

CPU cache side channels are powerful attacks that bypass software-imposed boundaries. As such, there is no immediate fix for our attacks that does not dramatically affect the user browsing experience. We are discussing possible short-term and long-term countermeasures with browser vendors, but this process will likely take time.

Do you have any other recommendations to limit the attack’s effectiveness?

In our technical article, we provide an extensive set of recommendations for sharing service owners, browser vendors, and Internet end users. We summarize here the guidance for end users:

Install our browser extension (Leakuidator+)
Avoid unnecessary logins into sharing services
Consider using multiple devices
Several browsers offer protection against variants of the attack. For example, Safari, Tor Browser, and Firefox disable third-party cookies for cross-site requests by default.