TL;DR:
Passkeys can be vulnerable to spear phishing attacks if an attacker’s device can be within Bluetooth Low Energy range of their victim while the victim interacts with the phishing page. Such attacks can be prevented by strongly limiting what authenticators are used during passkey creation or by adding more authentication layers.
One of the features offered by passkeys is cross-device authentication (CDA) via “just scanning a QR code”. But how does this actually work? Does it deliver on the strong phishing resistance that passkeys promise?
Quick recap – What are passkeys?
Passkeys are a replacement for passwords that provide more security and a better UX at the same time. Their core features are:
- Convenience – Being built on web standards means that developers and users both benefit from system-level and convenient UIs and APIs
 - Built-in cross-device authentication – Users can authenticate on new devices that don’t have their credentials using an existing device that does have their credentials (e.g., by scanning a QR code on their phone) without any additional development cost
 - Uniqueness – Every passkey is unique to the account it was created for, which eliminates password reuse issues (unless the passkey is still used alongside a regular password)
 - Resistance to data leaks – As passkeys are based on public/private key cryptography, a data leak would only contain the passkey’s public key, thus not endangering its security guarantees
 - Potentially providing 2FA in one authentication flow – Passkey providers can combine biometric authentication or PIN verification with the user’s access to the provider itself
 - Resistance to phishing – Users can only use passkeys on the website they were created for, meaning users are unable to sign in to phishing sites
 
Many of these features are a consequence of passkeys being built on existing web standards, specifically WebAuthn 2 and CTAP 2.2.
Passkeys are discoverable credentials
Figure: Some examples to clarify the WebAuthn terms Authenticator, Client, and Relying Party
From a technical perspective, passkeys are FIDO2/WebAuthn discoverable credentials. As opposed to passwords, they consist of a private/public key pair along with some metadata. During creation, they are bound to a specific relying party (abbr. RP, usually a domain such as github.com) and are stored in an authenticator. When the user wants to use a passkey to sign in to a website, the authenticator signs a challenge from the relying party after making sure that the passkey was created for this specific relying party.
Because passkeys are discoverable credentials, the UX for this process is very simple. Websites can simply offer a “Sign In with Passkey” button where users select one of their passkeys for the given RP to complete authentication without any manual entry of account details.
Why are passkeys resistant to phishing?
Just like any WebAuthn credential, passkeys are bound to an RP. The authenticator ensures that any request for authentication is only answered with passkeys matching the RP of that request. To prevent websites from maliciously requesting authentication for other RPs, the WebAuthn client (usually the browser) only allows websites to request authentication for their exact domain or a superdomain (this description is slightly simplified). For example, mail.google.com is allowed to request authentication for the RP “google.com” but not for “calendar.google.com” or “github.com”. So even if a user wanted to, they wouldn’t be able to use their passkey on a phishing domain.
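The domain rule can be sketched in a few lines of Python. This is only an illustration of the simplified rule described above: the function name is our own, and the sketch ignores details that real WebAuthn clients enforce, such as rejecting public suffixes like “com” and requiring a secure context.

```python
def is_allowed_rp_id(origin_host: str, rp_id: str) -> bool:
    """Simplified client-side rule: the RP ID must equal the page's host
    or be a parent domain (superdomain) of it.

    Assumption: public suffix handling is omitted, so "com" would wrongly
    pass here. Real WebAuthn clients reject such RP IDs.
    """
    host = origin_host.lower().rstrip(".")
    rp = rp_id.lower().rstrip(".")
    return host == rp or host.endswith("." + rp)


# The examples from the text above.
print(is_allowed_rp_id("mail.google.com", "google.com"))
print(is_allowed_rp_id("mail.google.com", "calendar.google.com"))
print(is_allowed_rp_id("mail.google.com", "github.com"))
# A look-alike phishing host cannot claim the real RP ID either.
print(is_allowed_rp_id("google.com.evil.example", "google.com"))
```

Only the first call is accepted. The look-alike host in the last call fails because “google.com” is a prefix of its name, not a parent domain.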
Relying parties can require authenticators to perform user presence (UP) or user verification (UV) checks, e.g., biometric authentication or device PIN verification. While system-level authenticators usually follow this requirement correctly, many third-party password managers do not properly respect this aspect of the specification. Hence, some threat models may require limiting what authenticators are allowed to store credentials for a certain RP.
Passkey authentication from start to finish
The following describes a simplified version of the authentication process. This simplification allows us to focus on the aspects that are relevant to our analysis. The full process is specified in the WebAuthn specification.
Server and website responsibilities
Usually, the flow starts with the user signaling they would like to sign in to the service (e.g., by clicking on a “Sign In with Passkey” button). The RP’s server must then generate some PublicKeyCredentialRequestOptions specifying the following:
- The RP ID, which will limit which credentials can be used for this authentication
 - A cryptographic challenge for the authenticator to sign using the credential’s private key
 - Limitations to which credentials the RP will accept
 - Whether the authenticator should perform UP or UV checks, such as biometric authentication
 - Some other properties
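As a rough sketch, a server could build these options as follows. The field names (rpId, challenge, allowCredentials, userVerification, timeout) are members of PublicKeyCredentialRequestOptions; the function name, the session handling, and the in-memory store are assumptions made purely for illustration.

```python
import base64
import secrets


def b64url(data: bytes) -> str:
    """Encode bytes as unpadded base64url, as used for WebAuthn JSON."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# Hypothetical in-memory store: session ID -> options the server issued.
# A real RP would use its session storage and expire entries.
PENDING_OPTIONS: dict = {}


def build_request_options(session_id: str, rp_id: str,
                          allowed_credential_ids: tuple = (),
                          user_verification: str = "required") -> dict:
    options = {
        # Limits which credentials can be used for this authentication.
        "rpId": rp_id,
        # Fresh random challenge for the authenticator to sign.
        "challenge": b64url(secrets.token_bytes(32)),
        # An empty list lets the user pick any discoverable credential.
        "allowCredentials": [
            {"type": "public-key", "id": b64url(cid)}
            for cid in allowed_credential_ids
        ],
        # "required", "preferred", or "discouraged".
        "userVerification": user_verification,
        "timeout": 60_000,  # milliseconds
    }
    # Remember what was sent so the response can be checked against it.
    PENDING_OPTIONS[session_id] = options
    return options


options = build_request_options("session-123", "example.com")
print(sorted(options))
```

Storing the issued options server-side is what later allows the RP to notice when a response doesn’t match what it asked for.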
 
These options are then sent to the RP’s client-side code, which passes them directly to the WebAuthn client (usually the browser, via navigator.credentials.get(options)). In the case of successful authentication, the client will eventually return an AuthenticatorAssertionResponse, which the client-side code passes directly to the RP’s server for verification. This verification usually leads to the client receiving a short-lived token, e.g., a JWT.
Note that this client-side code can be controlled by the user (who might be an attacker) and can therefore not be trusted. That’s one of the reasons why the server must store the options it sent to the client and verify that the result satisfies the options. If the attacker manipulates the options before sending them to the client, the result should not match the options that the RP stored server-side. If the attacker manipulates the final result, the verification of the authenticator’s signature should fail.
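To make the server-side check more concrete, here is a minimal sketch of comparing the returned client data against what the server stored. The fields type, challenge, and origin are part of the client data defined by WebAuthn; the function name and the example values are our own. A full verification additionally checks the RP ID hash, the UP/UV flags, and the signature, none of which are shown here.

```python
import json


def check_client_data(client_data_json: bytes,
                      stored_challenge: str,
                      expected_origin: str) -> bool:
    """Return True only if the client data matches what the server issued."""
    try:
        data = json.loads(client_data_json)
    except ValueError:
        return False
    if not isinstance(data, dict):
        return False
    return (
        data.get("type") == "webauthn.get"
        # Must be the exact challenge the server stored for this session.
        and data.get("challenge") == stored_challenge
        # Must be the origin the RP expects, not a phishing origin.
        and data.get("origin") == expected_origin
    )


client_data = json.dumps({
    "type": "webauthn.get",
    "challenge": "c3RvcmVkLWNoYWxsZW5nZQ",  # example value only
    "origin": "https://example.com",
}).encode()

print(check_client_data(client_data, "c3RvcmVkLWNoYWxsZW5nZQ", "https://example.com"))
print(check_client_data(client_data, "c3RvcmVkLWNoYWxsZW5nZQ", "https://evil.example"))
```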
Client and Authenticator responsibilities
When the client receives the request, it first verifies that the website is allowed to make this request. If multiple authenticators are available, it prompts the user to choose one and then passes the request on to that authenticator.
The authenticator performs the requested UV and UP checks and returns data about itself and the credential it used (AuthenticatorData) as well as a signature over the AuthenticatorData and the hashed client data (as documented in the WebAuthn specification). Note (for later) that the authenticator’s signature doesn’t guarantee the integrity of any information about how the authenticator is communicating with or attached to the client.
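The following sketch shows the fixed-size start of the authenticator data (a 32-byte RP ID hash, one flags byte, and a 4-byte big-endian signature counter) and the message that the signature covers. The layout follows WebAuthn; the function names and the sample bytes are made up for illustration, and no signature is actually verified here.

```python
import hashlib
import struct

UP_FLAG = 0x01  # bit 0: user present
UV_FLAG = 0x04  # bit 2: user verified


def parse_authenticator_data(auth_data: bytes) -> dict:
    """Parse the first 37 bytes of the authenticator data."""
    if len(auth_data) < 37:
        raise ValueError("authenticator data is too short")
    flags = auth_data[32]
    (sign_count,) = struct.unpack(">I", auth_data[33:37])
    return {
        "rp_id_hash": auth_data[:32],
        "user_present": bool(flags & UP_FLAG),
        "user_verified": bool(flags & UV_FLAG),
        "sign_count": sign_count,
    }


def signed_message(auth_data: bytes, client_data_json: bytes) -> bytes:
    """The bytes the authenticator signs: authData || SHA-256(clientDataJSON)."""
    return auth_data + hashlib.sha256(client_data_json).digest()


# Sample data: RP "example.com", UP and UV set, counter 7.
sample = hashlib.sha256(b"example.com").digest() + bytes([0x05]) + struct.pack(">I", 7)
parsed = parse_authenticator_data(sample)
print(parsed["user_present"], parsed["user_verified"], parsed["sign_count"])
print(parsed["rp_id_hash"] == hashlib.sha256(b"example.com").digest())
```

Nothing in these signed bytes describes the transport or the authenticator attachment.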
Figure: A visualization of the objects that an attacker could manipulate along with their properties. 🚨 marks properties that we will want to manipulate later.
When the client receives the authenticator’s response, it augments it with some information and client-processed extension results (if any) and restructures it slightly before passing it on to the server.
The website now forwards the response to the RP’s server, which performs the verification described above. The authentication has now succeeded, and the client may be given a short-lived token like a JWT.
Why focus on QR-initiated Cross-Device Authentication?
As listed above, one major feature of passkeys is built-in cross-device authentication (CDA). CDA allows users to use separate devices as their authenticator and client, i.e., they can use keys stored on one device to sign in on another device. Such authenticators are referred to as roaming authenticators (as opposed to platform authenticators). During CDA, the devices must first be connected with each other (e.g., by scanning a QR code).
The communication between roaming authenticators and clients can happen via various transports (more on this later), which usually require a very user-visible connection between the two devices (e.g., plugging a security key into the client device via USB). However, in the case of QR-initiated CDA, the devices communicate wirelessly over potentially large distances. To a user, scanning the QR code feels like establishing a direct connection between the client and the roaming authenticator. But is it perhaps possible for an attacker to abuse the invisibility of the wireless connection and trick their victim into connecting their roaming authenticator to the attacker’s device?
To answer this question, we must first understand how the QR-initiated communication between the roaming authenticator and the client actually works.
The Client to Authenticator Protocol (CTAP)
Cross-device authentication requires some way for roaming authenticators to communicate with clients. This communication is specified in the Client to Authenticator Protocol (CTAP). CTAP supports multiple different transports (as of the current version 2.2: USB, Bluetooth Low Energy (BLE), NFC, and hybrid). It defines both the messages passed between the devices (no matter the transport) and transport-specific details (e.g., how handshakes are performed). CTAP is also designed to guarantee physical proximity between the client and the roaming authenticator in an effort to prevent remote attacks like phishing.
QR-initiated hybrid transport
Using BLE as the only transport comes with some issues. First of all, it’s too slow and unreliable for the UX that passkeys aim for. Second, it’s hard to connect the correct devices with each other. How can users ensure their authenticator isn’t connected to an attacker’s client or their client isn’t connected to an attacker’s authenticator?
To solve these issues, CTAP specifies QR-initiated hybrid transport. The client generates a QR code for the authenticator to scan, which specifies how the devices should communicate with each other. It also enables some cryptographic features (more details below).
After scanning the QR code, BLE is only used for the authenticator to send an advert, which the client must then receive. All other communication happens via the network through a tunnel server that the devices negotiate in the QR code and the BLE advert. BLE still provides its proximity guarantee, while its unreliability only affects a single packet. After that, the devices can benefit from the speed and reliability of their network connections. They perform a handshake and then exchange non-transport-specific CTAP messages.
Cryptographic guarantees during QR-initiated hybrid transport
The BLE advertisement is encrypted using a key derived from a client-chosen secret in the QR code. This guarantees to the client that the authenticator sending the advert has knowledge of the QR code. The authenticator can then verify that the handshake via the tunnel server comes from the same client that generated the QR code because the handshake uses a public key that was included in the QR code. The handshake also uses a symmetric key derived from the client-chosen secret (which was already used to encrypt the BLE advert) and the BLE advert itself. As a result, the tunnel server cannot decrypt the messages that it passes back and forth, including the handshake (unless it has knowledge of the QR code and the BLE advert).
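The key relationships can be illustrated with a small sketch. Note that this is not the exact derivation from the CTAP specification: the purpose labels, key lengths, and function names below are placeholders we chose, and HKDF-SHA-256 (RFC 5869) is implemented here with the standard library purely for illustration.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_advert_key(qr_secret: bytes) -> bytes:
    # Only someone who has seen the QR code can derive this key.
    # The label is a placeholder, not the value from the specification.
    return hkdf_sha256(qr_secret, b"", b"hypothetical-advert-key")


def derive_handshake_key(qr_secret: bytes, advert_plaintext: bytes) -> bytes:
    # Depends on both the QR secret and the BLE advert, so the tunnel
    # server, which sees neither, cannot derive it.
    return hkdf_sha256(qr_secret, advert_plaintext, b"hypothetical-handshake-key")


qr_secret = bytes(range(16))   # example secret chosen by the client
advert = b"example-advert-1"   # example advert contents

print(derive_advert_key(qr_secret).hex())
print(derive_handshake_key(qr_secret, advert).hex())
```

The important property for our analysis is that these keys prove knowledge of the QR code and the advert. They say nothing about who showed the QR code to the user.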
Evaluation of CTAP’s security
CTAP is designed to protect and improve the communication between the client and the authenticator. While it does a good job at that, CTAP’s QR-initiated hybrid transport doesn’t provide a way for the user to determine that it’s actually their client and their authenticator that are communicating with each other. Users could therefore be tricked into scanning a QR code that was generated by an attacker’s device, thereby granting access to the attacker’s device instead of the device on which the user saw the QR code.
Because CTAP is only meant to specify client-to-authenticator communication, the RP plays no role in the process and can neither influence it nor find out about it. Due to CTAP’s encryption, even the tunnel server is unable to inform an RP of what’s happening (if it even wants to). Hence, if an RP wants to limit certain CTAP features, it’s limited to WebAuthn features. And we have already seen that WebAuthn only provides limited information in this regard (it only distinguishes between platform and roaming authenticators), and it fails to secure that information cryptographically.
Phishing for Passkeys
As we’ve seen, passkeys are resistant to regular phishing because they’re limited to one relying party. But in a spear phishing scenario, QR-initiated CDA enables an attack along the following lines:
0. An attacker places a device within BLE range of the victim and gets them to open a phishing website using spear phishing.
1-2. When the victim tries to sign in to the phishing website, the attacker’s device initiates regular WebAuthn Authentication with the actual RP by getting the authentication options. The attacker’s device, acting as the client in this authentication process, then generates the CTAP QR code based on the options.
3-4. The attacker’s device forwards the QR code to the phishing site, and the user scans it, believing that it was generated by their device.
5. The user’s authenticator connects to the device that generated the QR code via BLE and network. Since the attacker’s device generated the QR code, that is the device the authenticator connects to, and it is the attacker’s device that receives the authentication response.
6-7. The attacker’s computer forwards the authentication response to the RP’s server and receives the success response.
The threat model
The goal of this attack is to impersonate the user. The attacker never obtains the credentials themselves, but the attack gives them a session token that they can use to impersonate the user.
In order to perform this attack, an attacker must first determine which users are of interest (i.e., who has access that interests the attacker) and narrow them down to those who have set up an authenticator capable of QR-initiated CDA. An attacker must then find a way to position a device within BLE range of their victim at the time of the attack (e.g., by finding the location of their office and being able to get close to it or by identifying some travel plans). Furthermore, the attacker must be able to spear-phish their victim and to set up a phishing site that visually matches the target site. They must also be able to access the target site (e.g., due to it being public or after exploiting other vulnerabilities) in order to act as the WebAuthn client during the attack. An attacker must thus be highly motivated and skilled.
The attack requires the user to interact with their authenticator, meaning that simply clicking on the link in the phishing email isn’t sufficient. The user must also not recognize the spear phishing, the spoofed authentication UI, or the forced QR-initiated CDA.
Therefore, this attack is relevant for organizations that rely on passkeys for authentication and expect attackers to be able to get within BLE range of employees. For example, a politician who regularly takes a train from their home to their parliament could be spear-phished by an attacker who’s also sitting on the train or planted a device there and left again. The train could even help to disguise some UI quirks (“surely it only looks weird because of the bad network connection“). Another example could be an attacker parked outside a small office of some company while sending out the spear phishing email.
Can we just entirely prohibit QR-initiated Cross-Device Authentication?
While CTAP generally isn’t controlled by the relying party, WebAuthn does allow the RP to constrain what transports may be used, and it requires the client to inform the RP how the authenticator was actually attached. For example, the RP may allow platform and hybrid transports, and the transport actually used may end up being hybrid; in this case, the RP would be informed that the authenticator was attached cross-platform. However, neither of these properties is cryptographically protected, and they can therefore be manipulated in a spear phishing attack.
An attacker can request the authentication options from the RP, which may include a limitation to exclude hybrid transport. The attacker can just remove the limitation and pass the manipulated options to their WebAuthn client (the attacker-controlled browser). As their authentication flow of choice, the attacker then chooses QR-initiated CDA (which is no longer prohibited) and relays the QR code to their victim via a spear phishing site. The victim’s authenticator will then perform CTAP with the attacker’s client, returning a valid AuthenticatorAssertionResponse to the attacker’s client.
Originally, the response contains the authenticator attachment method that was used, but the attacker can, again, modify it without being detected before sending it to the RP. The RP would therefore believe that no prohibited transport was used and allow the authentication to succeed.
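From the RP’s point of view, the problem can be illustrated with a small sketch: the bytes covered by the authenticator’s signature are the authenticator data and the hash of the client data, so two responses that differ only in the reported attachment are indistinguishable to signature verification. The dictionary layout, the hex encoding, and the sample values below are simplifications of ours, not the real wire format.

```python
import hashlib
import json


def signed_bytes(response: dict) -> bytes:
    """Bytes covered by the signature: authenticatorData || SHA-256(clientDataJSON)."""
    auth_data = bytes.fromhex(response["authenticatorData"])
    client_hash = hashlib.sha256(response["clientDataJSON"].encode()).digest()
    return auth_data + client_hash


# Simplified stand-in for an assertion response (sample values only).
reported_cross_platform = {
    "authenticatorAttachment": "cross-platform",
    "authenticatorData": (hashlib.sha256(b"example.com").digest()
                          + bytes([0x05, 0, 0, 0, 7])).hex(),
    "clientDataJSON": json.dumps({
        "type": "webauthn.get",
        "challenge": "c2FtcGxlLWNoYWxsZW5nZQ",
        "origin": "https://example.com",
    }),
}

# The same response with only the unsigned attachment field changed.
reported_platform = dict(reported_cross_platform, authenticatorAttachment="platform")

# The signed bytes are identical, so the same signature is valid for both.
print(signed_bytes(reported_cross_platform) == signed_bytes(reported_platform))
```

An RP should therefore treat the reported attachment as a hint for UX purposes and not as a security control.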
Demonstration of this attack
We implemented this attack in a proof of concept in which an RP aims to completely prohibit CDA. The attacker controls a browser to modify the mentioned WebAuthn properties and otherwise lets the browser handle CTAP with the victim’s authenticator. The attacker obtains the CTAP QR code from the browser and displays it on a phishing website for the victim to scan. The demonstration shows by example that the manipulation is not detected by the security features that WebAuthn and CTAP provide.
Other ways to prevent this attack
While hybrid transport can’t be prohibited by the RP’s server-side code, it can be prohibited in more work-intensive ways. However, verifying that roaming authenticators are connected to the correct client is more difficult.
Limiting authenticators during creation
By strictly limiting credential creation (e.g., by only allowing credentials to be created in-person), one could ensure that credentials are never stored in authenticators capable of hybrid transport. Suitable authenticators could be devices without Bluetooth support (e.g., security keys), configurable password managers with the option to disable hybrid transport, or password managers that categorically don’t support hybrid transport.
Note that it is not sufficient to ensure that all credentials are created on platform-attached authenticators. As we discussed earlier, platform authenticators can also act as roaming authenticators, which would again enable the attack discussed above.
As we’ve seen, limiting authenticators may already be necessary for some threat models, since some authenticators don’t correctly implement all security features specified in WebAuthn. In this case, adding an extra limitation would come at little additional cost.
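One technical building block for such a limitation could be an allowlist of authenticator models that is checked during registration. This is our own suggestion and not part of the proof of concept: the sketch below reads the AAGUID from the attested credential data (16 bytes at offset 37 of the authenticator data when the AT flag is set), and the allowlisted AAGUID is a made-up value. The AAGUID is only trustworthy if the RP also requests and verifies attestation, which is not shown.

```python
import uuid

AT_FLAG = 0x40  # bit 6: attested credential data is included

# Hypothetical allowlist of approved authenticator models (made-up value).
ALLOWED_AAGUIDS = {uuid.UUID("00000000-1111-2222-3333-444444444444")}


def extract_aaguid(auth_data: bytes):
    """Return the AAGUID from registration authenticator data, or None."""
    if len(auth_data) < 53 or not auth_data[32] & AT_FLAG:
        return None
    return uuid.UUID(bytes=auth_data[37:53])


def is_allowed_authenticator(auth_data: bytes, allowlist=ALLOWED_AAGUIDS) -> bool:
    aaguid = extract_aaguid(auth_data)
    return aaguid is not None and aaguid in allowlist


# Sample registration data: 32-byte RP ID hash, flags (UP, UV, AT),
# 4-byte counter, 16-byte AAGUID, then the rest (truncated here).
approved = (bytes(32) + bytes([0x45]) + bytes(4)
            + uuid.UUID("00000000-1111-2222-3333-444444444444").bytes
            + b"\x00\x00")
print(is_allowed_authenticator(approved))
```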
Adding more authentication layers
We’ve only discussed how to attack passkeys, assuming they serve as the only layer of authentication. Relying parties could, for example, detect that an unknown device was signed in and prompt the user to confirm the new device. If the user expects their client to be known, this may make them suspicious. RPs could also require additional authentication for critical actions, thereby limiting what attackers could do with their session token.
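As a rough illustration, such a policy could look like the following sketch. The device identifiers, action names, and return values are all hypothetical; how an RP recognizes devices and which actions it considers critical depends entirely on the application.

```python
# Hypothetical set of actions that require additional authentication.
CRITICAL_ACTIONS = frozenset({"change_email", "add_passkey", "export_data"})


def next_step(known_device_ids: set, device_id: str, action: str) -> str:
    """Decide how to proceed after a successful passkey authentication."""
    if device_id not in known_device_ids:
        # Unknown client: ask the user to confirm the new device.
        return "confirm_new_device"
    if action in CRITICAL_ACTIONS:
        # Known client, but the action warrants another check.
        return "require_additional_authentication"
    return "allow"


known = {"laptop-1", "phone-1"}
print(next_step(known, "laptop-1", "read_inbox"))
print(next_step(known, "attacker-device", "read_inbox"))
print(next_step(known, "laptop-1", "add_passkey"))
```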
Updating the specification
Of course, WebAuthn could be updated to include the transport or authenticator attachment in the cryptographically secured properties of the WebAuthn API. While this would be ideal, it is unlikely to be able to serve as a short-term solution.
User-driven approaches
Detecting ordinary phishing sites is hard for users: phishing sites can look completely identical to their original counterparts because both are regular websites with the exact same tools at their disposal. In contrast, WebAuthn UI isn’t controlled by the phishing target website but rather by the client (i.e., the browser or the OS). This means that the UI could be rendered in ways that aren’t available to websites and thus to phishing attackers.
Testing across macOS browsers revealed varying UI implementations: in Google Chrome, the authentication UI partially overlaps the browser UI (i.e., outside of website-controllable space), while Firefox and Safari display it centrally within the browser window (i.e., within website-controllable space). Arc (Chromium-based) shows the UI in the center of the display, potentially even overlapping other applications, depending on the positions of the application windows.
By the nature of this attack, an attacker would need to restrict users to CDA. Users could therefore also be educated to always use other authenticator attachments. However, just like checking that the WebAuthn UI is rendered by the browser instead of the website, this would require all users to be highly alert, and phishers are very experienced at tricking people into letting their guard down for just a brief moment. So while these last two remedies may help, they shouldn’t be the only line of defense.
Conclusion
Since passkeys are based on WebAuthn and CTAP, they come with some very nice security guarantees. They manage to make these guarantees available to the broad public by increasing usability and convenience. And in regard to the broad public, they even manage to offer resistance against remote phishing attacks, which are a type of attack regular people commonly face.
However, in environments where the threat model includes in-person attackers, passkeys are vulnerable to spear phishing attacks. An in-person attacker can trick their victim into performing cross-device authentication with the attacker’s device by spoofing the WebAuthn UI on a spear phishing site. Relying parties cannot prevent or detect such attacks based on just the WebAuthn API. RPs can, however, take more involved steps to completely prohibit QR-initiated authentication or to add more layers of authentication.
This article would not exist without the initial idea and guidance of my colleague Clemens Hübner, as well as very helpful feedback from numerous other colleagues. I’m very grateful for all of your help!