SameSite Cookies, Chrome 80, Privacy Sandbox – What's What?
Why Google is in a hurry to consign cookies to the past, and what that means for everyone else.
Since 2016, Chromium has supported the SameSite cookie attribute, which, as the name implies, restricts cookies to same-site requests. The attribute takes one of three values, which determine when cookies are sent:
- Strict – exclusively within the website they were initially set on;
- Lax – within the website and when the user opens the URL by following a link (top-level navigations which use a safe HTTP method);
- None – along with any cross-site requests.
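The three values above can be modelled as a small decision function. This is a minimal sketch, not browser code: the function name, its parameters, and the list of safe methods are assumptions made for illustration, and real browsers apply further rules (for example, how "same site" is computed from registrable domains).

```python
# Hypothetical model of the three SameSite values described above.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}  # assumed list of "safe" HTTP methods


def cookie_is_sent(samesite, same_site_request, top_level_navigation=False, method="GET"):
    """Return True if a cookie with the given SameSite value accompanies the request."""
    if samesite not in ("Strict", "Lax", "None"):
        raise ValueError(f"unknown SameSite value: {samesite!r}")
    if same_site_request:
        # All three values allow same-site requests.
        return True
    if samesite == "None":
        # None: the cookie goes along with any cross-site request.
        return True
    if samesite == "Lax":
        # Lax: cross-site only for top-level navigations with a safe method.
        return top_level_navigation and method.upper() in SAFE_METHODS
    # Strict: never sent on cross-site requests.
    return False
```

For instance, a Lax cookie accompanies a cross-site link click (a top-level GET) but not a cross-site form POST or an embedded image request.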
In theory, SameSite helps protect users from unwarranted tracking and cross-site request forgery. But if website developers have not set the attribute, the browser treats the cookie as SameSite = None, that is, as a third-party cookie, and sends it freely with every cross-site request. Spoiler: according to Google, as of March 2019, only 0.1% of cookies were marked with SameSite.
Therefore, in May 2019, Google and Microsoft published the “Incrementally Better Cookies” draft, which proposed treating all cookies as first-party (SameSite = Lax) by default. As an experiment, these settings were enabled for a small number of Chrome users back in October 2019, and in February 2020 Google officially launched them in Chrome 80 Stable. Now a cookie is sent to third-party domains only if it has been explicitly switched from the default SameSite = Lax to SameSite = None and carries the Secure attribute, which means it travels over HTTPS only.
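To make the new defaults concrete, here is a rough sketch of how a browser with these settings might classify an incoming cookie, plus a helper that builds a cookie string suitable for cross-site use. Both function names are hypothetical, and treating an unrecognised value like a missing one is an assumption of this sketch rather than something stated above.

```python
from typing import Optional


def effective_samesite(samesite: Optional[str], secure: bool) -> str:
    """Return "Strict", "Lax", "None", or "rejected" under the new defaults."""
    value = (samesite or "").strip().capitalize()
    if value not in ("Strict", "Lax", "None"):
        # Missing attribute: fall back to the new default.
        # (Assumption: unrecognised values are handled the same way.)
        return "Lax"
    if value == "None" and not secure:
        # SameSite=None is only honoured together with Secure (HTTPS only).
        return "rejected"
    return value


def cross_site_set_cookie(name: str, value: str) -> str:
    """Build a Set-Cookie header value for a cookie meant for cross-site use."""
    return f"{name}={value}; SameSite=None; Secure"
```

A server that still needs a cookie in third-party contexts would therefore send something like `Set-Cookie: sid=abc; SameSite=None; Secure`, where `sid` and `abc` are placeholder values.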
So far, the new settings have reached only a fraction of Chrome 80 Stable users, and that share is growing gradually. In the Chrome 81 and 82 Canary/Dev/Beta releases, the new SameSite behaviour will be extended to 50% of users, and that share will be increased gradually as well.
Microsoft Edge and Mozilla Firefox will roll out similar settings.
If everyone moves to HTTPS and sets the appropriate cookie attributes, nothing will change for websites, users, or advertisers. The Internet will work as before; it will just become a little safer.
The fun part begins with the next step towards better cookies, as in Google’s view, good cookies are dead cookies. Google expects to get rid of third-party cookies in 2 years and suggests HTTP State Tokens and a set of APIs instead. This initiative has been called Privacy Sandbox.
The Internet relies on cookies. Authentication; browser history; media element (built-in player, banner, repost button) views; metrics/measurements/tracking; webpage adjustment to device, browser, and user settings – all of them require cookies to store and transfer relevant data from the server to the browser and back.
However, the versatility of cookies becomes a con in a heartbeat, as it enables uncontrolled user tracking.
Instead, Google suggests using HTTP State Tokens. In theory, the process looks as follows: the browser sends a token to the website's server in an HTTP request header, with only one token per origin and only over a secure (encrypted) channel.
The token value is controlled by the client, not the server, and looks like a random sequence of bytes of a fixed length. Accordingly, the server learns nothing about the user from the token itself, but it can sign verified tokens with its own cryptographic key and return additional attributes, such as the token's creation timestamp, delivery context (same-origin, same-site [default], or cross-site), lifetime [1 hour by default], and value.
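The client side of this scheme can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the class name, the 32-byte token length, the base64url encoding, and the `Sec-Http-State: token=...` header format are not confirmed by the text above, and the server-side signing and attributes are left out.

```python
import base64
import secrets
from urllib.parse import urlsplit

TOKEN_BYTES = 32  # assumed fixed token length (256 bits)


class StateTokenStore:
    """Hypothetical browser-side store holding one client-generated token per origin."""

    def __init__(self):
        self._tokens = {}

    def token_for(self, origin: str) -> str:
        # The client, not the server, picks the value: random bytes of fixed length.
        if origin not in self._tokens:
            raw = secrets.token_bytes(TOKEN_BYTES)
            self._tokens[origin] = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
        return self._tokens[origin]

    def request_headers(self, url: str) -> dict:
        parts = urlsplit(url)
        if parts.scheme != "https":
            # Tokens are only delivered over a secure channel.
            return {}
        origin = f"{parts.scheme}://{parts.netloc}"
        # Header name and format are illustrative assumptions.
        return {"Sec-Http-State": f"token={self.token_for(origin)}"}
```

Two requests to the same HTTPS origin carry the same token, a different origin gets an unrelated one, and plain HTTP requests carry none.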
Since websites are not able to track users without user data (let us omit fingerprinting and other workarounds), the principles and mechanisms of monetisation based on online advertising will have to be revised entirely. User targeting is likely to be gone for good.
Google proposes several substitute APIs to tackle some issues:
- Anti-fraud: through the Trust Token API, the website issues non-personalised encrypted tokens to trusted users, which can be “spent” on other websites.
- Ad conversion measurement: the current version does not support impression counts; only aggregated click data are available. The browser generates conversion reports based on redirects and sends them in bulk several times a day on a schedule, with noise applied. These reports include metadata on the impression, the conversion, and the credit this impression received for the given conversion (from 0 to 100).
- Aggregated reporting: the API reports the number and frequency of unique views of an ad campaign across multiple domains.
- Interest-based targeting: Federated Learning of Cohorts (FLoC). Based on browsing history, the browser’s ML algorithms group users into broad cohorts with similar interests and expose this information through Client Hints (which will replace the User Agent string).
- Retargeting: Two Uncorrelated Requests, Then Locally-Executed Decision On Victory (TURTLEDOVE). Advertisers form interest groups – lists of users for retargeting. The browser regularly requests ads for a particular group and caches the creatives in advance. When a user opens a website, the browser requests a contextual ad and then runs an auction between the contextual ad and the pre-loaded interest-based ads.
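The last item, the on-device auction in TURTLEDOVE, is perhaps the easiest to picture in code. The sketch below is a deliberate simplification under stated assumptions: the `Ad` type, its fields, and the "highest bid wins" rule are hypothetical, whereas the proposal itself leaves the bidding logic to advertisers and ad networks.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Ad:
    creative: str  # identifier of the creative to render
    bid: float     # hypothetical bid value attached to the ad
    source: str    # "contextual" or "interest-group"


def run_local_auction(contextual: Ad, cached_interest_ads: List[Ad]) -> Ad:
    """Pick the winner in the browser, without telling any server which ad won."""
    # The contextual ad comes first, so it wins ties (max keeps the first maximum).
    candidates = [contextual, *cached_interest_ads]
    return max(candidates, key=lambda ad: ad.bid)
```

The two requests stay uncorrelated because the interest-group ads were fetched and cached earlier, independently of the page the user is on now; only the final comparison happens locally.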
The overall tendency to hand control of user data back to the user is undoubtedly the right one. The growing use of cryptography is also nothing but a positive trend. At the same time, “cookielessness” will only prevent tracking by cookies, whereas there are still quite a few other ways to identify a unique user, over which the user has no control whatsoever.
It is online advertising that faces the major challenges. Along with the data, essential functionality is moving to the user (read: browser) side, and nobody knows yet how to make this work. Google’s proposed APIs by no means cover all use cases and are still quite crude.