Google’s FLoC vs. Topics – Old Wine in New Bottles?
on 22.02.2022 by Stefan Riegler
A few weeks ago, Google surprised the advertising world by announcing that it would stop working on its “Federated Learning of Cohorts” technology, FLoC for short, and focus on the “Topics API” instead. With Topics, as with FLoC before it, Google is trying to develop an alternative to the third-party cookie. Google plans to block third-party cookies in its Chrome browser in the foreseeable future in order to catch up with its competitors Apple (Safari) and Mozilla (Firefox), which have already taken this step. Topics is sometimes described as a copy of the old FLoC with minimal changes: old wine in new bottles, so to speak. But do these comparisons do justice to the new system?
In this article we explain how FLoC worked and why it failed, and then take a closer look at the new Topics API. How does it work? How does it differ from FLoC? And what does the future of Topics look like?
What was Federated Learning of Cohorts (FLoC)?
“Federated Learning of Cohorts” was Google’s first attempt at an alternative to the third-party cookie, which Google plans to ban from its Chrome browser by 2023. Google first announced FLoC in 2019 as a privacy-friendly technology for interest-based advertising. Initial tests with Chrome users were launched in March 2021. At the end of January 2022, Google officially discontinued the development of FLoC. To understand the criticism of FLoC, the reasons for its discontinuation and the differences from the new Topics API, it is worth taking a look at how the system worked.
How FLoC worked
Google’s web.dev blog explains in great detail how FLoC worked. The basic idea was to assign the browsing history of an Internet user to a cohort of similar histories. This assignment took place within the browser on the user’s own device. For this purpose, the browser evaluated the page history, checked which cohort it fit best and assigned a corresponding identifier, the so-called FLoC-ID. The calculation was to be repeated weekly, which would allow users to move from one cohort to another if, for example, their surfing behavior changed over time. According to a study by the Electronic Frontier Foundation (EFF), an NGO for fundamental rights in the information age, more than 33,000 different cohorts would have been possible with FLoC.
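To make the idea of “a cohort of similar histories” more tangible, here is a minimal sketch in Python. It assumes a SimHash-style locality-sensitive hash over the visited domains; the function names, the 16-bit width and the example domains are our own illustrative choices and not Google’s actual implementation. The sketch also leaves out any step that would enforce a minimum cohort size.

```python
import hashlib
from typing import Iterable


def cohort_id(domains: Iterable[str], bits: int = 16) -> int:
    """Collapse a set of visited domains into a short cohort number.

    Hypothetical SimHash-style sketch: every domain votes on every bit,
    so histories that share many domains tend to agree on most bits.
    """
    votes = [0] * bits
    for domain in set(domains):
        digest = int.from_bytes(hashlib.sha256(domain.encode()).digest(), "big")
        for i in range(bits):
            votes[i] += 1 if (digest >> i) & 1 else -1
    # A bit is set when the majority of domains voted for it.
    return sum(1 << i for i in range(bits) if votes[i] > 0)


def differing_bits(a: int, b: int) -> int:
    """Number of bit positions in which two cohort numbers differ."""
    return bin(a ^ b).count("1")


# Illustrative histories with made-up domains.
history_a = ["tennis.example", "rackets.example", "news.example"]
history_b = ["tennis.example", "rackets.example", "weather.example"]
id_a, id_b = cohort_id(history_a), cohort_id(history_b)
print(id_a, id_b, differing_bits(id_a, id_b))
```

The point of such a hash is that the ID itself is just a number, exactly as described above, while overlapping histories are more likely to land on nearby or identical numbers than unrelated ones.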
Both advertisers and ad providers could then observe and analyze the behavior of different cohorts on their sites based on the FLoC-IDs. Adtech platforms would bring the two sides together: advertisers could target the cohorts that are attractive to them and have their ads delivered to the corresponding users via the advertising providers’ sites.
Reasons for the end of FLoC
- As part of the testing phase starting in March 2021, FLoC was rolled out to 0.5% of Chrome users in the US, Australia, Brazil, Canada, India, Indonesia, Japan, Mexico, New Zealand, and the Philippines. However, FLoC was never deployed within the EU or the UK. One reason, according to AdExchanger, was concerns on Google’s part about compliance with the GDPR and the ePrivacy Directive. In particular, it was unclear which parties would act as controllers and processors within the meaning of the GDPR. Google? The users on whose devices the FLoC-ID is calculated? This raises a further question: the FLoC-ID alone may not count as personally identifiable information, but it is calculated from the user’s browser history, in other words from personal data. According to Article 6 GDPR, this processing would require a legal basis, such as the consent of the user or a legitimate interest. How exactly would Google have solved this?
- But these unresolved legal issues are probably not the only ones that doomed FLoC. The cookie alternative, which was touted as privacy-friendly, was also repeatedly criticized by privacy activists and organizations over the past year. One of the main arguments, including that of the aforementioned Electronic Frontier Foundation (EFF), was that FLoC enables the targeting of vulnerable groups, as well as misuse for discriminatory and politically manipulative purposes. The FLoC-ID alone does not reveal the characteristics or behaviors that connect the members of a cohort; the ID is merely a number. However, by means of appropriate analyses, companies could still find out which cohorts are linked to sensitive information (e.g. ethnicity, existing diseases or sexuality). According to the EFF, Google’s proposal to exclude certain sensitive pages from the FLoC calculation would not change the basic problem either. In addition, the EFF accused Google of facilitating tracking via browser fingerprinting with FLoC. With fingerprinting, information from the user’s browser and device (e.g. IP address, screen resolution, time zone, installed fonts, language settings, etc.) is read and combined to make the user uniquely identifiable among millions of other users. The FLoC-ID would give trackers a head start in fingerprinting, so that they would only have to distinguish the user from a few thousand other users.
- This criticism may also be one reason why numerous browser providers, including Microsoft (MS Edge), Mozilla (Firefox), and Apple (Safari), have signaled their opposition to integrating FLoC into their products. In light of a growing privacy movement and public debate in recent years, respect for the privacy of users has become an important differentiator in the battle for browser market share. Considering that Google’s in-house Chrome browser alone has a global market share of over 60%, a rejection of FLoC by the competition may not seem dramatic at first. But this was compounded by the fact that WordPress, in April last year, declared that it wanted to treat FLoC like a security issue and block it by default on its websites via a patch. That would be problematic for Google, as WordPress powers over 40% of all websites on the Internet. The FLoC system would therefore not only miss people who use FLoC-free browsers. The cohort assignment itself, on which FLoC relies, would also become less accurate, as many websites would not be included in the analysis of the browser history.
- The advertising industry was also skeptical about FLoC, although this attitude was presumably less the result of privacy concerns than of fears about the end of third-party cookies, and thus of even greater dependence on Google and other big players. In view of Google’s announcement that it wants to block third-party cookies in the Chrome browser, the industry refers to these plans as “Cookiegeddon” or the “Cookie Apocalypse”. Google currently plans to take this step in 2023, but at the same time gives assurances that it will not do so until reliable alternatives to cookie-based tracking are available. For many companies in the advertising industry whose business models rely on tracking, this is no reason to relax. The Central Council of the German Advertising Industry (ZAW) accuses Google of abusing its gatekeeper role in the browser market to distort free competition in the online advertising markets in its favor.
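The fingerprinting argument raised by the EFF above comes down to simple arithmetic, which the following Python sketch illustrates. The number of cohorts is the EFF figure cited earlier; the population size is a purely hypothetical round number, and the calculation assumes, for simplicity, that all cohorts are equally large.

```python
import math

COHORTS = 33_000          # cohort count from the EFF study cited above
POPULATION = 100_000_000  # hypothetical number of browsers, for illustration only


def identifying_bits(group_size: float) -> float:
    """Bits of information needed to single out one member of a group."""
    return math.log2(group_size)


bits_needed = identifying_bits(POPULATION)         # without any cohort ID
bits_from_floc = identifying_bits(COHORTS)         # contributed by the cohort ID
avg_cohort_size = POPULATION / COHORTS             # users sharing one ID
bits_remaining = identifying_bits(avg_cohort_size) # left for the fingerprinter

print(f"average cohort size: {avg_cohort_size:.0f} users")
print(f"bits needed: {bits_needed:.1f}, "
      f"from cohort ID: {bits_from_floc:.1f}, "
      f"remaining: {bits_remaining:.1f}")
```

Under these assumptions the cohort ID supplies more than half of the bits a tracker needs, and the remaining signals (screen resolution, fonts, time zone and so on) only have to separate the user from a few thousand others.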
The end of the story is known. Google announced at the end of January 2022 that it would stop developing FLoC and instead focus entirely on Topics.
Topics API: FLoC is dead, long live FLoC?
With the phasing out of FLoC, Google announced that it would now focus on the Topics API as a new tracking alternative for the post-cookie era. Topics are, as the name suggests, subjects that users are or might be interested in based on their browser history. A file available on GitHub shows the first version of a Topics taxonomy with 349 entries, including, for example, topics such as “Rap & Hip-Hop,” “Fishing,” “Weather” or “Low Cost & Last Minute Travel.” But wait: “Based on browser history”? That looks suspiciously familiar from FLoC. Is the Topics API just a new name for a known procedure?
How Topics works
In the associated blog post and on GitHub, Google provides extensive explanations of its new approach to a cookieless advertising world. The promise is more transparency and control. Will Google be able to deliver? For a better understanding of Topics, let’s first look at how it works in detail:
- When a user visits a website, callers (e.g., third-party ad tech providers or advertisers on the visited page) can retrieve up to three topics for that user: one topic for each of the last three weeks.
- Web pages are assigned to specific topics based on their hostname. For example, www.tennis.com could be assigned to the “Sports/Tennis” topic. A page can be assigned to one topic or to several.
- The user’s page history is evaluated within the browser on a weekly basis. Based on this, the top five topics of the week are selected, and a sixth topic is additionally drawn at random from the list of all topics. For each week, one of the five top topics is then chosen at random as the topic to be returned, with a 5% chance that the random sixth topic is returned instead.
- The caller now receives up to three top topics (one per week) in random order.
- For users who have opted out, who browse in incognito mode or who have deleted their browser history, no topics are returned.
- Users will, according to Google, be able to view their topics and delete “wrong topics” or those for which they do not want to see ads.
- For the same caller, the topics of an individual user remain the same over a period of three weeks, no matter how often the caller queries them. This is to prevent the caller from discovering additional topics of a user over several queries and thus getting a more detailed picture of that user.
- Furthermore, not every caller can query every topic. A caller only receives a topic when querying on another page if it has observed the user with that topic on one of the pages it is integrated on within the last three weeks. To put it simply: I, as a broker of advertising space, only receive topics if I am also integrated on the pages that the user has visited. I cannot pick up topics from pages that do not use my service. For ad tech companies, this creates an incentive to be integrated on as many websites as possible in order to be able to query the topics of as many users as possible. However, this rule also means that a site operator cannot query the topics of its own users unless it is part of a network with other operators or its ad partners feed this information back to it.
- Advertisers can now direct their targeting to topics that are relevant to them. As a manufacturer of tennis rackets, for example, I could have my ads shown to users with the topic “Tennis”.
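The selection steps in the list above can be sketched in a few lines of Python. This is only an illustration of the mechanism as described in this article, not the browser’s actual API: the mini taxonomy, the hostname mapping, all function names and the way the per-caller choice is kept stable (a hash of caller and week used as a random seed) are our own assumptions. For simplicity, the sketch applies the “observed by the caller” filter to every returned topic.

```python
import hashlib
import random
from collections import Counter

# Hypothetical mini taxonomy and hostname mapping, for illustration only.
TAXONOMY = ["Tennis", "Fishing", "Weather", "Rap & Hip-Hop", "News", "Cooking"]
HOST_TOPICS = {
    "tennis.example": ["Tennis"],
    "angler.example": ["Fishing"],
    "forecast.example": ["Weather"],
}


def weekly_topics(visited_hosts, rng):
    """Return (top five topics of the week, one random topic from the taxonomy)."""
    counts = Counter(t for host in visited_hosts for t in HOST_TOPICS.get(host, []))
    top_five = [topic for topic, _ in counts.most_common(5)]
    return top_five, rng.choice(TAXONOMY)


def topic_for_caller(top_five, random_topic, caller, week):
    """Pick one topic for a week; the same caller always gets the same answer."""
    if not top_five:
        return None
    # Assumption: seeding with caller and week keeps repeated queries stable.
    digest = hashlib.sha256(f"{caller}|{week}".encode()).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    if rng.random() < 0.05:  # 5% chance of returning the random sixth topic
        return random_topic
    return rng.choice(top_five)


def query_topics(caller, weeks, observed):
    """Return up to three topics (one per week) that the caller may receive.

    weeks:    list of (week_id, top_five, random_topic), oldest first
    observed: dict mapping a caller to the set of topics it has seen the user with
    """
    seen_by_caller = observed.get(caller, set())
    result = []
    for week_id, top_five, random_topic in weeks[-3:]:
        topic = topic_for_caller(top_five, random_topic, caller, week_id)
        if topic is not None and topic in seen_by_caller and topic not in result:
            result.append(topic)
    random.shuffle(result)  # topics are handed over in random order
    return result
```

A caller that is integrated on the tennis and fishing sites could receive “Tennis” or “Fishing”, whereas a caller that has never observed the user on any page receives an empty list, which mirrors the incentive for ad tech companies described above.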
The rest would probably be “business as usual”: Supply-side platforms and demand-side platforms bring together suppliers and buyers of ad space, and advertisements are sold to the highest bidder in an auction process.
The solution to FLoC’s problems?
It quickly becomes clear that although FLoC and Topics are both based on users’ browsing histories, the two systems function quite differently. While FLoC sorts people into a numbered drawer (cohort), Topics allows advertisers to obtain a limited set of information about user interests. But does this solve all the problems and open questions that arose with FLoC?
For data protection activists and organizations, the fact that users can influence the image they convey in Topics, for example by deleting individual topics or opting out of the process altogether, is likely to represent progress. Furthermore, Topics will make it more difficult to target vulnerable groups or to use targeting for abusive (political) purposes.
Despite all this, Topics will probably not be able to completely prevent such targeting. The concrete possibilities for this will depend on the further development of the taxonomy. However, with the current state of development, children, for example, could be disproportionately well reached by combined targeting on topics such as “Children’s Literature” and “Animated Films”. Targeting for political purposes would also be conceivable. A combination of the topics “Hunting and Shooting,” “Country Music” and “News/Politics” would probably reach a disproportionately large number of Republicans in America compared to Democrats.
Admittedly, these are only rough guesses on our part. With the appropriate tools and analyses, however, it will also be possible with Topics to find out which (vulnerable) target groups are disproportionately associated with which topic combinations.
With regard to the GDPR compliance of the new method, as with FLoC, the question remains as to who is considered the controller and/or processor in the sense of the regulation. The topics themselves may not constitute personally identifiable information, since they alone cannot be used to identify individuals – however, as was the case with FLoC, the browser history and thus personal data will be used to calculate the topics. Here, too, questions will arise about the legal basis for processing, i.e., consent or legitimate interest.
Even if we assume that Topics is compliant with the GDPR and that a legitimate interest can be asserted, this is unlikely to mean the end of consent banners as we know them today in Germany. This is because querying topics means retrieving information from the users’ browsers, i.e. from their end devices, which requires the users’ consent according to § 25 (1) TTDSG. Providers of consent management platforms and consulting agencies in consent management can therefore breathe a sigh of relief.
The future of Topics: A lot of open questions
Google wants to launch the first tests for developers in the coming weeks. It remains uncertain whether and when Topics will come to the EU, and what form it will take there. As outlined above, Topics also raises some questions about conformity with European data protection law, in particular with the GDPR.
In addition, it is questionable how the taxonomy will develop in the future. Which new topics will be added, and which will have to go? Who has the decision-making authority? And to what extent will the aforementioned problems surrounding abusive targeting be taken into account in the further development?
Furthermore, it is still unclear how well and reliably the topic assignment will work if it is to be based solely on the host name. What will be changed in the system to avoid bad or incorrect topic assignments? After all, www.spiegel.de, for example, would be best placed in one of the news topics and not in the topic “Home & Interior Decor” on the basis of its host name. Will website operators be able to assign topics to their pages themselves?
In the end, it remains to be seen how the various stakeholders, i.e. browser providers, website operators, advertisers, advertising providers, data protection organizations and, last but not least, Internet users, will position themselves on Topics. If the resistance is as strong as it was with FLoC, it is an open question whether Chrome will actually say goodbye to third-party cookies in 2023 or whether Google will have to find yet another approach for the post-cookie era. Things remain exciting!