Data Clean Rooms – The new era of data collaboration


Within the last decade, the data privacy and security landscape has undergone a true revolution. GDPR, the European Data Act, and the Data Governance Act have all put strict limitations on data sharing and usage. In a world where most decisions are data-driven, how can you reconcile these two worlds? Luckily, there’s a safe and legally compliant approach called Data Clean Rooms.

What are Data Clean Rooms?

Data clean rooms (DCRs) are a safe shared space where two (or more) parties can work with each other’s data without revealing sensitive information. This is enabled through data pseudonymization, where personal data is replaced with a “pseudonym” so that an unauthorized party cannot decode it.

This is as important for financial and healthcare organizations protecting Personally Identifiable Information (PII) as it is for commercial data use between, say, “Coca-Cola” and “PepsiCo”. The emphasis is placed more on the power of collective insights than on using data on an individual basis. Some of the possibilities include uncovering additional market potential, such as discovering new prospective clients.

Everybody benefits when data is widely available but remains under the control of its owners, which is why DCRs are one of the fastest-emerging trends in data today.

While originally intended for healthcare and financial services due to obligatory compliance requirements, DCRs have expanded to include just about everyone. In fact, according to the Interactive Advertising Bureau (IAB), 80% of all companies are already considering or using DCRs.

While adopting a DCR isn’t “automatic” and requires data infrastructure renewal, time, technology, and cash, the open system provides flexibility to new entrants in the field.

Before we dive into the benefits of DCRs and why they’re the future of compliant data usage, let’s first take a look at how these environments came to life.

Data Clean Rooms History & Background

When looking at the rise of DCRs, it’s important to look at the wider data privacy and usage history. There are two significant developments to be aware of – cookies and walled gardens.

The rise of cookies

Cookies were first created back in 1994, as part of the once-leading web browser called Netscape. Lou Montulli, a programmer, created a way to record user data in a text file he called a “cookie”, effectively creating a memory for the internet. Consequently, you didn’t need to start from scratch every time you logged on.

That browser cookie could remember what was in your shopping cart, what you had looked at before so it could provide links to get back to that content, and it could remember your username and password. That was an important first step to making the internet truly useful.

It was inevitable that someone would realize the potential of that text file of visitor information. Compiling that data could help them sell more products to their users. Selling that data became a revenue stream. Still, what many online corporations ultimately focused their monetization efforts on was not how they could sell the data directly. It was more about how they could aggregate data and use its wealth to build an advantage over their competition. In order to store the collected data safely, they needed to create a safe, closed environment that no one outside the organization had access to.

The birth of walled gardens

The first type of DCRs to emerge were called “walled gardens”. The term “wall” is no coincidence – they’re restrictive, inflexible, and all the data generated is only used internally. They present a “black box” to the outside world. With data confined within such a clean room, cooperation between partners simply cannot occur.

In their walled garden format, DCRs provided (and continue to offer) the advertising sector with environments to pool advertiser data (but not customer-level data). This increases effectiveness, but it is restrictive unless you have virtually unlimited data from a massive customer base. Google had that immense volume of data, so it did this first, farming its vast client data to enormous effect. Others of similar stature followed suit and became dominant in the data market.

Now technology is allowing everyone else to catch up. While walled gardens from companies like Amazon, Facebook, and Google (AF&G) may have a seemingly insurmountable lead, there is amazing power in cooperation, and there is room for others to compete. It is also much easier for people to criticize those at the top while rooting for an underdog to succeed.

Corporate inertia is working in your favor, too. AF&G see no reason to change their working and profitable strategies. Smaller outfits have flexibility, adapting quickly to current and future problems. Genuine competition keeps them (and their rivals) striving and pushes them to be creative rather than simply producing more of the same.

A new data privacy landscape

With all of the above in mind, the incredible commercial potential of data also came with a darker side – the abuse of sensitive data.

We will discuss the exact legal acts in more detail in the next section, but here is a quick overview:

  • In 2018 the User Privacy Era began in earnest. We saw the introduction of the GDPR (General Data Protection Regulation), as well as opt-out Intelligent Tracking Prevention from Apple.
  • 2019 saw Amazon introduce the Amazon Marketing Cloud, an early DCR, where collected data could be shared, but Personally Identifiable Information was preserved for clients.
  • In 2021, Apple introduced the App Tracking Transparency (ATT) prompt, removing the need to opt out of data gathering. Instead, you were automatically excluded from data gathering by apps and advertisers unless you opted in, which 96% of users elected NOT to do. That was a massive blow for advertisers, who could no longer track users between apps.

It is becoming harder and harder to gather data while staying on top of the latest global legislation. For instance, in 2021, Meta decided to stop sending user-level campaign data to commercial buyers and to send it only to MMPs (Mobile Measurement Partners) instead, to preserve PII and privacy. More organizations are expected to follow soon.

DCRs as the Best Solution for Compliant Data Analysis

DCRs are not the exclusive domain of advertisers, of course. Any company, any industry, and any service can benefit from them. There are innumerable use cases for DCRs and they will comprise the future open ecosystem used by most.

Open Ecosystem Data Clean Rooms

Also called “Partner Clean Rooms”, Open Ecosystem Clean Rooms (OECRs) have numerous advantages over the closed variety. They control what information enters, how it can be joined with other data, the permissible analytics that parties can perform on the data, and which data can be removed by non-owners.

All PII is encrypted and secured, controlled by the OECR operator. Approved member partners can obtain a data feed with the anonymized data.
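
To illustrate what “permissible analytics” can look like in practice, here is a minimal sketch in Python. It assumes a rule that is not described above: results are released only for groups above a minimum size, so that no output describes a single person. The field names, the threshold, and the function name are hypothetical and do not reflect any particular operator’s API.

```python
from collections import defaultdict

# Hypothetical policy: suppress any group smaller than this (an assumption).
MIN_GROUP_SIZE = 3


def aggregate_spend(rows, min_group_size=MIN_GROUP_SIZE):
    """Return count and average spend per segment, hiding small groups."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["segment"]].append(row["spend"])
    return {
        segment: {"count": len(values), "avg_spend": sum(values) / len(values)}
        for segment, values in groups.items()
        if len(values) >= min_group_size
    }


# Pseudonymized rows as they might sit inside the clean room (made-up data).
rows = [
    {"pid": "a1", "segment": "sports", "spend": 10},
    {"pid": "b2", "segment": "sports", "spend": 20},
    {"pid": "c3", "segment": "sports", "spend": 30},
    {"pid": "d4", "segment": "travel", "spend": 99},
]

report = aggregate_spend(rows)
print(report)
```

In this sketch the “travel” segment has a single member, so it is left out of the report entirely, while the “sports” segment is returned only as a count and an average.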

Why are data clean rooms gaining momentum?

We’ve already discussed the reasons why DCRs came to life. But what indicates that they’ve got a bright future ahead of them? Here are some of the key factors.

Keeping up with the security, compliance, and privacy laws landscape

Personally Identifiable Information, as defined in the United States, means name, address, Social Security Number, or any identifying number or code to directly identify an individual. The EU’s “Personal Data” is somewhat more extensive, including all PII, and much more.

A DCR provides optimum security for protecting PII, but the GDPR requires protection beyond that. It encompasses even indirect attributes and identifiers, defining personal data as “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.”

That’s a lot to take in. It boils down to the fact that, under the GDPR, shared data should not allow any individual to be identified beyond a pseudonymized identifier. Pseudonymization is defined in the statute itself: it prevents the re-identification of an individual without separately stored “keys” held by the data controller or its designee.

All this means that DCR users have to pre-identify which technical controls they will need to meet those requirements before they are up and running. Statutory Pseudonymization is the easiest answer.
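
As a rough illustration of pseudonymization with a separately stored key, here is a minimal Python sketch. It uses a keyed hash (HMAC-SHA256) from the standard library as one possible technique; this is an assumption for illustration, not a statement of what the GDPR mandates or of how any specific DCR works. The record fields and variable names are made up.

```python
import hashlib
import hmac
import secrets


def pseudonymize(identifier: str, key: bytes) -> str:
    """Map an identifier to a stable pseudonym using a keyed hash."""
    normalized = identifier.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# The key stays with the data controller, stored apart from the shared data.
controller_key = secrets.token_bytes(32)

customers = [
    {"email": "Alice@example.com", "spend": 120},
    {"email": "bob@example.com", "spend": 75},
]

# Only the pseudonym and the non-identifying attribute leave the controller.
shared_rows = [
    {"pid": pseudonymize(c["email"], controller_key), "spend": c["spend"]}
    for c in customers
]
print(shared_rows)
```

Whoever holds the key can recompute the pseudonym for a known identifier and link records back to a person. Without the key, the same identifier cannot be mapped to its pseudonym, which is why the key has to be stored separately from the shared data.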

Deprecation of Third-Party Cookies & Mobile IDs

Google first announced in 2020 that it planned to phase out 3rd party cookies in Chrome and has since pushed its target date to 2024, joining a mounting list of browsers like Firefox which have long since scrapped the infamous tracking technology. This Google Cookie deprecation has left companies and advertisers scrambling to figure out how they will access data in a way that meets their needs.

Recent studies have shown that the GDPR has reduced users’ opt-in sharing of their data by nearly 11%. This has significantly reduced the quantity of data available to businesses. Combined with Apple’s ATT framework, Facebook restricting user-level data, and Google killing 3rd party cookies, there is not much left for advertisers to work with.

As a result of the changes, most companies will have to resort to 1st party data rather than 3rd party data. Now they’ll be obliged to provide quality over quantity to their considerably smaller audiences. It’s somewhat akin to personally presenting a fine meal rather than inviting hundreds of patrons into a buffet restaurant.

AF&G will continue with their “walled gardens”, of course. Their data clean rooms continue serving commercial users (albeit at a high price). For everyone else, DCRs created and built amongst ourselves, and specifically not beholden to AF&G, are the best solution.

Consumer Privacy Expectations on the Rise

In the last half dozen years people have become acutely aware that most businesses know a lot more about them than they ever imagined. Not for the first time, two universities in Europe have shown that they can “de-anonymize” data with only 15 demographic attributes in 99.98% of cases studied by using publicly available data.

Keeping your data out of their hands by storing it in a controlled environment makes it much safer. Your customers will have more confidence in your business if they know you use DCRs to manage your data about them.
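
The de-anonymization risk mentioned above comes from combining attributes that look harmless on their own. The following Python sketch, using made-up records and hypothetical attribute names, shows how the share of people who can be singled out grows as more attributes are combined. It is a toy illustration, not the method used in the research referred to above.

```python
from collections import Counter


def unique_fraction(rows, attributes):
    """Share of rows whose combination of the given attributes is unique."""
    keys = [tuple(row[a] for a in attributes) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)


# Made-up "anonymous" records: no names, only demographic attributes.
people = [
    {"zip": "10115", "birth_year": 1980, "gender": "F"},
    {"zip": "10115", "birth_year": 1980, "gender": "M"},
    {"zip": "10115", "birth_year": 1991, "gender": "F"},
    {"zip": "20095", "birth_year": 1980, "gender": "F"},
]

for attrs in (["zip"], ["zip", "birth_year"], ["zip", "birth_year", "gender"]):
    print(attrs, unique_fraction(people, attrs))
```

With one attribute, a quarter of these records are unique; with two, half; with all three, every record is unique and could be matched against an outside data source that carries names.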

Companies have made little effort to boost consumer confidence in their data handling since the introduction of GDPR or numerous other laws, like the California Consumer Privacy Act (CCPA). DCRs encourage data transparency, showing responsibility with the entrusted data, increasing consumer confidence, and promoting a true exchange of value between the brand and the client. If clients decide to provide more data, their experience should improve, providing increased value for the precious information they are providing to the company.

To accomplish this, companies must establish that they are using the highest standard of privacy procedures during processing or sharing. They must show that data is encrypted, pseudonymized, and cannot be decoded by anyone except the company itself.

With the number of unique cookies decreasing by nearly 13% in the two years following GDPR introduction, users are taking back control of their PII. They do so by opting out of cookie use, using cookie blockers, or switching to more robust and secure browsers that automatically stop 3rd party cookies from being placed on a computer in the first place.

The result is that users are vanishing from the data sets as they employ these methods. They no longer have their former extensive histories that businesses could draw upon. Instead, they look like new users every time they log on, even from multiple tabs in the same session. Providers need to improve their game, providing real value. This will encourage customers to return to their previous levels of ease in dealing with them.

Collection & Data Sharing Continues Evolving

Upset or outrage at the casual data protection practices employed by businesses gave rise to 79% of U.S. consumers demanding better safeguarding techniques, and governments passing more demanding regulations for businesses. The EU has had greater protection since 2018’s introduction of the GDPR. The CCPA covers much of the same territory, supplemented by 2020’s introduction of the California Privacy Rights Act (CPRA), expanding upon the original act.

Splintering and scattering of data requires unification to rebuild complete data views to make them useful again. Currently, data is siloed and disparate. DCRs offer the only really strong way forward so that data can be consolidated and activated for proper use.

Smaller Data Owners can Monetize Data

AF&G have been the prime beneficiaries of data for years now. Walled gardens they may be, but all have been found in violation of privacy rules on many occasions. Meta (Facebook) was fined €390 million in January 2023 for “overreaching” in its targeted ads; Amazon was fined €746 million (about $877 million) in 2021; and Google, Equifax, Instagram, T-Mobile, and numerous others have faced penalties as well.

Smaller businesses can profit from the heightened safety of DCR landscapes. Obtaining permission from consumers to process their data responsibly and to collaborate with other data owners, while shielding PII from 2nd parties, increases trust and brand loyalty. Showing that the information is protected, encrypted, shared discreetly (separating information from identity), and pseudonymized reassures consumers and meets the legal requirements for protecting them, too.

DCR owners and users need to inform their customers that such data is safe within the DCR, and that technical controls travel with the data during movement or utilization by delinking all direct or indirect identifiers. Additional computational techniques are employed to enhance privacy, specifically to prevent the data from being recombined with other sources in an attempt to de-anonymize it and re-identify the original data subject.

This trust allows such a DCR to connect with many other data sources. If you don’t have all the pieces of the puzzle, the answer is out there somewhere. What an SME cannot do alone, it can accomplish through cooperation with others.

The Future of Data Clean Rooms

People’s expectations have changed—users want a better Customer Experience (CX). However, you can’t do a better job without understanding your customers, and that takes data to develop those insights.

Increased regulation escalates the difficulty since you can’t do “anything you want” with the data. The data-neutral environs of a DCR are privacy-safe to meet obligations to your clients but allow 1st and 2nd party data to be combined and evolved to find those niches you can expand into. Now you can have your cake and eat it too, since DCRs allow you to have both personalization and pseudonymization!

The year 2022 saw significant advancements in implementing this technology more broadly and effectively. Adoption should also take a big leap forward as the effects precipitate and become apparent to most.

While Amazon begins to provide AWS Clean Rooms, companies like Snowflake and Databricks are making a move into the market. This significantly lowers the entry barriers to testing for smaller organizations that can now utilize DCRs without operating them or being responsible for their administration and upkeep.

What that means is that more data-driven applications will arise all across industries, powered by data flows from DCRs. You’ll see ventures across the board, probably led by advertisers, then healthcare, and finally agencies, media outlets, and all the rest, as momentum builds.

Seeing AWS having early success, Google is sure to follow, slowly and ponderously as they do. Like an ocean-going tanker, they’ll take a long time to get up to speed. Meanwhile all the smaller vessels have a chance to get away out in front and establish a presence. That can be you.

Is Your Business Ready for DCR?

Commercial preparedness

C-Suite members and other leaders need to understand why their organizations need to engage with DCRs. The fundamental business motivation for a combined data partnership informs the nature of the commercial model. It defines the most appropriate features for the clean room and which technologies to implement.

Customer preparedness

Data owners should be aware of the available technologies, processes, people, and procurement ability of current and future customers. They need to look for offerings that are practical and serviceable for a diversity of users based on extant standards, technologies, and procurement procedures.

Technology compatibility

DCRs enable a high degree of interactivity through data sharing and technology. They require both parties to have well-matched and commercially favorable technology.

Deciding which features are needed to enable collaboration with prospective data partners is important. DCRs require parties to agree on numerous areas, such as which cloud platform, data technologies, and commercial models are best suited for the task, while allowing all parties to meet on common ground.

Additional Considerations

  • Does DCR implementation impact your cybersecurity risk? How will you assess and monitor security?
  • Which privacy and compliance regulations apply to your data? How will you manage access?
  • You’ll generally require consent to share data from customers and owners. How will it be obtained and managed to allow collaboration?
  • There are Regulatory frameworks and legal requirements applicable to your types of collaboration. How will you meet them?
  • There are no one-size-fits-all DCRs. How can you meet the needs of users in a DCR so that you are competitive with other offerings?

There are many possibilities for implementation. You may have multiple DCRs to service different needs.

Choose Trusted Twin as your Data Clean Room

This is the strategy that we follow at Trusted Twin:

Cooperating partners agree to exchange data via Trusted Twin in a secure form (pseudonymization). They agree on certain identifiers, encrypt them, and then upload them to the platform together with other encrypted data.

The second partner queries the first partner’s data for a given identifier. If it is present, they obtain specific information. Entities can draw on the partner’s data without ever seeing it, and vice versa. There is no exchange of PII.
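
The exchange described above can be sketched in a few lines of Python. This is a simplified illustration only, not Trusted Twin’s actual API: the `CleanRoom` class, the `encode_id` helper, and the shared key are all hypothetical, and a keyed hash stands in for whatever encryption scheme the partners actually agree on.

```python
import hashlib
import hmac

# Hypothetical secret the partners agree on outside the platform.
SHARED_KEY = b"agreed-out-of-band"


def encode_id(identifier: str, key: bytes = SHARED_KEY) -> str:
    """Turn an agreed identifier (here an email) into an opaque token."""
    normalized = identifier.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


class CleanRoom:
    """Hypothetical in-memory stand-in for a clean room platform."""

    def __init__(self):
        self._data = {}

    def upload(self, partner: str, records: dict) -> None:
        # records maps an encoded identifier to the attributes being shared
        self._data.setdefault(partner, {}).update(records)

    def lookup(self, partner: str, token: str):
        # Returns the shared attributes, or None if the identifier is absent
        return self._data.get(partner, {}).get(token)


room = CleanRoom()

# Partner A uploads encoded identifiers with the attributes it agreed to share.
room.upload("partner_a", {encode_id("alice@example.com"): {"loyalty_tier": "gold"}})

# Partner B encodes its own copy of the identifier and asks whether it exists.
match = room.lookup("partner_a", encode_id(" Alice@Example.com "))
miss = room.lookup("partner_a", encode_id("carol@example.com"))
print(match, miss)
```

In this sketch the platform only ever stores and compares opaque tokens. Partner B learns the agreed attribute for identifiers both sides already hold and gets nothing back for identifiers the other side does not have.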

This solution covers both definitions of personal data—those in the US (PII) and in the EU. Clearly, Trusted Twin is your best choice for DCR implementation… So let’s get to work today!

Let's discuss how Trusted Twin can enable data collaboration for your business.