Today the IAB Tech Lab is publishing version 1.0 of the Open Private Join and Activation (OPJA) clean room interoperability standard. Over the past year, together with a growing number of industry collaborators and members of the Tech Lab’s Privacy Enhancing Technologies (PETs) and Rearc Addressability working groups, our team played a leading role in developing OPJA with the goal of enabling interoperable, privacy-safe ad activation based on PII data.
Beyond our work on the initial proposal, we have several broader goals with OPJA:
- We aim to define an open and standard set of requirements for a type of clean room operation that enables an advertiser and a publisher to match sensitive datasets containing user PII, such as email addresses or phone numbers, while limiting information exchange between parties as much as possible.
- We want to develop and promote the adoption of standard mechanisms in OpenRTB that enable ad targeting of OPJA-matched user ad impressions, using any compatible SSP or DSP.
- We want to provide open reference implementations that enable OPJA while adhering to the stated requirements.
- We want to support OPJA’s encrypted labels as a way of securely activating matched audiences from Optable, and to interoperate with other vendors based on OPJA’s secure matching mechanisms.
While we think that there is room for clean room vendors and collaboration platforms to offer their own proprietary spin on the activation use case (many already do), we’re hoping that they will make an effort to evaluate and align their implementations to better adhere to OPJA, and we intend to make it easy for them to do so.
To achieve these goals, it was imperative to agree on an independently trustworthy way of matching and activating user data in a setting that involves multiple clean room vendors.
Doing this work in the open is essential, as it ensures that the work is widely accessible and that any vendor can contribute ideas and review the proposed protocols and technologies. Open source promotes transparency, collaboration, and inclusiveness in the development process. We believe that providing a common foundation that anyone can access, modify, and contribute to is the way to achieve interoperability between all vendors, instead of a select few.
We decided to focus our initial interoperability standards efforts on the activation use case not only because it is a frequently encountered use case in industry, but also because we have noticed confusion regarding the extent to which user information is exchanged between parties that enable the use case in proprietary ways today.
On the surface, activation of overlapping audiences matched using a clean room is straightforward. Consider the case of an advertiser with a list of customers that wants to display ads to those customers when they are interacting with a publisher’s websites or applications. If users have provided personally identifying information, such as their email address, to both the publisher and advertiser directly, then the advertiser and publisher can compare datasets in a clean room in order to construct an audience of overlapping users. Here’s a Venn diagram illustrating the operation:
The operation looks simple, but when it comes to the sharing of information associated with individual users, several subtle but material differences may arise when it is performed in practice. Notably, what new user information could the advertiser and the publisher learn as a result of performing the match and targeting operation? Will the advertiser be able to track which of its individual customers are also browsing the publisher’s websites? And will the publisher learn which of its registered users are also the advertiser’s customers?
To answer such questions, the OPJA specification includes a standard set of security and privacy design goals, input and output requirements, and clear documentation of the extent to which private user information is exchanged between parties when enabling the ad activation use case. Ultimately, our goal with OPJA is to enable ad targeting on overlapping users without the parties leaking user information to each other. This is not only good for end user privacy, but it also prevents data sharing that could be exploited by competitors.
Raising the Privacy Bar
A defining characteristic of clean rooms is their potential to limit the scope of the processing of user data controlled by multiple parties. A simple example of this in practice is the construction of an aggregate report describing the intersection of two audiences originating from separate parties. In such a report, the joining, grouping, aggregation, and statistical noise injection can all be performed in a data clean room, thus preventing either party from learning anything about the other party’s data, other than what is included in the prescribed report.
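As a rough illustration of such a report (not something taken from the OPJA specification), the sketch below joins two toy datasets on a shared key, counts the overlap per publisher segment, and adds noise to each count. The function name `noisy_overlap_report`, the segment names, and the choice of Laplace noise with scale `1 / epsilon` are all assumptions made for the example; a real clean room would run this logic where neither party can see the intermediate join.

```python
import random
from collections import Counter


def noisy_overlap_report(advertiser_keys, publisher_rows, epsilon=1.0, rng=None):
    """Count overlapping users per publisher segment, then add Laplace noise.

    advertiser_keys: iterable of join keys (hypothetical, e.g. hashed emails)
    publisher_rows:  iterable of (join_key, segment) pairs
    epsilon:         assumed noise parameter; smaller means more noise
    """
    rng = rng or random.Random()
    segment_by_key = dict(publisher_rows)
    advertiser_keys = set(advertiser_keys)

    # Join and group: count advertiser keys that appear in each segment.
    counts = Counter(
        segment for key, segment in segment_by_key.items() if key in advertiser_keys
    )

    # Report every known segment, including those with a true count of zero,
    # so that the set of reported segments does not itself reveal the overlap.
    report = {}
    for segment in sorted(set(segment_by_key.values())):
        # The difference of two exponential samples with rate epsilon is
        # Laplace-distributed with scale 1 / epsilon.
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        report[segment] = counts[segment] + noise
    return report


advertiser = ["u1", "u2", "u3", "u4"]
publisher = [("u2", "sports"), ("u3", "sports"), ("u5", "news"), ("u6", "news")]
report = noisy_overlap_report(advertiser, publisher, epsilon=0.5)
print(report)  # only these noisy per-segment counts would leave the clean room
```

Only the noisy counts are returned; the joined rows themselves never leave the function, which mirrors the idea that each party learns nothing beyond the prescribed report.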
This limiting capability of data clean rooms is inherent in the activation matching operation prescribed by the OPJA specification. In OPJA, a secure match is performed in order to determine which individual users are in the intersection of audiences originating from an advertiser and a publisher. Rather than the list of matched users being shared with either party, the presence or absence of each user in the intersection is encoded in the form of a label and is then encrypted. These encrypted user labels are shared with the publisher, who cannot decrypt them but can insert them into ad requests. Ad requests are processed by ad tech (SSPs and DSPs), and only the advertiser’s designated DSP can decrypt the corresponding match labels, enabling the DSP to decide whether and how much to bid for the opportunity to show an ad. Critically, PII such as email addresses or phone numbers is never shared or transferred in ad requests, or anywhere else outside of the match operation.
Equally important, label encryption lets OPJA hide which individual users are in the audience intersection from both the advertiser and the publisher. This reduces data leakage between advertisers and publishers, and enables remarketing without requiring user tracking. Fundamentally, it’s an approach that adheres to the data minimization and purpose limitation principles of privacy by design.
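To make the label flow concrete, here is a toy sketch in Python. It is not the encryption scheme defined by OPJA: the functions `encrypt_label` and `decrypt_label`, the one-bit label format, and the HMAC-derived pad are hypothetical stand-ins chosen so the example runs with the standard library alone. In particular, this toy uses a shared symmetric key and has no integrity protection, whereas a real deployment would use the scheme the specification prescribes.

```python
import hashlib
import hmac
import secrets


def encrypt_label(dsp_key: bytes, matched: bool) -> bytes:
    """Toy randomized encryption of a one-bit match label (illustration only)."""
    nonce = secrets.token_bytes(16)  # fresh per label, so equal labels look different
    pad = hmac.new(dsp_key, nonce, hashlib.sha256).digest()[0]
    return nonce + bytes([pad ^ int(matched)])


def decrypt_label(dsp_key: bytes, blob: bytes) -> bool:
    """Recover the match bit; only a holder of dsp_key can do this."""
    nonce, body = blob[:16], blob[16]
    pad = hmac.new(dsp_key, nonce, hashlib.sha256).digest()[0]
    return bool(pad ^ body)


# Assumption for this toy: the key is shared by the matching helper and the DSP.
dsp_key = secrets.token_bytes(32)

# Matching side: one encrypted label per publisher user, matched or not.
match_result = {"pub-user-1": True, "pub-user-2": False}
encrypted_labels = {uid: encrypt_label(dsp_key, m) for uid, m in match_result.items()}

# Publisher side: holds opaque blobs and attaches one to an ad request.
# The "opja_label" field name is hypothetical, not an OpenRTB field.
ad_request = {"user": "pub-user-1", "opja_label": encrypted_labels["pub-user-1"].hex()}

# DSP side: decrypts the label and decides whether to bid.
is_match = decrypt_label(dsp_key, bytes.fromhex(ad_request["opja_label"]))
print("bid" if is_match else "no bid")
```

Because every user receives a label, and matched and unmatched labels are the same length and freshly randomized, the publisher cannot tell from the blobs alone which users are in the intersection.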
Privacy Enhancing Technologies
OPJA outlines two approaches enabling the matching of user PII data in the multi-vendor setting, and they’re both based on Privacy Enhancing Technologies (PETs). The first is a purely software-based, delegated private set intersection. This method enables the comparison of encrypted datasets using commutative encryption, without decrypting the data. The delegated helper server cannot decrypt the match data and is used merely to execute the data comparison and generate encrypted data for activation. Additional trust in the helper server could be provided through hardware-based remote attestation.
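The core idea behind commutative encryption can be sketched in a few lines of Python. This is a generic illustration using modular exponentiation, not the protocol or parameters specified by OPJA: the 127-bit prime, the helper names `hash_to_group`, `keygen`, and `encrypt`, and the message flow are assumptions chosen for brevity, and the parameters are far too small for real use.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # a Mersenne prime; toy-sized, for illustration only


def hash_to_group(identifier: str) -> int:
    """Normalize an identifier and map it to an integer in [1, P - 1]."""
    digest = hashlib.sha256(identifier.strip().lower().encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def keygen() -> int:
    """Pick a secret exponent coprime to P - 1 so encryption is a bijection."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def encrypt(value: int, key: int) -> int:
    """Exponentiation mod P; applying two keys in either order gives the same result."""
    return pow(value, key, P)


advertiser_emails = ["alice@example.com", "bob@example.com", "carol@example.com"]
publisher_emails = ["Carol@example.com ", "dave@example.com", "alice@example.com"]
adv_key, pub_key = keygen(), keygen()

# Each party encrypts its own data, then the other party adds a second layer.
adv_twice = {encrypt(encrypt(hash_to_group(e), adv_key), pub_key) for e in advertiser_emails}
pub_twice = [encrypt(encrypt(hash_to_group(e), pub_key), adv_key) for e in publisher_emails]

# The helper sees only doubly encrypted values and holds neither key,
# yet equal inputs produce equal ciphertexts, so it can compare them.
match_flags = [c in adv_twice for c in pub_twice]
print(match_flags)  # [True, False, True]
```

In this sketch the helper outputs one match flag per publisher row; in a full system those flags would feed the encrypted labels used for activation rather than being revealed directly.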
The second approach is based on hardware-based Trusted Execution Environments (TEEs). This method ensures that match data is encrypted exclusively for the secure processing hardware provided by a helper server.
The use of PETs offers a robust foundation for establishing trust between vendors in how user data is matched. OPJA matching requires that the data remains protected with encryption during processing, through a combination of cryptographic software and TEE hardware. This greatly reduces the number of things that vendors and service providers need to trust each other with.
OPJA’s matching approaches are also, in principle, not limited to a single cloud or infrastructure environment. These characteristics make PET-based approaches strong candidates for interoperable matching in the multi-vendor setting.
For a fun introduction to OPJA, check out Digiday’s excellent “WTF is IAB Tech Lab’s Open Private Join and Activation?”
For a simple walkthrough on how commutative encryption can be used to enable double blind matching (not specific to OPJA), have a look at the little explainer here.
Finally, it’s our hope that OPJA is a catalyst for future open proposals associated with measurement, audience modelling, and other use cases that involve the sharing of sensitive user data between advertisers and publishers.