The attacker in this mechanism has a 50% chance of correctly guessing the true answer a priori: an answer to a yes-or-no question is either yes or no. This knowledge is represented as the initial suspicion found along the x axis, at the 0.5 mark. Tracing this initial suspicion upward lands on the thick black diagonal line, which may be thought of as the home-base position (i.e., the attacker learned nothing in moving from initial to updated suspicion). Using Figure 4’s Algorithm 2, we found that the attacker’s guess may improve by at most 25%, for a highest-possible confidence of 75%. This is represented by adding 25% vertically to that diagonal line, ending at 75% along the y axis. That point on the y axis is the a posteriori confidence: the attacker’s belief in the correctness of a guess after seeing the mechanism’s output.
This final value is known as the upper bound: no matter which output the adversary witnesses, the adversary’s confidence that a provided answer is the real, truthful answer can rise no higher than this percentage. The lower bound moves confidence in the opposite direction and represents the best-case scenario for a respondent. Based on an observed output of a mechanism (e.g., a “no” answer in randomized response), the attacker may lose confidence in a guess (e.g., you thought there was a 50% chance of something happening, but after seeing a particular output value, your confidence drops to 25%). Another way to think of these two boundaries is that not all answers are equal; some outputs are more likely than others, so an attacker’s confidence may shift by different amounts depending on which output is observed. How much it can shift is determined by how the mechanism is built.
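To make the arithmetic behind these bounds explicit, the short sketch below (our own illustration in Python, not drawn from any source cited here) applies the standard Bayesian reading of an ε-differentially private mechanism: the attacker’s posterior odds can differ from the prior odds by at most a factor of e^ε in either direction.

```python
import math

def upper_bound(prior, epsilon):
    """Attacker's highest possible a posteriori confidence, reached after
    seeing the most incriminating output of an epsilon-DP mechanism."""
    return math.exp(epsilon) * prior / (math.exp(epsilon) * prior + (1 - prior))

def lower_bound(prior, epsilon):
    """Attacker's lowest possible a posteriori confidence, reached after
    seeing the most exculpatory output."""
    return prior / (prior + math.exp(epsilon) * (1 - prior))

eps = math.log(3)             # roughly 1.098, the epsilon of Algorithm 2
print(upper_bound(0.5, eps))  # ~0.75: the 75% upper bound
print(lower_bound(0.5, eps))  # ~0.25: the 25% lower bound
```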
To make this dance of probabilities concrete, imagine you own a crystal ball that tells you whether it will rain tomorrow: yes or no. Unfortunately, because the ball is magical, it is regulated, and you are only allowed to access predictions from the ball that have been sanitized using differential privacy. Further assume that you know the mechanism the crystal ball uses has an epsilon value of 1.098. Given that there is, at baseline, a 50% chance that it will rain tomorrow, if your crystal ball answers “yes,” then you can be 75% confident that it will rain tomorrow, and this might be high enough for you to carry an umbrella. Even though differential privacy is being used, the mechanism’s output greatly impacts your decision to carry an umbrella.
On the other hand, assume the crystal ball uses a (.08)-differentially private mechanism to sanitize its future-predicting outputs. If your a priori guess that it will rain tomorrow is again 50%, and the crystal ball says “yes” (the same setup as before, with a revised epsilon value), then you have gained only a 2% boost in confidence. You can now say there is a 52% chance of rain tomorrow, and that might not be high enough for you to take an umbrella. In other words, learning the output of this particular (.08)-differentially private mechanism does very little for your choice in umbrella encumbrance.
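Plugging the crystal ball’s two epsilon values into the illustrative upper_bound() helper sketched above reproduces both figures:

```python
# Continuing the illustrative sketch above.
print(round(upper_bound(0.5, 1.098), 2))  # ~0.75: 75% confident it will rain
print(round(upper_bound(0.5, 0.08), 2))   # ~0.52: only a 2% boost over the 50% baseline
```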
This fluidity in confidence is what must be translated into legal language. At a high level, lower epsilons mean that the data provided by a mechanism are more heavily sanitized, so a statute that is highly sensitive to the risk of privacy loss (i.e., the risk of reidentification) would be more likely to approve the mechanism’s outputs. However, such a cursory view misses a few important nuances. Three options present themselves for accurately and portably packaging a mechanism’s risk of reidentification, and each is discussed in turn.
a. Epsilon Alone
One possibility for translating a mechanism’s legal risk is simply to use epsilon alone. On the positive side, this approach places the focus on an easily adjustable quantity, allowing simple changes in epsilon to reposition the legal view of a mechanism’s sanitization abilities. The downside, however, is that this approach is not very granular. Low epsilons may be considered more private, as “small [epsilons] are happy epsilons,” but distinguishing between an epsilon of .01 versus .05 versus 1.0 would be practically difficult. That lack of granularity matters because not all data are created equal, and the purposes of data exploration are not equal either (i.e., some objectives are more worthwhile than others). If a court has trouble distinguishing between “small” epsilons, it could permit sharing when the risk is, in reality, too high.
Additionally, an epsilon value considered by itself carries no context, which may produce a rubber-stamping effect on certain mechanisms. The quantity being assessed should be the mechanism’s capacity to hand an attacker a lot or a little information, and epsilon alone gives no sense of how much information the attacker actually gains. Indeed, an epsilon value of 1.098 may seem low, but it corresponds to a 25% boost in confidence upon observing certain outputs. Depending on the scenario and the initial suspicion probability, a 25% boost could be an untenable amount of privacy loss. Therefore, epsilon alone is likely a poor fit for a legally portable understanding of differential privacy.
b. Upper Bounds
A second option for a legal comparator is to consider the upper bound produced by a mechanism (e.g., the 75% in Algorithm 2). This approach has the benefit of capturing the worst-case scenario for any user’s data that may be in the dataset. Because not all answers provided by a mechanism carry the same amount of risk (e.g., in the randomized response mechanism discussed in Section III.A above, observing a “yes” answer carried the most risk, with an upper bound of 75%), this quantity appropriately captures the riskiest possible output: the best case for the attacker and the worst case for the respondent.
The downside to this approach, however, is that only the upper limit is taken into consideration. In this way, the measurement may oversell the adversary, leading a court to be more wary of a situation that presents less risk than perceived. For example, at an initial suspicion level of 75% and an epsilon value of one, the attacker ends with an 89.08% upper bound. Although 75% is fairly high to begin with, the epsilon value used here is, in some ways, low. Despite this, a nearly 90% upper bound probability is unlikely to be approved by a court looking to protect a user’s data.
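The 89.08% figure follows from the same illustrative arithmetic sketched earlier:

```python
# Continuing the illustrative sketch: 75% initial suspicion, epsilon of one.
print(round(upper_bound(0.75, 1.0) * 100, 2))  # ~89.08
```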
In summary, however beneficial it may be to consider the worst-case scenario, given that we would be matching this number against the maximum risk permitted by a statute, this comparator ignores the attacker’s a priori guessing ability, which is useful context for a court to consider. For this reason, the upper bound is less likely to be the best fit for the type of legal comparator we are looking for.
c. Guess Difference
A final possibility is to use what we deem the guess difference. The guess difference is the difference between the initial suspicion and the upper bound; in short, it subtracts out what the attacker already knew and keeps only what was learned from the algorithm’s output in the attacker’s best-case scenario. For example, an initial suspicion of 50% with an epsilon of 1.098 (i.e., Algorithm 2) produces an upper bound of 75%, so the guess difference is 25%, the gap between the initial suspicion and the upper bound.
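In terms of the earlier sketch, the guess difference is simply the upper bound minus the prior (again, our own illustration rather than a formula drawn from a cited source):

```python
def guess_difference(prior, epsilon):
    """What the attacker learns beyond the a priori guess,
    in the attacker's best-case scenario."""
    return upper_bound(prior, epsilon) - prior

print(round(guess_difference(0.5, 1.098), 2))  # ~0.25: the 25% of Algorithm 2
```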
This approach allows us to take into account the fact that some questions are more privacy sensitive than others, by relying on the upper bound, but tempers this by removing the default guessability of a query. To be sure, the guess difference may in this way undersell the attacker’s overall guessing ability. For instance, it might seem odd that a high initial suspicion and a low epsilon value nonetheless produce a low guess difference, even though the attacker started with a strong guess and that guess only grew stronger after seeing the mechanism’s output. Looking closely at the aims of differential privacy, however, shows that this is likely a moot point.
Differential privacy does not concern itself with information not gleaned from the dataset. Imagine that an individual who has a particular disease participates in an experimental drug study whose data are protected using differential privacy. Further imagine that the published result of the study is that the experimental drug increased life expectancy by one year. Would we say that differential privacy failed to protect this individual if the individual’s insurance rates rise after the insurance provider learns of this exact study and its conclusion? No.
The insurance company learned from the broad result published by the study, which differential privacy does not claim to protect. If, on the other hand, the insurance provider increased the individual’s rates after querying the dataset and arriving at some confidence that the individual was “in” the dataset, meaning the individual had the potentially life-threatening disease, then we would say that differential privacy failed to protect the individual. Differential privacy allows us to draw hard lines around how much the insurance company may learn from the data, and the guess difference captures that ability. For instance, we may say that the insurance company will never be able to improve a blind guess by more than 2%: a blind guess that this individual is a smoker cannot be confirmed by querying the data, because the likelihood that the guess is correct will never increase by more than 2%, no matter what result is found in the dataset. Stated otherwise, it would be illogical to conclude, based on the results of any query on this dataset, which the individual is in fact “in,” that the individual’s rates should be increased. The insurance company may nonetheless increase the individual’s rates, but it would not be basing this decision on a reliable fact learned from the dataset.
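The same illustrative arithmetic can also be run in reverse to ask which epsilon enforces a given cap on the guess difference. The helper below is hypothetical (our own inversion of the earlier upper_bound() sketch), but it shows that capping the guess difference at 2% for a 50% blind guess corresponds to an epsilon of roughly .08, the value used in the crystal ball example above:

```python
def epsilon_for_guess_difference(prior, cap):
    """Largest epsilon that keeps the guess difference at or below `cap`
    for a given prior (obtained by inverting the upper_bound() sketch)."""
    posterior = prior + cap
    return math.log(posterior * (1 - prior) / ((1 - posterior) * prior))

print(round(epsilon_for_guess_difference(0.5, 0.02), 2))  # ~0.08
```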
Overall, the guess difference approach provides a singular but context-filled legal comparator. This quantity highlights differences in risk when epsilon is small, allowing a court to meaningfully interpret the .01 to .05 to 1.0 epsilon range; it incorporates the worst-case scenario for any user who is in a dataset, by working with the upper bound set by a particular epsilon value; and it accords with preexisting considerations of reidentification risk, as discussed further in Section IV.A below. Therefore, we conclude that, of the three options discussed above, the guess difference should be the quantity used to interpret the sanitization abilities of an ε-differentially private mechanism from a legal vantage. The following Section generalizes the guess difference as a proxy for a mechanism’s risk of reidentification.
C. Step One: Reidentification Risk vis-à-vis the Guess Difference
Taking these options together leads to the conclusion that the guess difference is the most appropriate legal comparator: the guess difference may be considered a proxy for the reidentification risk a mechanism carries. This option adequately accounts for the attacker’s best-case scenario but tempers that confidence with the a priori guessability of the query. In this way, the measurement neither oversells nor undersells the sanitization abilities of a mechanism. This metric will therefore form step one of our two-step test, permitting comparison between what differential privacy provides and what data-protecting regulation mandates.
1. Epsilon Visualized
With that in mind, we may visualize a range of popular epsilon values in terms of the guess difference each mechanism provides:
Figure 6. Guess Difference Visualized
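As a rough sketch of the computation underlying such a visualization (not a reproduction of Figure 6 itself, and with the particular epsilon values chosen by us for illustration), one could tabulate the guess difference at a 50% initial suspicion across a span of commonly cited epsilons:

```python
# Continuing the illustrative sketch: guess difference at a 50% initial
# suspicion for a handful of commonly cited epsilon values.
for eps in [0.01, 0.05, 0.1, 0.5, 1.0, 1.098, 2.0, 5.0]:
    print(f"epsilon = {eps:>5}: guess difference = {guess_difference(0.5, eps):.1%}")
```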