September 08, 2020

Building a Pragmatic Framework to Advance Data-Driven Healthcare Research and Innovation

By Alea Garbagnati, Esq. and Lauren Wu, Esq., Roche Molecular Solutions, Pleasanton, CA and Dorien Van Doninck, Legal Counsel, F. Hoffmann-La Roche Ltd, Basel, Switzerland


Effective healthcare innovation relies on identifying and utilizing data to generate tangible insights, leading to the development of personalized medicine, services, products, and testing (collectively, “products”). Nevertheless, there is a misconception that such data-driven innovation comes at a cost to the privacy rights of individuals. This notion frequently results in unnecessarily restrictive and inconsistent approaches to privacy and data protection laws, and is rooted in the general (and potentially well-earned)1 mistrust by the public of data use and collection under the guise of scientific research. Healthcare companies,2 which are heavily regulated under various data protection and regulatory frameworks, are frequently adversely impacted by legislative reaction to such data misuse, despite their potential to serve the public good through innovative healthcare products.

The past few years have brought a paradigm shift in privacy and data protection laws. Simultaneously, the drivers for data-driven healthcare research and innovation are more apparent than ever in the midst of the current COVID-19 pandemic.3 To meet public health needs, healthcare companies must find ethical, scalable and practical approaches to data use to advance innovative technologies and products. Without the building blocks for a clear, practical and well-established framework on which healthcare companies can rely when using data, well-intentioned privacy laws will unnecessarily stifle the development of and patient access to novel and personalized healthcare products.4 This article sets forth the elements to build a framework to appropriately navigate the complex web of oft-conflicting privacy and data protection laws and regulations, enabling healthcare companies to harness the mechanisms already built into relevant laws to drive patient-focused innovation and research while simultaneously acting as honorable data stewards in the furtherance of data protection principles. 

Challenges to Data-Driven Healthcare Research and Innovation

Rapidly changing technology, and the pervasiveness of data breaches and data misuse affecting both corporations and individuals, have driven a paradigm shift in privacy and data protection laws. In the wake of newly enacted laws, such as the General Data Protection Regulation (GDPR)5 and the California Consumer Privacy Act (CCPA),6 other similar laws have been proposed, amended or enacted around the world that often mimic or borrow heavily from these trendsetting laws,7 although these new laws are sometimes “restyled” by incorporating significantly different provisions, such as data localization.8 In countries where privacy laws are applied at both a federal (or national) and state (or local) level, each separate jurisdiction may develop its own legal construct, thereby further complicating the complex patchwork of laws applying to personal data. For instance, in 2019, 25 U.S. states and some U.S. territories introduced and/or passed over 150 different privacy laws.9 While ushering in important and necessary data protection improvements, such as increased accountability requirements, strengthened data subject rights and expanded enforcement authority, these new laws have also created (or, perhaps, reinforced) a complex minefield of requirements that introduces a number of challenges to the ability of healthcare companies to conduct data-driven research throughout the world and among diverse patient populations. For example, the definition of deidentification (or its equivalent),10 which frequently differs from one country to another (or from one state to another in the United States),11 makes it difficult to utilize one single deidentification concept.
This lack of harmonization impedes development of comprehensive and responsible data strategies, particularly for smaller to midsize companies.12 Beyond the laws themselves, interpretations by regulators, authorities, or - in the context of clinical trials - ethics committees (or institutional review boards) in implementing these laws result not just in individual derogations, but also in varying and often inconsistent applications that do not sufficiently uphold the intended balance between data protection and other fundamental rights, such as the rights to health and healthcare, which are directly dependent on innovation.13 These factors not only impact healthcare companies’ ability to conduct research to meet the public need for data-driven and novel healthcare products, but also their ability to provide suitable and timely treatment to individuals.14

An over-reliance on traditional data processing models may also no longer be suitable in light of the shifting legal requirements. As an example, consent-based data processing models have become increasingly problematic in the healthcare space. This stems in part from the perennial debate about how broad consents can be, particularly when data is used for secondary purposes. It can be even more problematic when consent as a basis for data processing gets confused with consent as the safeguard for human rights, as traditionally occurs in healthcare settings (e.g., informed consent). When used as a legal basis, consent must meet high standards that are not easily attainable for a number of reasons; among them the potential power imbalances between patient and healthcare provider (or study sponsor),15 or the resulting disqualification from receiving healthcare or participating in a study if the data subject withholds consent. Consent also implies a data subject’s ability to withdraw from the past, present and future use of the subject’s data, which may not be possible for the healthcare company to effect where it requires use of that same data to comply with its legal obligations.16 For example, a laboratory may be precluded from immediately honoring a data subject’s erasure request, as it may be required to maintain the personal data for a period of time to retain its certification and/or licensure as a laboratory or meet other legal and/or regulatory obligations.17 Thus, consent as a model under recent laws may be inherently in conflict with the record keeping requirements prevalent in healthcare, and particularly in research.18

Recent initiatives that rely on new types of data to establish safety and efficacy in healthcare products have added another challenge. Increased regulatory expectations for healthcare companies to collect and utilize real world data (RWD) to engage in postmarket surveillance activities, combined with ever-increasing openness to accepting RWD in submissions for product clearance and approval,19 are seemingly at odds with the principles of certain data protection regimes. It is well recognized that RWD can provide a more accurate picture of how different medical interventions and products may impact an individual and the public as compared to traditional surveillance methods.20 While regulators charged with reviewing and considering RWD in relation to such products increasingly rely on and require access to such data, there is no clear, universal path for healthcare companies to appropriately collect, retain and use such data for these purposes, nor to manage data subject requests that might limit processing. Of some benefit to resolving this conflict is the fact that “[u]nlike clinical trials ... the element that spins RWD into ‘gold’ is not per se the individual patient records, but the efforts taken to curate, aggregate, and analyze a large volume of data.”21 Use of RWD in healthcare research and for regulatory purposes requires a number of value-enhancing tasks (such as curating, and/or aggregating/deidentifying the data, cross-linking, and analyzing), some of which are the very same tasks necessary to ensure a data subject’s right to privacy is not at issue.22 Only in limited circumstances would RWD necessarily need to retain some of its identifiable elements, such as for safety or public health purposes. A robust, curated data set requires access to a significant amount of both structured and unstructured data to be viable.
Without a strong privacy framework to support the appropriate collection, use and processing of such data, no healthcare company can reach that point.

Building Blocks to “Level” the Need for Healthcare Innovation and Potential Privacy Risks

Healthcare innovation is sustained by and dependent on the continued availability of patient data to meet researchers’ needs. Indeed, “personal data has a profound and often understated impact on many aspects of healthcare.”23 As a corollary, scientific and healthcare research are and have been a critical component of advancing society, requiring a commensalism between individual rights and the interests of public health. There is a false dichotomy currently posed whereby healthcare companies must choose between the privacy rights of individuals and their ability to advance innovation. However, these two options need not be juxtaposed, as both can be accomplished in an ethical and responsible way within the framework of existing privacy laws. Healthcare companies already operate in a very heavily regulated environment, where key controls such as data minimization and data subject access are common practices and embedded into existing policies and procedures. Furthermore, patients are largely supportive of sharing their data for research purposes, including with healthcare companies.24

Reaching this crucial balance is not something that healthcare companies can do alone; it requires buy-in from and cooperation with other stakeholders, including regulators and patients. It also requires recognition of the role that healthcare companies take on as ethical data stewards with legitimate and necessary needs for data access that differentiate such companies and data uses from others.25 To ensure that the privacy and data protection rights of individuals do not come at the cost of their health and care, efforts should be made by regulators and legislators to harmonize new data protection requirements with existing privacy frameworks or to incorporate exemptions or safe harbors for data uses into any new laws where similarly sufficient protections are already in place.

The following sections will set forth the building blocks to form a framework that can be built upon existing privacy and data protection laws to enable ethical data use in the furtherance of innovation and public health.

Define Healthcare Research to Promote Ethical Data Use

Research is and will continue to be the cornerstone to innovation for healthcare companies. The increasing dependence on and importance of secondary data use raises the question of how much data processing should be permitted in the name of research.26 This can be even more challenging when healthcare companies must depart from more traditional areas of data use (e.g., clinical studies; postmarket surveillance) that are safeguarded by controls and industry standards to protect the rights, dignity and safety of human research participants and patients. Those same mitigating factors are often not present in the ever-increasing avenues of secondary research where institutional safeguards, such as ethics committees or institutional review boards, and standards, like the U.S. Federal Policy for the Protection of Human Subjects (i.e., the Common Rule), Good Clinical Practice (GCP) Guidelines or other human subject protection requirements, may not necessarily operate or apply. The underlying concern that the definition of “research” may be stretched too far has hindered implementation of existing laws and the development of new fit-for-purpose laws, simultaneously preventing widespread adoption of research-enabling provisions.27 These concerns ignore the fact that even for-profit or commercial companies performing research may be doing so in the interest of the public good, and not necessarily to the detriment of individual privacy rights.

One way to integrate more complex data use in healthcare research in balance with the individual right to privacy is to develop an industry-wide, global definition of “healthcare research” that is practical, but still provides reasonable limits to data use, including any secondary or further data use. Such a definition is currently lacking. For example, despite the provisions in GDPR that give deference to research,28 there is no definition of the term at the European Union (EU) level.29 The situation is comparable in the United States, where the definition of “research” under the Common Rule differs from definitions at the state level and even under FDA, and is limited in application.30 Employing a solid, harmonized definition of healthcare research, whether as an industry or in collaboration with regulators, can address some of the concerns of regulators and authorities in implementing research-related components of laws, like GDPR. A definition of healthcare research31 should encompass applied and fundamental research activities and support technological advancement, while incorporating ethical standards. Any attempt to define the term would necessarily require the input of relevant stakeholders, including healthcare companies, to ensure that the ultimate output is fair, actionable and practical, but also capable of being distinguished among the different types of research that may be classified under such a term. By employing a clear and practicable definition of healthcare research, and creating a companion method to remove identifiers from the personal data that are unnecessary to meet stated scientific purposes, healthcare companies may be able to utilize the research exemptions already existing in a number of privacy and data protection laws to their full potential.

Apply Practical but Strong Forms of Anonymization / Deidentification

When data has been rendered so that the individuals associated with the data can no longer be identified, that data is typically not in scope of applicable privacy laws. The specific methods of achieving this deidentification vary almost as frequently as the definitions for the terms describing these activities, i.e., “anonymization” / “deidentification.”32 There is a lack of consistency under applicable privacy laws in establishing the point at which the risk of re-identification has sufficiently dissipated. Adopting a clear, practical, and harmonized approach to deidentification supports data-driven research and healthcare innovation, while promoting the principle of data minimization, a concept already embedded in the privacy requirements applicable to healthcare companies. These techniques may also serve additional purposes; for instance, as a control mechanism to meet some of the challenges of cross-border data transfers, particularly those between the EU and United States, which were made more cumbersome and opaque in light of the recent decision of the Court of Justice of the European Union (CJEU) in “Schrems II”.33

Instead of setting forth a specific and prescriptive method, a practical approach to deidentification should look at re-identification risks relative to the entity (or entities) responsible for the data. An approach of this sort, rather than one that only considers deidentification with no possibility of re-identification, is supported by the CJEU’s Breyer decision,34 and permits a more principled and ethical approach to deidentification in general. The expert determination method under HIPAA is more akin to this risk-based approach than the safe harbor method,35 but it does not go far enough in that it inherently creates resource, process, and access challenges that do not necessarily support a sustainable path for all proposed data uses by healthcare companies. Instead, healthcare companies should be able to create and systematize a consistent and repeatable approach that can be performed in-house, with the appropriate data protections in place. A risk analysis should consider the existence of lawful means of re-identification, the implementation of technical and organizational measures (TOMs) to protect and safeguard the data (e.g., contracts and downstream limitations, and, where done properly, internal firewalls), and mitigating controls to sufficiently reduce the risk of re-identification. A risk-based approach to deidentification can be layered with specific requirements to meet local law; for example, the requirements for either method of deidentification under HIPAA. These mitigating factors, even when placed on top of strongly pseudonymized data (i.e., data in which indirect identifiers have been modified/encrypted/hashed), should render data sufficiently anonymous under GDPR and other similar standards, including HIPAA deidentification.
This framework is not static, however, and must be subject to frequent re-assessments in light of factors such as new technology, new data (including additional commingling of data), publication of said data, or other external factors - some of which may not be within the control of the healthcare company.
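To make the pseudonymization step described above concrete, the following is a minimal sketch - an illustration only, not a method prescribed by HIPAA, GDPR, or any regulator; the field names, key handling, and identifier lists are assumptions - of dropping direct identifiers and replacing indirect identifiers with keyed hashes:

```python
import hmac
import hashlib

# Hypothetical secret key; in practice it would be held under separate
# technical and organizational controls (TOMs), e.g., behind an internal firewall.
PSEUDONYMIZATION_KEY = b"replace-with-a-managed-secret"

# Hypothetical field classifications for this sketch.
DIRECT_IDENTIFIERS = {"name", "street_address", "email"}
INDIRECT_IDENTIFIERS = {"patient_id", "specimen_id"}

def pseudonymize(record: dict) -> dict:
    """Drop direct identifiers and replace indirect identifiers with keyed
    (HMAC-SHA256) hashes, preserving linkability across records without
    exposing the original values."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # remove outright
        if field in INDIRECT_IDENTIFIERS:
            digest = hmac.new(PSEUDONYMIZATION_KEY,
                              str(value).encode("utf-8"),
                              hashlib.sha256).hexdigest()
            out[field] = digest[:16]  # truncated token for readability
        else:
            out[field] = value  # retain the clinical/analytic payload
    return out

record = {"patient_id": "P-0042", "name": "Jane Doe",
          "email": "jane@example.com", "assay_result": 7.3}
print(pseudonymize(record))
```

Because the keyed hash is deterministic, records remain cross-linkable for curation and analysis without exposing the underlying identifiers; rotating or destroying the key, together with contractual and downstream limitations, further reduces re-identification risk.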

Adopt More Scalable Data Use Models Beyond Consent

Consent is not the magical unicorn of data processing and should not be used as the bandage to fix data processing problems.36 The misconception that consent is superior and should be preferred over the other legal bases set out in GDPR (and similar laws)37 has led to an over-reliance on consent for data processing. As discussed above, this is problematic when the technical, black letter requirements for consent are extremely difficult to meet. Additionally, broad consent models create challenges in secondary data use cases where the future purpose(s) of data processing are unknown at the time the individual provides consent, such as for biobanks or downstream/retroactive data analytics.38 The writing is on the wall that while consent may be the more established and preferred model, consent should be what organizations rely on when other legal bases are unavailable, not a default or “catch-all”.39 Thus, legislators should either adopt alternative options to support scalable, responsible data use models for healthcare research, such as increased adoption or acceptance of alternative bases for data processing (like healthcare research exemptions or models similar to the institutional review board waiver model), or revisit the requirements for the use of consent with the specific purpose of refitting consent for secondary data use, such as by accepting the use of broad consent or providing model language better suited to permit further use.

Utilize Existing Privacy Constructs that Provide a Sufficient Basis for a Risk-based Framework

In countries where some form of comprehensive data protection law is already in place, the discussion can turn from implementation of existing law to whether additional laws are needed to govern specific types of data or data uses. This reconstructive approach fatally ignores the idea that existing legal frameworks may already accommodate some - but not all - applications of even novel data types and uses. Instead, regulating authorities and stakeholders should jointly draft guidance for implementing existing mechanisms in specific settings (like healthcare), focusing such guidance on enabling these data use cases without creating barriers that impede efforts to put the guidance into effect. One example of such a collaboration is the European Medicines Agency (EMA) policy on publication of clinical data (Policy 0070).40 Drafted with healthcare industry input, the policy provides guidance for implementing a risk-based approach to anonymization and was found to be sufficient for providing “adequate privacy protection for patients.”41

Another alternative would be to expand existing laws or regulations (or implementing guidance) governing healthcare companies and/or healthcare-related data to accommodate the novel business types and ways that healthcare companies interact with data and data subjects. There are a number of healthcare-related research activities that do not squarely fit existing privacy or data protection laws, but where the healthcare companies engaging in these activities nevertheless abide by those requirements to meet internal or industry standards. An example in the United States is where a healthcare company does not meet HIPAA’s classic definition of a “covered entity” or “business associate,” nor does it handle “protected health information,” but nevertheless meets the various other elements of HIPAA in the conduct of its operations, including research. Even if these “HIPAA-adjacent” entities meet the law’s requirements, they are unable to directly avail themselves of some of the advantages of being in scope of HIPAA,42 and must instead rely on other privacy frameworks.43 Similarly, whereas the Common Rule applies only to federally funded research,44 many healthcare companies in the United States broadly follow Common Rule requirements without the added benefit of the flexibility provided under that framework, such as potential exemption from other applicable laws.45 Finding an appropriate way to formally include these additional and perhaps peripheral activities under the umbrella of existing law or regulation even when such laws and regulations do not directly apply (i.e., when the research activities and operations of the healthcare company meet the requirements in all but name) can also contribute to building this risk-based framework without adding to the already complex patchwork of laws.

The hindrance to adopting this approach comes down to the need for healthcare companies and researchers in general to earn the trust of regulators, consumers, and the general public over how and why they use personal data. Healthcare companies will also need to demonstrate that the data they use and share are not subject to the types of mass surveillance at issue in Schrems II. With deidentification, the concern - bolstered by a number of sometimes conflicting studies46 - is that the benefitting party or a downstream recipient could in the future attempt to re-identify the data. This concern, however, ignores the other mechanisms in place to prevent such behaviors and the obligations on companies to continue monitoring existing TOMs. Acknowledging and addressing this fundamental trust issue is the imperative foundation on which healthcare companies must build a suitable risk-based framework to support data use in healthcare research.

Trust as the Cornerstone to Building a Framework to Facilitate Healthcare Research and Innovation

Private industry has trust issues when it comes to data use, something that is reflected in major decisions by courts and regulators,47 and evident in the frequency of articles in the media describing data misuse. This apprehension is perhaps somewhat warranted. If society is expected to entrust healthcare companies with its sensitive health data, these companies must demonstrate that they can act responsibly with data.

Protect Data and Reduce Re-identification Risk with Strong Technical and Organizational Measures

The implementation of strong TOMs is key to ensuring the protection of personal data and, ultimately, to gaining and maintaining the trust of the public. While adequate technical and security controls are already required by most privacy laws, and help to mitigate risk,48 the failure to have adequate controls - or use of puffery to describe those controls - can seriously hinder a company’s ability to earn trust. Technical measures to safeguard data from accidental or unlawful destruction, loss, or unauthorized access by third parties and other privacy-enhancing techniques should be risk-based, monitored and updated based on changing technology, and can include deidentification practices, blockchain technology, pseudonymization and encryption.49 These measures can also reduce the risk of or prevent data re-identification.
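As one hedged illustration of a measure that reduces re-identification risk (a sketch under assumed thresholds and quasi-identifiers, not an industry standard or any company's actual control), a simple k-anonymity check can gate the release of a data set whenever any combination of quasi-identifier values describes too few individuals:

```python
from collections import Counter

def smallest_group(records, quasi_identifiers):
    """Size of the rarest combination of quasi-identifier values."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(combos.values())

def releasable(records, quasi_identifiers, k=5):
    """True only if every quasi-identifier combination is shared by at
    least k records (k-anonymity), reducing re-identification risk."""
    return smallest_group(records, quasi_identifiers) >= k

cohort = [
    {"age_band": "40-49", "zip3": "945", "result": "positive"},
    {"age_band": "40-49", "zip3": "945", "result": "negative"},
    {"age_band": "50-59", "zip3": "945", "result": "negative"},
]
# The lone 50-59 record is unique, so the set fails k=2 anonymity.
print(releasable(cohort, ["age_band", "zip3"], k=2))  # → False
```

In practice, the choice of quasi-identifiers and of k would itself be part of the risk assessment, revisited as new data, commingling, or technology changes the re-identification landscape.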

Internal data governance practices are also crucial for proper data management, and can enable responsible secondary data use and data sharing. The implementation of data governance boards, for example, can help facilitate such practices and even help reduce the risk that deidentified data is recombined to facilitate re-identification. These practices should be adopted or strengthened by healthcare companies to foster trust, because doing so puts companies in a position to set and enforce their own standards of data management, ultimately requiring companies to be more mindful of why, how, when, and what data they process.

Introduce Industry Level Accountability Efforts and Strengthen Internal Accountability

Many existing privacy laws and standards (such as the GCP Guidelines) associated with healthcare companies already impose accountability requirements that are enforceable against them. Such measures may include requirements to maintain records of processing, conduct privacy impact assessments, and handle personal data in accordance with established principles. This may not be enough, however, to gain buy-in from the general public and legislators for widespread data-driven healthcare research and product development.

Industry-level accountability measures, such as self-certification frameworks, offer a middle ground that limits overregulation while allowing healthcare companies to gain an additional layer of “trustworthiness.” Certification programs and codes of conduct are solutions that can address this longer term, so long as they are properly administered and have the backing of regulators.50 An enforceable self-regulatory framework may also be a viable longer-term solution and can perhaps assuage concerns that some have with so-called HIPAA-adjacent data processing in the absence of new or amended regulations.51 If done properly and in conjunction with the right stakeholders, these tools may also serve to mitigate some of the concerns raised by the CJEU in Schrems II, particularly given that many data processing activities conducted by healthcare companies are less likely to be the target of government surveillance. Industry-based frameworks, like the Payment Card Industry Data Security Standard (PCI-DSS) in the finance industry, have been generally successful.52 There are some general frameworks (e.g., ISO 27001) and industry-specific frameworks (e.g., HITRUST) that could cover some data processing activities by healthcare companies; however, these frameworks do not cover all of the data processing activities in scope for such companies and may not be scalable globally.53 It is not practical, however, for healthcare companies to wait until suitable measures of this nature exist; in their absence, healthcare companies may need to rely on the other measures described above.

Promote Public Awareness and Education as a Means to Bolster Transparency

Transparency is a key data protection principle codified universally in privacy laws. Transparency is also an important factor for building and attaining the trust of the public and regulators, as the absence of information about data use can lead to unwarranted speculation. For data-driven healthcare research, this translates to increasing awareness about how and why data will be used and data subjects’ rights regarding their data. Transparency, therefore, is a good tool for data subjects and regulators to hold companies accountable for data misuse, and thus a building block for any framework that enables healthcare research and innovation.

However, achieving transparency about data use in healthcare research is challenging, particularly with secondary data use cases where it may be difficult to provide adequate notice to data subjects. This is even more true with regard to RWD, where the data is not collected from the data subjects themselves. It may be more prudent to explore novel means of providing adequate information to data subjects, rather than relying on traditional methods. For example, GDPR provides an exception to the notice requirements when notice is impossible or would involve a disproportionate effort.54 When the processing of data is conducted for healthcare research purposes, researchers might be able to rely on this exception to enable secondary use or use of banked tissue samples, provided that appropriate safeguards are implemented. Promoting acceptable non-traditional notice offers healthcare companies a viable avenue to ensure that the public understands how researchers are using such data, as well as the benefits of such research and data use. This can be accomplished, for example, through public education and awareness campaigns, whether organized by a single healthcare company or industry organizations. The COVID-19 pandemic has created a storyline that could provide a more tangible message to help the general public, legislators, and regulators understand the importance of all sorts of data and data use cases in healthcare research, and not just the data traditionally associated with clinical trials. Guidance from regulators on how to properly disseminate information in this manner, as well as how to partner on such efforts, could also add clarity and ensure that information provided to the public is adequate and non-misleading, and is generally in the furtherance of public health.


Data-driven healthcare research and privacy should not be seen as polarizing forces, but rather as complementary and necessary components of healthcare innovation that can be bridged by responsible and ethical data use practices. Building a responsible framework that enables scalable and ethical data processing should not be a Herculean task, given that many of the tools needed to slay the proverbial Hydra can be found in existing laws, standards and practices. Other such tools can be forged from the long history healthcare companies have as trusted data stewards in many of their more traditional data processing activities. Given the criticality of continued innovation in the healthcare space, true collaboration between the impacted stakeholders is required to realize the synergies of a risk-based approach to data-driven healthcare research and innovation. Healthcare companies need to work both independently and as an industry to earn the trust of regulators and the public, while lawmakers and regulators need to support a framework that promotes consistency and scalability while allowing for an acceptable level of risk. At the end of the day, a suitable, risk-based framework can help meet the future needs of patients while protecting their current interests and rights with respect to their data.

  1. See e.g., Minssen, T., Rajam, N., & Bogers, M., Clinical Trial Data Transparency and GDPR Compliance: Implications for data sharing and open innovation, Sci. & Pub. Pol’y, scaa014 1, 6 (2020) (attributing increased scrutiny on privacy and data protection practices to the Cambridge Analytica and Facebook scandal).
  2. For this article, “healthcare companies” will collectively refer to diagnostic, pharmaceutical, biotech, and medical device companies.
  3. See Fink, S. & Baker, M., “It’s just Everywhere Already”: How Delays in Testing Set Back the U.S. Coronavirus Response, N.Y. Times, Mar. 10, 2020, (“But the Seattle Flu Study illustrates how existing regulations and red tape — sometimes designed to protect privacy and health — have impeded the rapid rollout of testing nationally...”).
  4. See e.g., Rabesandratana, T., European Data Law is Impeding Studies on Diabetes and Alzheimer’s, Researchers Warn, Sci. Magazine, Nov. 20, 2019.
  5. European Parliament and Council of European Union (2016) Regulation (EU) 2016/679 (hereinafter GDPR).
  6. Cal. Civ. Code § 1798.100 et seq., as amended, and its implementing regulations (hereinafter CCPA).
  7. “2019 Consumer Privacy Legislation,” National Conference of State Legislatures (NCSL) (Jan. 3, 2020) (last visited July 27, 2020) (hereinafter NCSL 2019) (listing the draft bills proposed in 2019).
  8. GDPR Goes Global: The Case of Brazil and India, Access Partnership (Sept. 19, 2018) (providing updates on two laws modeled after GDPR, including the draft law in India, which includes data localization provisions). Data localization requires personal data (or in some cases, aggregate data) to be stored within the country of origin, which can affect, for example, data transfers outside of the country and the ability to use cloud-based data storage.
  9. See NCSL 2019. The trend has carried through to 2020, with over 30 states and some U.S. territories considering such bills, though the COVID-19 pandemic has largely hindered - or in some cases refocused - those efforts. See “2020 Consumer Data Privacy Legislation,” NCSL (June 29, 2020) (last visited July 27, 2020).
  10. For this article, “deidentification” refers to deidentification and parallel terms (e.g., anonymization) unless discussing the specific standard.
  11. In fact, the United States cannot even decide on a single and consistent definition of “deidentification.” Compare Health Insurance Portability and Accountability Act of 1996, P.L. No. 104-191, 110 Stat. 1938 (1996) (hereinafter HIPAA) (establishing a definition, as well as two methods to satisfy it - (i) the “safe harbor” method, requiring removal of 18 identifiers, and (ii) “expert determination,” where an external expert assesses the statistical risk of re-identification), with Cal. Civ. Code § 1798.100 et seq. (establishing a new definition of the term and providing a three-pronged deidentification analysis that considers whether the company has reasonable means of re-identification). To add complexity, as recently as June 2020, the state of New York, in a proposed bill specific to contact tracing during the COVID-19 pandemic, offered yet another potential definition of the term. See N.Y. S. 8448 (N.Y. 2020) (adding additional qualifiers to the CCPA definition).
  12. Beckerman, M., Opinion, Americans Will Pay a Price for State Privacy Laws, N.Y. Times (Oct. 14, 2019).
  13. Garbagnati, A., Van Doninck, D., & Schroeder De Castro Lopes, B., Unlocking the Potential of Scientific Data-Driven Research, 2 Life Sci. Recht 58, 61-62 (2020).
  14. Determann, L., Healthy Data Protection, 26 Mich. Tech. Law Rev. 229, 235 (2020).
  15. European Data Protection Board (EDPB), Opinion 3/2019 concerning the Questions and Answers on the interplay between the Clinical Trials Regulation and GDPR, Jan. 23, 2019, 1, 6.
  16. Id. at 6-7. 
  17. See e.g., Clinical Laboratory Improvement Amendments (CLIA), 42 U.S.C. § 263a (regulating laboratory testing in the United States and setting forth specific records retention requirements); 45 C.F.R. § 164.316 (2003) (imposing a six-year retention requirement on certain records and data sets). In California, depending on the type of medical record, provider, and/or patient, different retention periods apply, ranging from three years for laboratory medical records (Cal. Bus. & Prof. Code § 1265(j)(2)) to 10 years for Medi-Cal patients (Cal. Welf. & Inst. Code § 14124.1).
  18. See generally, Roberts, K. & Kohoutek, E., A Need to Know Basis, PharmaTimes Magazine (Mar. 2019).
  19. See generally, U.S. Department of Health and Human Services, Food and Drug Administration (FDA), Center for Drug Evaluation and Research (CDER), Center for Biologics Evaluation and Research (CBER), Draft Guidance, Guidance for Industry: Submitting Documents Using Real-World Data and Real-World Evidence to FDA for Drugs and Biologics, May 2019 (defining RWD as “data relating to patient health status and/or the delivery of health care that are routinely collected from a variety of sources”; e.g., from electronic health records, claims and billing activities, and product and disease registries). See also International Pharmaceutical and Medical Device Privacy Consortium (IPMPC), White Paper: The Role of Personal Data in Healthcare, 2 (Apr. 2019) (hereinafter IPMPC White Paper).
  20. Katkade, V.B., Sanders, K.N., & Zou, K.H., Real World Data: An Opportunity to Supplement Existing Evidence for the Use of Long-established Medicines in Health Care Decision Making, 11 J Multidiscip Healthc., 295, 295 (2018). 
  21. Garbagnati et al. supra n. 13 at 60-61.
  22. Id. (citing Miksad, R. A. & Abernethy, A. P., Harnessing the Power of Real-World Evidence (RWE): A Checklist to Ensure Regulatory-Grade Data Quality, 103 Clin Pharmacol Ther. 202, 204 (2018)).
  23. IPMPC White Paper supra n. 19 at 1.
  24. Yaraghi, N., Who Should Profit from the Sale of Patient Data?, Brookings: Techtank (2018).
  25. See e.g., IPMPC, Personal Data’s Role in Advancing Health Care Innovation (July 22, 2020) (illustrating the symbiosis between use of and access to data and healthcare innovation, which can be maintained by leveraging the data protections already in place in healthcare and healthcare research).
  26. European Data Protection Supervisor (EDPS), Preliminary Opinion on data protection and scientific research, 7 (Jan. 6, 2020).
  27. See e.g., id. (expressing concern for adoption of GDPR’s research provisions by commercial organizations).
  28. In the European Union (EU), the focus should be on defining “scientific research” in line with the derogation in Article 9(2)(j) GDPR.
  29. GDPR does not define the term “research,” but instead states that it should be defined broadly. See GDPR Recital 159. A handful of – but not all – EU Member States define research in their derogations, but there is currently no EU-level definition of the term. See e.g., Irish Health Research Regulations, S.I. No. 314 of 2018.
  30. Compare 45 C.F.R. § 164.501 (defining research under HIPAA; adopting the Common Rule standard, 45 C.F.R. § 46.102, which applies only to federally funded research), with Cal. Civ. Code § 1798.140(s) (defining research under CCPA); see also FDA, Comparison of FDA and HHS Human Subject Protection Regulations (last updated Mar. 13, 2018).
  31. Garbagnati et al., supra n. 13, at 66-67 (proposing a definition for GDPR purposes that can be expanded to a more global definition of healthcare research).
  32. See supra n. 11.
  33. Case C-311/18, Facebook Ireland Ltd v. Maximillian Schrems, 2020 E.C.J. (hereinafter Schrems II) (limiting the use of Standard Contractual Clauses as a mechanism to permit data transfers out of the EU/EEA to cases where an assessment of the transfer and the suitability of controls has been conducted and, perhaps more notably, invalidating the Privacy Shield as a mechanism for facilitating transfers from the EU to the United States).
  34. Case C-582/14, Patrick Breyer v. Bundesrepublik Deutschland, 2016 E.C.J. (qualifying that the reasonable means of re-identification in an anonymization analysis must be legal in addition to reasonable; e.g., a company does not need to anticipate the possibility of a cyberattack when undergoing the analysis). It is worth noting that, by prohibiting re-identification in its data protection law, the United Kingdom also uses a similar risk-based approach. See Data Protection Act 2018 c.12, § 171 (Eng.).
  35. 45 C.F.R. § 164.514(b). See supra n. 11 for a description of the two methods of de-identification under HIPAA.
  36. See also DeVries, L., Opinion, Just Say Yes: GDPR Consent is not as Simple as it Seems, International Association of Privacy Professionals (IAPP) (Oct. 1, 2019) (“Consent is not by any means a magic wand that can be used to process personal data.”).
  37. Under GDPR, data must be processed according to a legal basis (Article 6), and/or a corresponding derogation for sensitive data (Article 9). While both articles list consent as an appropriate legal basis, both also list a number of other equally viable legal bases, such as when processing is required by law (GDPR Article 6(1)(c)) or where the processing is necessary for scientific research (GDPR Article 9(2)(j)). Newer GDPR-like laws, such as Brazil’s data protection law, use a similar approach. See Monteiro, R.L., The New Brazilian General Data Protection Law – a detailed analysis, IAPP (Aug. 14, 2018).
  38. Peloquin, D., DiMaio, M., Bierer, B. & Barnes, M., Disruptive and Avoidable: GDPR Challenges to Secondary Research Uses of Data, 28 Eur J Hum Genet 697, 700 (2020). See also Roberts & Kohoutek, supra n. 18. In fact, some regulators even discourage broad consent. See e.g., Article 29 Working Party, Guidelines on Consent under Regulation 2016/679, Nov. 28, 2017 (revised Apr. 10, 2018), 28. The Article 29 Working Party was an advisory body made up of a representative from the supervisory authority of each EU Member State, the EDPS and the European Commission. It ceased to exist with the implementation of GDPR and was replaced by the EDPB.
  39. See DeVries, supra n. 36. See also discussion supra (addressing the challenges with consent and highlighting the need for a new approach).
  40. EMA, External guidance on the implementation of the European Medicines Agency policy on the publication of clinical data for medicinal products for human use v1.4, Nov. 9, 2018.
  41. Branson, J., Good, N., Chen, J., Monge, W., Probst, C., & El Emam, K., Evaluating the Re-identification Risk of a Clinical Study Report Anonymized Under EMA Policy 0070 and Health Canada Regulations, Trials, 21: 200 (2020).
  42. See e.g., Cal. Civ. Code § 1798.145(c) (providing an exemption from CCPA for data and entities governed by HIPAA).
  43. For example, many (if not most) data processing activities of medical device manufacturers and pharmaceutical companies are not in scope of HIPAA, as such entities are typically not “covered entities” and may only occasionally act as “business associates”. More often, the data collected by these companies come in the context of research and clinical trials (where such data is typically deidentified, or at least pseudonymized), complaints (such as adverse event reporting), device registration, patient support programs, and registries, all of which processing activities likely rely on application of other state, federal, or sectoral privacy law or regulation, and are not necessarily governed by HIPAA.
  44. Federal Policy for the Protection of Human Subjects, 82 Fed. Reg. 7149, 7155 (Jan. 19, 2017).
  45. Cal. Civ. Code § 1798.145 (exempting from application of CCPA data from research subject to the Common Rule, or pursuant to GCP Guidelines or FDA human subject protection requirements).
  46. Compare Rocher, L., Hendrickx, J.M. & de Montjoye, Y., Estimating the Success of Re-identifications in Incomplete Datasets Using Generative Models, 10 Nature Commc’ns 3069 (2019), with Branson supra n. 41.
  47. See e.g., Schrems II.
  48. Determann supra n. 14 at 270.
  49. Id.
  50. Belfort, R., Bernstein, W.S., Dwokowitz, A., Pawlak, B. & Yi, P., A Shared Responsibility: Protecting Consumer Health Data Privacy in an Increasingly Connected World, Manatt Health (June 2020), 15.
  51. Id. at 17.
  52. Id. at 22.
  53. For example, these would not qualify as codes of conduct under GDPR Article 40, which are offered under the law (subject to approval by relevant supervisory authorities) to support and/or enhance industry or sector-specific compliance with the law, particularly with regards to industry or sector-specific issues that are not addressed in the law.
  54. GDPR Article 14(5)(b).
The material in all ABA publications is copyrighted and may be reprinted by permission only.

About the Authors

Alea Garbagnati, Esq., CIPP/US, is an attorney at Roche Molecular Solutions (RMS), located in Pleasanton, Calif. She joined Roche in 2015, and spent a year in Roche’s Global Privacy Office in Basel, Switzerland and was the U.S. Privacy Officer for RMS for two years. Prior to Roche, Ms. Garbagnati was a privacy consultant at Deloitte & Touche. She is an alumna of the University of California, Irvine, and Hastings College of the Law. When she’s not acting as a privacy Jedi master, Ms. Garbagnati loves to travel, watch musicals and is an avid soccer fan. Many thanks go to her supportive family, including her husband and young son (who will hopefully include "pseudonymous" among his first words).  She may be reached at [email protected].

Lauren Wu, Esq., CIPP/US, is Senior Counsel at Roche Molecular Solutions (RMS), located in Pleasanton, Calif. She joined Roche in 2016, and was responsible for regulatory and healthcare fraud and abuse law, as well as reimbursement and privacy. In 2017, she transitioned her role to become lead of the RMS Data Protection & Privacy team and, thereafter, U.S. Privacy Officer. Prior to joining Roche, Ms. Wu worked as Senior Corporate Counsel, U.S. Privacy Officer, and Interim U.S. Compliance Officer at Genomic Health, Inc. (now Exact Sciences), and was an associate at Ropes & Gray LLP and a legal/legislative assistant at Sidley Austin LLP. She is an alumna of the University of Southern California and Northwestern University School of Law. A former ballerina and football recruiter, Ms. Wu enjoys cooking, wine, gardening, and spending time with her husband and two beautiful daughters (one of whom could say "GDPR" before the age of 1). She may be reached at [email protected].

Dorien Van Doninck is a Belgian qualified lawyer with more than 10 years of experience, both in private practice and as in-house legal counsel. Currently she is the Global Privacy Counsel in the group legal department of F. Hoffmann-La Roche Ltd, located at the headquarters in Basel, Switzerland. Prior to Roche, Ms. Van Doninck was Legal Counsel for Euroclear, a Belgium-based financial services company. She started her career at a Belgian law firm, Buyle legal, after graduating from the LLM program of the Université Panthéon-Assas in Paris. Besides the law, Ms. Van Doninck is also passionate about Pilates, skiing and traveling. She may be reached at [email protected].