August 28, 2023

Privacy Challenges at the Intersection of Interoperability and Big Data

By Adam Greene and Iliana Peters

As the healthcare sector has transitioned from paper records to electronic health records, the federal government is now pushing for improved interoperability and data flows through a variety of mechanisms, such as requirements for application programming interfaces (APIs) and prohibitions on “information blocking.” These efforts may lead to unprecedented opportunities to collect and analyze large volumes of healthcare data. Such “big data” potentially can revolutionize the healthcare sector. Yet the bigger the data, the higher the risks when it comes to privacy. The following article will discuss some of the biggest interoperability initiatives, identify the corresponding privacy challenges, and analyze the potential use of the data for research and commercialization.

New Interoperability Requirements

The 21st Century Cures Act Information Blocking Rule

One of the federal government’s biggest tools for unlocking health data is the Information Blocking Rule that the Office of the National Coordinator for Health Information Technology (ONC) has promulgated pursuant to the 21st Century Cures Act. The Information Blocking Rule governs three types of actors: (1) healthcare providers, (2) health IT developers of certified electronic health record technology (health IT developers), and (3) health information networks or health information exchanges (HIN/HIEs). The rule defines information blocking as a practice that is likely to interfere with access, exchange, or use of electronic health information (EHI), unless the practice is required by law or falls under a regulatory exception. For a health IT developer or an HIN/HIE, a practice is only information blocking if the actor knows, or should have known, that the practice is likely to interfere with, prevent, or materially discourage access, exchange, or use of EHI. For a healthcare provider, a practice is only information blocking if the healthcare provider knows that the practice is likely to interfere with, prevent, or materially discourage access, exchange, or use of EHI and that the practice is unreasonable. The rule defines EHI as electronic protected health information to the extent that it would be included in a designated record set (as those terms are defined under HIPAA), except that the “designated record set” extends to records maintained on behalf of actors regardless of whether they are HIPAA covered entities.

The impact of the Information Blocking Rule is that actors generally lose discretion as to whether to provide EHI in response to a request (although a request is not necessarily required for a practice to constitute information blocking). If privacy law permits the disclosure of EHI and an exception does not apply, then an actor generally must provide the requested EHI.

This has the potential to lead to substantially increased access to EHI in areas such as research, potentially resulting in the availability of more robust data sets.

While the applicability date for the Information Blocking Rule was April 5, 2021, there is still no enforcement of the rule at this time. The U.S. Department of Health and Human Services (HHS) Office of Inspector General (OIG) issued a proposed rule on April 24, 2020, governing enforcement of the Information Blocking Rule with respect to health IT developers and HIN/HIEs, and the OIG is expected to finalize this rule in the coming months. HHS has not yet promulgated a proposed rule regarding enforcement of the Information Blocking Rule with respect to healthcare providers, but such a proposed rule is expected later this year.


Application Programming Interfaces (APIs)

An application programming interface, or API, is software that allows different software applications to communicate with each other, particularly to share information. In other words, an API is a “middle man” that communicates requests and responses from one software application to another. Most APIs are used to facilitate transfers of specific information between two different software applications. The Information Blocking Rule requires the use of secure APIs, pursuant to the HL7® Fast Healthcare Interoperability Resources (FHIR®) standard, to facilitate the exchange of information for compliance with other Information Blocking Rule requirements.

A simple way to understand an API is to think of it as a server in a restaurant. The server goes to a table of restaurant patrons and takes their orders: two hamburgers, two orders of fries, one vanilla shake, and one chocolate shake. The server then walks back to the kitchen and delivers the order to the chef, who puts the order together and gives it to the server. The server then takes the order back out to the patrons and delivers it. Again, all that the server delivers is two hamburgers, two orders of fries, one vanilla shake, and one chocolate shake; the server does not bring anything else from the chef or the kitchen, such as hot dogs, pizza, or a banana split. In the same way, an API acts as the go-between for two software applications, providing the request for certain information from the first application to the second application, and providing that specific information, and no more, back to the first application from the second application. For example, if a patient has a personal health record (PHR) application that can interact using an API with the patient’s healthcare provider’s electronic health record (EHR), then the PHR can ask for a lab test result from the EHR, using the API, and the EHR can deliver the result back to the PHR, using the API.
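
The go-between concept above can be sketched in a few lines of Python. This is a simplified illustration, not an actual FHIR implementation; all field names and the permitted-fields list are hypothetical. The point it demonstrates is that the API returns the specific information requested and permitted, and no more:

```python
# Hypothetical EHR record held by the "second application."
EHR_RECORD = {
    "patient_id": "12345",
    "lab_result_hba1c": "5.6%",
    "lab_result_ldl": "110 mg/dL",
    "ssn": "000-00-0000",  # sensitive; not exposed through this API
}

# The fields this hypothetical API is permitted to expose to callers.
ALLOWED_FIELDS = {"lab_result_hba1c", "lab_result_ldl"}

def api_get(requested_fields):
    """Return only the requested fields that the API permits, nothing else."""
    permitted = [f for f in requested_fields if f in ALLOWED_FIELDS]
    return {f: EHR_RECORD[f] for f in permitted}

# A PHR asking for a lab result (and, improperly, the SSN) receives
# only the lab result; the non-permitted field is filtered out.
print(api_get(["lab_result_hba1c", "ssn"]))  # {'lab_result_hba1c': '5.6%'}
```

Like the restaurant server, `api_get` delivers exactly what was ordered from the menu of permitted items; requests for anything else come back empty.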

Of course, given the important role the API plays in communicating between applications and sharing information, sometimes very sensitive information, any API must be developed and implemented with robust data privacy and security controls in mind, tested rigorously before implementation, and revised and updated as necessary to ensure that it remains secure. Common security issues with APIs include insufficient logging capabilities, broken authentication, provision of excess information, lack of resource and rate limiting (which can lead to denial of service [DoS] attacks), vulnerability to injection attacks, and other security misconfigurations or similar issues. That said, APIs can promote collaboration and information sharing for important purposes, including patient involvement, research, business operations improvements, opportunities for collaboration within the healthcare sector, and secure and efficient exchange of information more generally.
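
Rate limiting, one of the controls mentioned above, is commonly implemented with a token-bucket scheme: each request consumes a token, and tokens refill at a fixed rate, so bursts beyond the bucket's capacity are rejected rather than overwhelming the API. The following is a minimal sketch of the idea (the class and parameters are illustrative, not drawn from any particular API framework):

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: at most `capacity` requests at once,
    with tokens replenished at `refill_per_sec` tokens per second."""

    def __init__(self, capacity, refill_per_sec):
        self.capacity = capacity
        self.tokens = capacity
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self):
        """Return True if a request may proceed, consuming one token."""
        now = time.monotonic()
        # Replenish tokens based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# With no refill, only the first `capacity` requests in a burst succeed.
bucket = TokenBucket(capacity=3, refill_per_sec=0.0)
results = [bucket.allow() for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

Rejected requests would typically receive an HTTP 429 ("Too Many Requests") response, blunting the DoS risk the article describes.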

The Trusted Exchange Framework and Common Agreement (TEFCA)

A third tool in the federal government’s push for greater interoperability is the TEFCA, a framework better enabling nationwide health information exchange. The 21st Century Cures Act directs ONC to “develop or support a trusted exchange framework, including a common agreement among health information networks nationally.” The Trusted Exchange Framework (TEF) describes a common set of non-binding, foundational principles for trust policies and practices that can help facilitate exchange among health information networks. These principles are: (1) standardization; (2) openness and transparency; (3) cooperation and non-discrimination; (4) privacy, security, and safety; (5) access; (6) equity; and (7) public health. The TEF describes each principle in greater detail.

The Common Agreement for Nationwide Health Information Interoperability (Common Agreement) provides a template agreement between a “Recognized Coordinating Entity” (RCE) and participating “qualified health information networks” (QHINs). The RCE is a non-profit entity (the Sequoia Project) that ONC selected for a four-year grant award to develop, update, implement, and maintain the Common Agreement and work with ONC to designate and monitor QHINs. The Common Agreement includes terms to facilitate exchange between QHINs, including terms that QHINs in turn must flow down to their participants.

The ultimate goal of the TEFCA is that all HIEs will become interconnected and participation in any one HIE will essentially serve as an onramp to connecting to all other HIEs and their participants. On February 13, 2023, ONC announced the first six HIEs that ONC and the RCE have approved to implement TEFCA as prospective QHINs. Once fully onboarded, these HIEs will officially become QHINs. They have committed to a 12-month go-live timeframe. Accordingly, health information exchange between QHINs may begin in 2024, hastening a new level of nationwide access to health information. It may be some time, however, until TEFCA fully supports population-level data exchange.

Privacy Challenges of Data Sharing

While new interoperability tools create new opportunities to aggregate and analyze data, significant privacy challenges remain. At the federal level, there are primarily four privacy laws governing health information: (1) the Standards for Privacy of Individually Identifiable Health Information (Privacy Rule), promulgated pursuant to the Health Insurance Portability and Accountability Act of 1996 (HIPAA); (2) the Confidentiality of Substance Use Disorder Patient Records at 42 C.F.R. Part 2 (the “Part 2 Rule”); (3) Section 5 of the Federal Trade Commission (FTC) Act, prohibiting unfair and deceptive trade practices; and (4) the FTC’s Health Breach Notification Rule.

Privacy Rule

The Privacy Rule readily permits use and disclosure of protected health information (PHI) for treatment, payment, and healthcare operations. The Privacy Rule generally requires individuals’ authorizations, however, for HIPAA covered entities or business associates to use or disclose PHI for research. The primary exception to this authorization requirement is that an institutional review board (IRB) or privacy board that meets certain criteria may waive or alter the Privacy Rule’s authorization requirement in certain circumstances. Relying on an IRB’s waiver of authorization is one of the most frequent means of collecting large volumes of PHI for research. Some of the challenges associated with this permission, however, are questions as to what constitutes “research” (for example, whether commercial, non-published research is considered “research” under the Privacy Rule) and potential politics surrounding IRBs (for example, a covered entity may only be willing to disclose PHI pursuant to its own IRB’s waiver).

Another basis to aggregate health data under the Privacy Rule is through de-identification. The Privacy Rule permits a covered entity to de-identify PHI (or authorize its business associate to do so) as part of its permitted healthcare operations, whether or not the de-identified information is to be used by the covered entity. There are two methods for de-identification. The first is the “Safe Harbor Method,” in which 18 categories of direct and quasi-identifiers are removed and the covered entity does not have actual knowledge that the remaining information could be used alone or in combination with other information to identify an individual who is a subject of the information. The second is the “Expert Determination Method,” in which a statistical expert determines and documents that “the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information.” De-identification is often a means for creating large data sets for research and analytics. The primary challenges with de-identification are:

  1. The use of the Safe Harbor method may substantially deteriorate the value of the data because of loss of dates more specific than year, loss of most geographic data about individuals more specific than state, and loss of unique identifiers that allow for data linkages across data sets.
  2. Engaging a statistical expert under the Expert Determination can be expensive and time consuming, and the expert’s determination may last for a limited duration.
  3. De-identification of unstructured data is often difficult, with identifiers potentially slipping through.
  4. Business associates often have access to large quantities of health data spanning multiple covered entities, but such covered entities may not grant the business associates de-identification rights.
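
The data-degradation problem in item 1 is easy to see in code. The sketch below illustrates two Safe Harbor-style transformations on a hypothetical record: dropping direct identifiers entirely and truncating dates to the year. It is only an illustration; actual Safe Harbor de-identification must address all 18 identifier categories, and the field names here are invented:

```python
# Hypothetical set of direct-identifier fields to strip outright.
DIRECT_IDENTIFIERS = {"name", "ssn", "email", "phone", "mrn"}

def scrub(record):
    """Apply two illustrative Safe Harbor-style transformations:
    drop direct identifiers; generalize dates to year only."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # remove direct identifiers entirely
        if field.endswith("_date"):
            # Safe Harbor permits no date element more specific than the year.
            out[field] = value[:4]
        else:
            out[field] = value
    return out

record = {"name": "Jane Doe", "ssn": "000-00-0000",
          "admit_date": "2023-03-14", "diagnosis": "E11.9"}
print(scrub(record))  # {'admit_date': '2023', 'diagnosis': 'E11.9'}
```

Note what is lost: the exact admission date is gone, and with the medical record number removed there is no key left for linking this record to the same patient in another data set, which is precisely why Safe Harbor output is often less valuable for analytics.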

Another option under the Privacy Rule for creating large data sets is through the creation of a limited data set that may be used or disclosed pursuant to a data use agreement. Some benefits of limited data sets are that they permit the use of PHI for research without the need for an authorization or an IRB or privacy board’s waiver of authorization and, unlike de-identified data, they can include dates, more specific geographic information such as zip codes of individuals, and unique identifiers that may allow for data linkages. Some of the challenges of limited data sets, however, are that they are limited to healthcare operations, public health, and research (with questions surrounding the scope of what constitutes “research”); they are subject to additional Privacy Rule restrictions such as the prohibition on selling PHI; and covered entities rarely give business associates the right to create, use, and disclose limited data sets.

The Privacy Rule also includes a few general limitations that often hinder disclosures that contribute to compilation of health data. One is the “minimum necessary” standard, under which a covered entity or business associate must make reasonable efforts to limit the amount of PHI disclosed to the minimum necessary to accomplish the intended purpose of the disclosure. Parties involved in creating data sets may disagree on what constitutes the minimum necessary PHI, and the covered entity may need to expend resources to redact some PHI that is unnecessary. Another limitation is the Privacy Rule’s prohibition on the sale of PHI. Even if a disclosure is otherwise permissible under the Privacy Rule, a covered entity or business associate generally cannot receive remuneration (including non-financial remuneration like services or intellectual property rights) beyond the cost of preparing and transmitting the protected health information.

Finally, another challenge with the Privacy Rule is that it often serves as a convenient excuse to not share data. As articulated above, the Privacy Rule includes various permissions that covered entities potentially can utilize to disclose PHI, contributing to big data analytics. But doing so involves navigating various Privacy Rule criteria, taking on the legal risk if a regulator disagrees with the covered entity’s interpretation of the Privacy Rule, and potential reputational harm associated with disclosing PHI for purposes unrelated to treatment. As a result, covered entities often default to stating that the Privacy Rule does not permit a disclosure of PHI, rather than going through the analysis and risk of diving deeper and navigating the Privacy Rule.

The Part 2 Rule

While the Privacy Rule is generally treated as the primary federal law governing health information, 42 C.F.R. Part 2 (the Part 2 Rule) has actually been around for decades longer. The Part 2 Rule governs certain federally assisted “Part 2 programs” that provide substance use disorder (SUD) services. A “program” is: (1) an individual or entity, other than a general medical facility, that holds itself out as providing and provides SUD services; (2) an identified unit within a general medical facility that holds itself out as providing and provides SUD services; or (3) medical personnel or other staff in a general medical facility whose primary function is the provision of SUD services and who are identified as such providers. What constitutes “federal assistance” is broadly defined to include tax-exempt status, reimbursement from federal health insurance programs, or federal licensure, certification, or registration. The Part 2 Rule governs individually identifiable records that originate from a Part 2 program and identify a patient as having or having had a SUD.

The Part 2 Rule includes more stringent requirements on SUD records than HIPAA, potentially curtailing the disclosure of such records for data analytics. The most relevant permissions under the Part 2 Rule are: (1) the Part 2 program or a “lawful holder” of Part 2 records may de-identify the information in the same manner as under HIPAA; or (2) the Part 2 program or lawful holder may disclose SUD records for research without patient consent if: (i) the recipient is subject to HIPAA, the Common Rule governing protection of human subjects (45 C.F.R. part 46), or Food and Drug Administration regulations regarding human subjects (21 C.F.R. parts 50 and 56); or (ii) the disclosing Part 2 program or lawful holder of the records is subject to and complies with the Privacy Rule with respect to the disclosure. Absent those options, a Part 2 program or lawful holder of Part 2 records may need to exclude such SUD records from any disclosure of health data. Accordingly, conducting big data analytics involving SUDs can be particularly challenging.

Section 5 of the FTC Act

Section 5 of the FTC Act prohibits unfair and deceptive trade practices. The FTC interprets this as prohibiting practices that violate the privacy of individuals’ personal information. The FTC has brought enforcement actions against healthcare entities for violating their online privacy policies or other privacy statements, treating such practices as deceptive. The FTC also has brought enforcement actions where it alleged a privacy practice to be “unfair” because the act or practice causes or is likely to cause substantial injury to consumers that consumers cannot readily avoid themselves and that is not outweighed by countervailing benefits to consumers or competition. The FTC does not have jurisdiction over genuine non-profit entities under Section 5.

Historically, the FTC’s Section 5 enforcement actions in the healthcare space have involved issues such as improper disposal of PHI, allegations of lax security, or alleged disclosure of personal information collected from healthcare websites to third-party advertising platforms. The FTC has not brought any enforcement actions related to large disclosures of health data for big data analytics. Nevertheless, for-profit entities involved in disclosing health data should be very cognizant of the FTC’s authority and ensure that their disclosures are fully consistent with their privacy policies and do not cause any unfair harm to consumers.

The FTC’s Health Breach Notification Rule

Finally, the FTC has a breach notification rule governing personal health records. The FTC refers to this rule as the Health Breach Notification Rule (HBNR). The HBNR was promulgated in 2009. In September 2021, the FTC issued a policy statement clarifying the scope of the HBNR in two particular ways. First, the FTC clarified the scope of “personal health records,” broadly interpreting the definition to encompass nearly any health and wellness application. Second, the FTC clarified that a “breach” for purposes of the HBNR is not limited to external cyber attacks but also includes any use or disclosure of personal health records without users’ authorizations. After this policy statement and over a decade without enforcement of the HBNR, the FTC brought two enforcement actions under the HBNR in 2023.

As the federal government has pushed for increased interoperability and greater access to health data through APIs, there is significant potential to aggregate health data through collection and disclosure by consumers and consumer applications. Pursuant to the FTC’s broad interpretation of the HBNR, however, consumer application developers must ensure that they are transparent about their disclosure practices and have users’ authorizations for disclosures of their health data. Otherwise, the FTC will treat their disclosures as reportable breaches and may impose penalties for failures to notify users of such breaches.

State Privacy Laws

In addition to federal laws governing privacy of health data, each state has its own unique set of privacy laws that potentially limit the disclosure and aggregation of health data. These laws generally fall into five categories.

First, a small but growing number of states have enacted comprehensive data privacy laws. The lead example of this type of law is the California Consumer Privacy Act (CCPA). Since California enacted CCPA, other states, such as Colorado, Connecticut, Indiana, Iowa, Utah, and Virginia, have similarly passed comprehensive privacy laws. These state laws also include exemptions for certain types of health information, such as PHI governed by HIPAA. But the details of these exemptions can vary. For example, the Utah Consumer Privacy Act exempts PHI generally, while CCPA only exempts PHI that is collected by a covered entity or business associate. These laws can impact the disclosure of health data, especially when such health data is outside of the governance of HIPAA. Additionally, CCPA is unique in requiring certain contractual provisions for the sale or licensing of de-identified health data that originated from PHI.

Second, many states have medical privacy laws governing the disclosure (and sometimes use) of health information. An example of this is the California Confidentiality of Medical Information Act. These laws may be more stringent than HIPAA with respect to limiting the extent that healthcare providers may disclose health data for purposes of data analytics in areas such as research. Some of the state laws may extend beyond healthcare providers, potentially governing recipients of health data.

Third, almost every state has laws governing certain sensitive conditions or treatments, such as, for example, HIV status, genetic information, and alcohol and drug abuse information. These laws tend to require authorizations and, unlike HIPAA and the more general state medical privacy laws, usually do not include many exceptions to patient authorization requirements.

Fourth, every state has a breach notification law. These laws do not address when health data may be disclosed, but may require breach notifications to affected individuals and regulators if an entity finds that health data was improperly disclosed.

Finally, Washington state recently passed a rather unique law, the My Health My Data Act. This law provides numerous privacy rights to consumers with respect to “consumer health data,” which is broadly defined. PHI is exempt. Generally, a consumer’s consent is required to collect or disclose consumer health data for purposes that are not necessary to provide a product or service that the consumer has requested.

This complex patchwork of state privacy laws constitutes a substantial challenge to amassing large quantities of health data for research and analytics. Entities must review each state’s laws and may need to exclude certain categories of identifiable health data for which authorizations are required.

Big Data Projects

The vast majority of entities in the healthcare sector that own robust data sets (data owners) now face, or will soon face, questions and projects related to the research and commercialization of health data, much of it identifiable to specific individuals, which may include consumers of all types, such as patients, their family members, beneficiaries, research subjects, employees, and others. Many of these questions and projects relate to important human subjects research, healthcare sector patient care and utilization improvements, unique research and development (R&D), workforce efficiencies, population health, and many other critical healthcare issues. However, all of these questions and projects create significant legal risk for the data owners involved, and may also create liability for their vendors.

Data Privacy and Security Violations and Enforcement

The HHS Office for Civil Rights, FTC, and State Attorneys General all have (and have exercised) authority to enforce privacy rules and promises made to individual consumers (data subjects), including as discussed above. These regulators settle cases on a regular basis related to breaches involving identifiable consumer information and allegations of violations of data privacy and security requirements at the state and federal levels related to advertising, marketing, sale, and other improper or prohibited use and disclosure of the identifiable information of data subjects. For example, both state and federal regulators take the position that the disclosure of identifiable consumer information without advance written consent in exchange for any direct remuneration (money) or indirect remuneration (other goods and services, including favorable license terms and website analytics) constitutes a sale of such identifiable information, which is prohibited by state and federal law. As such, entities developing these types of projects should be acutely aware of any risks involved with using or disclosing identifiable consumer information as part of such projects.

Upstream Contractual Breaches

Most contracts drafted or revised in the last few months or years contain provisions addressing allowable uses and disclosures of identifiable information, along with licenses related to such information from the data owners. For example, in the healthcare sector, most health insurance payor program participation contracts include significant limitations on data aggregation and de-identification or anonymization of identifiable information of beneficiaries covered by such contracts. As such, downstream secondary uses and disclosures of data can run afoul of upstream contractual limitations leading to disputes and damages both pursuant to the contracts at issue, as well as due to breaches of such information or other regulatory inquiries related to such information.

Increased Risk of Data Breach

Given that these questions and projects usually involve large data sets or other sources of aggregated information, such as “data lakes” or “data warehouses,” the entities maintaining these sources of aggregated information may become targets for both insider and external threat actors or may be vulnerable in other ways. For example, information used for all of the types of projects discussed above is extremely valuable and may be stolen by criminals in a variety of ways. Additionally, given that many of the projects utilizing these sources of information involve emerging technologies and R&D related to all types of software applications and devices, often human error as part of development or implementation of such projects results in data privacy and security incidents. Any privacy or security incident could result in exposure or theft of all of the aggregated information involved in any project, with serious consequences pursuant to state and federal data breach requirements, as discussed above, and to class action lawsuits pursuant to all types of state law claims associated with consumer protections and data breach requirements.

Reputational Damage

Would your grandmother approve? While many entities undertake projects involving consumer identifiable information or anonymized information after rigorous legal analysis, some of these projects may still look improper to an outsider, including a consumer whose information may be involved in the project, a consumer who does not understand the scope or purposes of the project, or a consumer who believes their consent should be required for any uses or disclosures of their information, identifiable or otherwise. At the end of the day, even where no law or regulation is violated, the court of public opinion may frown upon questionable or unethical uses and disclosures of consumer information, both identifiable and anonymized.


Again, the bigger the data, the higher the risks. New interoperability tools create exciting new opportunities for aggregation and analysis of big data sets that can lead to unprecedented improvements in healthcare. Attorneys negotiating or reviewing transactions involving healthcare data, however, must be familiar not only with business goals and project details, but also with a complex patchwork of federal and state privacy laws governing the data involved. While these complex laws can be navigated, doing so requires carefully analyzing business needs against a continuously growing number of federal and state laws. Given all of these issues, a team approach to these projects, questions, and answers may result in the best outcome for all.

    Adam Greene

    Davis Wright Tremaine LLP, Washington, DC

    Adam Greene is a partner in the Technology, Communications, Privacy & Security Practice Group at Davis Wright Tremaine LLP. His practice focuses on health information privacy, security, breach notification, and information blocking, and he is a frequent speaker and writer on these topics. Prior to joining Davis Wright Tremaine LLP, Mr. Greene was a regulator at the U.S. Department of Health and Human Services, helping HHS to bring the first HIPAA enforcement actions, implement the HITECH Act’s privacy provisions, and apply HIPAA to health information technology. He can be reached at [email protected].

    Iliana Peters

    Polsinelli LLP, Washington, DC

    Iliana Peters is a shareholder at Polsinelli LLP in Washington, DC. She works closely with her clients on complicated compliance questions, incident response, investigations, and training to protect data and avoid legal risk and legal liability, both at the state and federal levels. Ms. Peters also supports clients’ defense of individual and class action litigation related to all types of data privacy, security, and breach claims. She can be reached at [email protected].

    The material in all ABA publications is copyrighted and may be reprinted by permission only. Request reprint permission here.