Law Practice Magazine

The Marketing Issue

Prudently Training AI on Client Confidential Information

Lucian T. Pera


  • Guidance on how lawyers may (and may not) use client confidential and other information from their practices to train AI tools.
  • Lawyers and law firms are beginning to develop and train AI tools, for their own use and perhaps use by clients and others.
  • AI tools need training data, and lawyers may, under certain circumstances, be able to use client confidential and other information to train these AI tools, if proper safeguards and use limitations are established.
Let’s take a step firmly onto the frontier of legal ethics and artificial intelligence (AI).

Only the willfully ignorant are today unaware that AI is The Next Big Thing, even for the law business. Will it be revolutionary or evolutionary? Will it replace lawyers, or at least some lawyers? Will it improve access to justice? It is too soon to tell, of course.

Today, let’s jump beyond the speculation, and perhaps beyond 2024’s shiny objects, to a more practical question: Can lawyers ethically use AI tools to leverage the vast troves of client confidential information in their own files to better serve their own clients? Can I use an AI tool as a “second brain” to better learn all there is to be learned from information in my own files, while still respecting my ethical obligation of confidentiality?

The Basic Ethics Rules

A legion of commentators has already offered guidance on the legal ethics implications of AI. A vast array of our ethics rules clearly applies to our use of AI tools, including ABA Model Rules of Professional Conduct on competence (Rule 1.1), diligence (Rule 1.3), client communications (Rule 1.4), confidentiality (Rules 1.6 and 1.9), supervision (Rules 5.1 and 5.3), and unauthorized practice (Rule 5.5). Today we drill down—and perhaps forward—on client confidentiality.

AI Confidentiality Basics

If you’ve been paying attention, you are aware of confidentiality concerns about inputting any client confidential information into any free or public AI service. Since the dawn of the internet, using free services usually has meant that the provider has no obligation to keep what you tell it confidential, while paid services sometimes undertake some obligations like this. (Think free Gmail versus a paid account. And do please read the terms of service.)

Free and public AI services are “worse,” and sometimes more dangerous. Not only do they often lack any confidentiality obligation to the user, but they also sometimes actually “train on”—meaning retain and use, and sometimes then disclose to other users—information submitted to them. All of this makes it very dicey to submit any client confidential information to such services.

But increasingly there are other types of AI tools, including tools whose providers do undertake confidentiality obligations to users, promising not to “train” on or share your submitted data. The ethics rules require serious care when vetting AI tools. None of that is really our topic here, but it is worth considering.

Roll Your Own

Services are now becoming available to lawyers and law firms that can be set up to train on, and have access to, only a limited set of carefully curated data, and that will provide output only to a carefully limited set of users.

Suppose your law firm has for years been the go-to firm for lending to a certain type of business, or drunk driving defense, or one type of mass tort claim. In your firm’s client files, you have a huge amount of data on how those matters were handled—common contract terms, sentences received or settlements achieved. Hence, today’s question: To what extent may you, ethically, develop an AI tool that trains on that data? If you can, what can it be used for?

Lots of Client Data

Let’s start with a reminder: As lawyers, we are awash in data about our matters, current and old. Whether our client file information is organized by matter in a document management system (DMS), kept in a case management system or disorganized on numerous lawyer laptops, we have lots of it. My strong sense is that it has all tended to get more organized, and more accessible, in recent years.

The Broad Sweep of Client Confidentiality

A second key reminder: in pretty much all jurisdictions, virtually all that information is burdened with an obligation under the ethics rules to keep it confidential.

Forty-six jurisdictions currently use the sweeping ABA Model Rule of Professional Conduct 1.6(a) definition that says that all “information relating to the representation of a client” is covered by this obligation, and this even includes information in the lawyer’s possession that comes from public sources. The remaining jurisdictions mostly use a slightly less sweeping definition, but, for our purposes, you might as well assume the sweeping one is in place everywhere.

Use or Disclose?

Generally, the Rules speak in terms of “use” versus “disclosure.” Broadly put, disclosure is prohibited unless that disclosure is “impliedly authorized”—a client impliedly authorizes a lawyer to disclose in a pleading that she has had a bad motor vehicle accident when she authorizes the lawyer to sue the other driver—or unless the client gives informed consent to the disclosure.

But lawyers are permitted to use client confidential information, without any disclosure, so long as that use is not to the disadvantage of the client. For example, a lawyer may have knowledge that, in the last five negotiations with one particular bank for a borrower, the bank always insisted on some types of provisions, and never asked for another type of provision. That important knowledge is at the heart of good lawyering, and can be used for a sixth client, unless its use would somehow harm one of the first five.

Not All Information Is Confidential

Likewise, some knowledge concerns the law governing a particular type of legal matter rather than any particular client. That knowledge is not usually considered client confidential information.

It’s also important to know that there is some information broadly concerning client matters that is generally thought to “belong to” the lawyer or law firm, rather than the client. For example, how long it takes a lawyer or law firm to draft a type of document, or how many hours of work it takes to defend the average drunk driving charge. Again, this is not even considered client confidential information.

Exceptions to Exceptions

Of course, there are exceptions. If disclosing some highly specialized legal research, or how long it took to draft documents in a specific type of matter, would itself, even without a client name attached to it, reveal by implication a client’s confidential information, then that disclosure is prohibited without client consent.

With former clients, the only real difference in our confidentiality obligations is that a lawyer may more broadly use client confidential information when it has become “generally known.”

A Little Guidance

With that refresher, is it ethically possible to train an AI on your cache of client information and documents about your specialized lending practice, drunk driving defense, or that mass tort you’re so good at handling?

Yes, maybe so, but think about these guidelines as a starting point:

  1. Protect the data. Your license—and your clients’ confidential information—is on the line, so be very sensitive to who may ever have access to the data, in all forms and formats. During development, does the data ever leave your control? Can the provider ever use the data for any other purpose? What about the tool, once trained—who can use it and for what purpose? If it can be used by anyone outside the firm, what are the risks that any training data may “leak” or be disclosed?
  2. Carefully curate the data to be used. Think seriously about precisely the information on which the AI will be trained. It is well beyond this ethics nerd’s understanding what input will allow the output you want. But it will matter to your ethics obligations—or at least the risk to client confidential information—whether the information you feed into the AI tool is highly sensitive client medical information, a publicly known outcome or anonymized data.

    At the other end of the spectrum, what if the information you feed the AI includes client confidential information—say, pleadings filed in a particular class of cases—but the process strips all confidential information out of the resulting document, with the purpose of producing a template that can be used in future cases of the same type? In other words, what if you use the AI to turn your old documents into templates? Does that eliminate almost all confidentiality issues? Yes. Any need for client consent for this use? It’s hard to see how.
  3. Anonymizing data is useful, but remember that data can be de-anonymized. You don’t need to be a data scientist to know that the mere fact that someone has become a client of a well-known divorce lawyer implies that they may be having marital problems.

    The Comment to ABA Model Rule 1.6 very clearly allows lawyers to disclose some otherwise confidential client information in an anonymous fashion, for example, by using a hypothetical question to get curbstone advice to assist a client from a lawyer expert in certain matters, but not in a way that discloses the client’s identity, even by implication.
  4. From day one, very seriously consider any intended use and disclosure of the output. Defining the likely output you want, and the likely users, is key. If the only use of the tool’s output is within your law firm, and that output will never be disclosed to anyone outside the firm, different concerns may govern than if you want to try to disclose the output outside the firm, for example, to clients.
  5. Establish clear rules or guardrails on the output’s use. Even with a developer’s and law firm’s best efforts, it is still conceivable that an AI tool’s output might inferentially reveal client confidential information, much like a client’s consultation with a well-known divorce lawyer. At this early stage in our development and use of AI tools, great caution is warranted, especially in setting rules for the use of some outputs of AI tools. Perhaps before the firm’s working lawyer can use, or certainly disclose, the tool’s prediction about the value of a particular mass tort claim, some review by a second lawyer might be in order.
  6. Consider your confidentiality obligations outside the ethics rules. Some matters are covered by protective orders and contractual confidentiality obligations (e.g., settlement agreements and nondisclosure agreements).

    Be aware that many corporate clients’ outside counsel guidelines (OCGs) include innocuous-sounding provisions that say that the lawyer may only use the client’s information for work on that client’s matter, which can be a meaningful restriction in this context. This is where sorting out what is client information and what may be legal information, such as research and other work product, may be important. All of which may lead you to consider a new firm policy of pushing back, on the front end, on such overbroad OCG provisions.
  7. Consult a data privacy lawyer. We ethics nerds know nothing about the myriad obligations you have that arise from all sorts of data privacy law, from HIPAA to the California Consumer Privacy Act. There are issues out there.

Life on the frontier has always come with risks and few guarantees. For those interested in exploring and settling the new AI frontier for lawyers, I recommend thinking carefully about confidentiality.