
ARTICLE

Is it Time for the Government's "Moneyball" Moment? A May 2023 Government Accountability Office Report Highlights Data Mining and Matching to Identify Potential New Fraud Claims

Jackson Hobbs

Summary

  • The GAO’s “Fraud Schemes and Indicators in SBA Pandemic Programs” report identified six primary categories of potential fraud indicators based on its review of the Paycheck Protection Program (PPP) and COVID-19 Economic Injury Disaster Loan (EIDL) data.
  • In addition to criminal charges, the DOJ can pursue civil remedies for suspected fraud under the False Claims Act and the Financial Institutions Reform, Recovery, and Enforcement Act of 1989.
  • The SBA has indicated a desire to obtain data from external sources to proactively find and investigate fraud indicators in response to GAO recommendations.

The 2011 Oscar-nominated film Moneyball is a docudrama chronicling a small-market baseball team’s use of data-driven analytics to build a championship-caliber contender. Brad Pitt plays the team’s general manager. Jonah Hill plays the team’s leading data guru, who explains his use of data to find players that other teams overlooked like this:

It's about getting things down to one number. Using stats to reread them, we'll find the value of players that nobody else can see. People are overlooked for a variety of biased reasons and perceived flaws. Age, appearance, personality. Bill James and mathematics cuts straight through that. Billy, of the twenty thousand knowable players for us to consider, I believe that there is a championship team of twenty-five people that we can afford. Because everyone else in baseball undervalues them. Like an island of misfit toys.

While the team ultimately came up short at the end of Moneyball, the success of a team built on sabermetrics demonstrated that a new era of baseball had begun.

But what do sabermetrics and a Brad Pitt movie have in common with emerging trends in the world of white-collar governmental enforcement?

A recent report published in May 2023 provides the answer.

In the wake of the COVID-19 pandemic, the Small Business Administration (“SBA”) quickly developed and launched programs to relieve financially strained small businesses through various CARES Act programs. Four such programs were the Paycheck Protection Program, the COVID-19 Economic Injury Disaster Loan, the Restaurant Revitalization Fund, and the Shuttered Venue Operators Grant.

After the SBA provided more than $1 trillion to more than 10 million small businesses, the Government Accountability Office (“GAO”) published its observations on “Fraud Schemes and Indicators in SBA Pandemic Programs” in May 2023.

The report identified potential fraud indicators and made recommendations to detect and prosecute fraud arising from these programs. The report:

(1) Analyzes Fraud Cases charged by DOJ involving PPP and COVID-19 EIDL to understand fraud schemes and impacts;
(2) Provides Data-Driven Results of select statistical analyses regarding fraud indicators in PPP and COVID-19 EIDL; and
(3) Identifies Opportunities for SBA to enhance its data analytics to better detect fraud moving forward.

While COVID relief program fraud is not a new topic, especially among white-collar practitioners, the GAO’s recent report discusses trends and areas of emphasis in fraud enforcement and highlights the data analytics tools that the GAO recommends using to detect fraud cases in the future.

Data Mining and Matching for Fraud Indicators: The Sabermetrics of Fraud Detection

After analyzing data on roughly 13.4 million PPP and COVID-19 EIDL recipients, the GAO found, through a process of data mining and matching, that more than 3.7 million unique recipients had fraud indicators. Data mining analyzes data for relationships that have not previously been discovered. Data matching is a process in which information from one source is compared with information from another, such as government or third-party databases, to identify any inconsistencies. Based on its review of PPP and COVID-19 EIDL data, the GAO identified six primary categories of potential fraud indicators:

(1) No Wage Data: More than 2 million unique recipients claimed employees on applications but failed to submit wage data to the National Directory of New Hires to confirm those employees existed.
(2) Different Employee Totals: More than 290,000 unique recipients claimed a different number of employees on their applications than the number of employees actually reported to the National Directory of New Hires.
(3) Different Payroll Costs: More than 440,000 unique recipients claimed approved loan amounts based on reported payroll costs that exceeded loan amounts based on actual wages paid, as reported to the National Directory of New Hires.
(4) Received Multiple Loans: 22,000 unique recipients received more approved loans than permitted under the programs.
(5) Reused Information: More than 890,000 unique recipients claimed to be different or unique recipients but were using the same underlying business information, like business names and addresses.
(6) Provided Different Information to Each Program: More than 380,000 unique recipients participated in both the PPP and COVID-19 EIDL programs but submitted different underlying business information—including business type and organizational structure—to each program.
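
To make the data matching idea concrete, the following is a minimal sketch of how the first two indicators might be computed. It is an illustration only, not the GAO's actual method: the record layout, the `recipient_id` key, and the `directory_counts` lookup standing in for National Directory of New Hires data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Application:
    recipient_id: str       # hypothetical identifier shared by both data sources
    claimed_employees: int  # employee count stated on the loan application


def match_indicators(applications, directory_counts):
    """Compare application data with an external directory.

    directory_counts maps recipient_id -> employee count reported to the
    external source (a stand-in for wage data; purely illustrative).
    Returns {recipient_id: [indicator names]} for flagged recipients only.
    """
    flagged = {}
    for app in applications:
        indicators = []
        reported = directory_counts.get(app.recipient_id)
        if app.claimed_employees > 0 and reported is None:
            # Employees were claimed, but the external source has no record.
            indicators.append("no_wage_data")
        elif reported is not None and reported != app.claimed_employees:
            # Both sources have a count, and the counts disagree.
            indicators.append("different_employee_totals")
        if indicators:
            flagged[app.recipient_id] = indicators
    return flagged


# Made-up sample data for illustration.
applications = [
    Application("A-001", 12),
    Application("A-002", 40),
    Application("A-003", 5),
]
directory_counts = {"A-001": 12, "A-002": 8}

flagged = match_indicators(applications, directory_counts)
print(flagged)
```

As with the GAO's indicators, a flag in this sketch marks a record for further review; it does not establish fraud.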

As the name suggests, these fraud indicators can point to, but do not definitively prove, fraud. Additional review, investigation, and adjudication are needed to determine whether fraud exists. Nonetheless, recipients of CARES Act program funds should treat these indicators as important markers when confirming their own compliance with program requirements.

In addition to these primary categories of fraud indicators, the report found additional variables associated with an increased likelihood of potential fraud. For example:

(1) Business Ownership Type: Loans for sole proprietor businesses are more likely to be identified in a fraud case, compared to businesses that are not sole proprietorships;
(2) Lender Type: Loans authorized through nonbank lenders are more likely to be identified in a fraud case, compared to bank lenders; and
(3) Asset Size of the Lender: Loans authorized through medium-sized lenders are more likely to be identified in a fraud case, compared to large lenders.

As of December 2021, the DOJ had charged 169 cases of PPP fraud, 70 cases of COVID-19 EIDL fraud, and 91 cases involving fraud in both programs. Of these 330 cases, 221 (roughly 67 percent) involved non-operating businesses, which were identified as shell companies or fictitious entities and thus were ineligible for the PPP and COVID-19 EIDL programs.

But the entities are not the only ones facing charges; the applicants themselves are also facing criminal charges. The DOJ has charged people and businesses with misrepresenting eligibility, falsifying documents, using stolen identities, and deliberately exploiting the programs by conspiring with each other, sharing knowledge on how to circumvent controls, and obtaining kickbacks. In addition, the DOJ can pursue civil remedies for suspected fraud under the False Claims Act, 31 U.S.C. §§ 3729-3733, and the Financial Institutions Reform, Recovery, and Enforcement Act of 1989, 12 U.S.C. § 1833a.

Based at least in part on these fraud indicators and cases charged so far, the GAO makes recommendations to continue to detect fraud from these programs, including that the SBA:

(1) Continue to Mix and Match Data: The first recommendation is to utilize cross-program resources and data to catch recipients who applied to multiple relief programs; and
(2) Obtain More Data to Increase Comparisons: The second recommendation is to identify additional data resources to aid in fraud detection and prevention, including methods for the SBA to access those resources.
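
The cross-program comparison in the first recommendation, along with the "reused information" and "different information" indicators, can be sketched in a few lines. This is a hedged illustration under stated assumptions, not the SBA's or GAO's implementation: the field names (`recipient_id`, `name`, `address`, `business_type`), the simple `normalize` rule, and the sample records are all hypothetical.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial variations still match."""
    return " ".join(text.lower().split())


def reused_information(records):
    """Group recipients that share the same normalized name and address."""
    groups = defaultdict(set)
    for rec in records:
        key = (normalize(rec["name"]), normalize(rec["address"]))
        groups[key].add(rec["recipient_id"])
    # Keep only name/address pairs used by more than one recipient.
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


def cross_program_mismatches(ppp, eidl, field="business_type"):
    """Return recipients in both programs whose `field` values differ."""
    eidl_by_id = {rec["recipient_id"]: rec for rec in eidl}
    return sorted(
        rec["recipient_id"]
        for rec in ppp
        if rec["recipient_id"] in eidl_by_id
        and rec[field] != eidl_by_id[rec["recipient_id"]][field]
    )


# Made-up sample data for illustration.
ppp = [
    {"recipient_id": "R1", "name": "Acme Bakery", "address": "1 Main St",
     "business_type": "LLC"},
    {"recipient_id": "R2", "name": "ACME  bakery", "address": "1 main st",
     "business_type": "LLC"},
    {"recipient_id": "R3", "name": "Blue Cafe", "address": "9 Oak Ave",
     "business_type": "Corporation"},
]
eidl = [
    {"recipient_id": "R1", "name": "Acme Bakery", "address": "1 Main St",
     "business_type": "Sole proprietorship"},
    {"recipient_id": "R3", "name": "Blue Cafe", "address": "9 Oak Ave",
     "business_type": "Corporation"},
]

reused = reused_information(ppp)
mismatches = cross_program_mismatches(ppp, eidl)
print(reused)
print(mismatches)
```

Real matching would need far more careful normalization of names and addresses than this sketch uses, and any match would still be only an indicator calling for further review.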

In response to these recommendations, the SBA has indicated a desire to obtain data from a broader set of external sources, including data held by the Internal Revenue Service and the Social Security Administration, to proactively find and investigate fraud indicators. By the end of 2023, the SBA plans to have its data analytics program for mining and matching indicators of fraud fully in place.

Although our nation’s COVID cases are on the decline, investigations into COVID relief fraud continue to grow. This is especially true considering the statute of limitations for PPP and COVID-19 EIDL fraud was extended from 5 to 10 years. As of January 25, 2023, the SBA Office of the Inspector General had an additional 536 ongoing investigations for PPP, COVID-19 EIDL, or both. As Johana Ayers, the Managing Director of the Forensic Audits and Investigative Services team at GAO, explained, “The number of cases being investigated grows every day, as do the number of cases making their way through the judicial system. And that will continue on for many more years.”

As COVID fraud enforcement progresses, identifying fraud indicators through data mining and matching will provide a useful tool for government enforcement. Ayers explains this matching and mining process, which involves analyzing SBA’s own data related to PPP and COVID-19 EIDL funding, and then matching that data to a data source that the Forensic Audits and Investigative Services team has access to, but that SBA does not, to identify and pursue fraud indicators.

Individuals and businesses should vigilantly confirm that the documentation they provide to government agencies is correct and, if and when the data changes, update it and explain why. As this recent GAO report indicates and recommends, analyzing data and identifying fraud indicators is the way of the future in fraud enforcement. While the sabermetrics of fraud detection appear to be in their infancy, this data-driven approach to fraud detection and prevention is here to stay. Now, it’s time to play ball.