In 2020 the US Government created the Paycheck Protection Program (PPP), through which small and mid-sized businesses could apply for forgivable loans at one percent interest to keep employees on the payroll and the business running during pandemic lockdowns. However, it is well established that the program became a target for fraud as the government attempted to get funding to businesses quickly to mitigate layoffs.
The US Government has since created a website and task force dedicated to program oversight and fraud detection. The three files in this data set are based on PPP oversight data along with research into publicly available data about companies charged with PPP loan fraud. This is an initial publication; work on the data sets is continuing, and analysis notebooks are planned.
This data set contains three files:
- ppp-over-150k: Records of PPP loans over USD 150k that have been made public by the US Government. The table has been cleaned and extended with additional features, as detailed in the notebook linked below.
- ppp-data-dict: Data dictionary including all created features.
- ppp-fraud-cases: 100 examples of PPP fraud drawn from publicly available data from the US Government, public documents, and media reports. This data set was created via manual research and analysis.
All three data sets are created via this notebook.
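As a rough sketch of how the three files might be read together, the snippet below uses only the Python standard library. The base names come from the file list above; the `.csv` extension, the `data_dir` location, and the helper name `load_ppp_files` are assumptions made for illustration, not details confirmed by the data set.

```python
import csv
from pathlib import Path

# Base names taken from the file list above.
# The ".csv" extension used below is an assumption.
DATASET_FILES = ("ppp-over-150k", "ppp-data-dict", "ppp-fraud-cases")


def load_ppp_files(data_dir, extension=".csv"):
    """Hypothetical helper: return {base_name: list of row dicts}.

    Only files that actually exist in data_dir are loaded.
    """
    tables = {}
    for name in DATASET_FILES:
        path = Path(data_dir) / f"{name}{extension}"
        if not path.exists():
            continue  # skip any file that is not present
        with path.open(newline="", encoding="utf-8") as handle:
            # Each row becomes a dict keyed by the file's header row.
            tables[name] = list(csv.DictReader(handle))
    return tables
```

Calling `load_ppp_files("data")` would return a dictionary keyed by base name, so `tables["ppp-data-dict"]` could be consulted for column descriptions before working with the loan records. The loan file is likely large, so a streaming reader or a dataframe library may be a better fit for real analysis.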