This dataset collects financial filings that companies submitted to the U.S. Securities and Exchange Commission (SEC). It covers 170 companies: 85 involved in fraud cases and 85 not involved in fraudulent activity. The Fillings column contains text from each company's filings, such as the Management's Discussion and Analysis (MD&A) section and financial statements, for the years the company reported on the SEC website.
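
As a quick sanity check of the class balance described above, the sketch below counts rows per label in a CSV file. It is only an illustration: the `Company` and `Fraud` column names, the `yes`/`no` label values, and the sample rows are assumptions made for this example, not confirmed details of the dataset. Only the Fillings column name comes from the description.

```python
import csv
import io
from collections import Counter


def label_counts(csv_file, label_column="Fraud"):
    """Count rows per label value in an open CSV file.

    `label_column` is a hypothetical column name; replace it with the
    actual label column of the dataset.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


# Tiny in-memory stand-in for the real file. The rows are made up
# purely to show the expected shape of the data.
sample = io.StringIO(
    "Company,Fillings,Fraud\n"
    "Example Co A,Sample MD&A text,yes\n"
    "Example Co B,Sample MD&A text,no\n"
)

counts = label_counts(sample)
print(counts)
```

With the real file, pass an open file handle instead of `sample`. If the file holds one row per company, each label should appear 85 times.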
This dataset was used for research in detecting financial fraud using multiple LLMs and traditional machine-learning models.
For the detailed report and code, see the repository: https://github.com/amitkedia007/Financial-Fraud-Detection-Using-LLMs/