An Anonymized Dataset for Credit Card Fraud Detection and Transaction Analysis

The dataset consists of 284,807 credit card transactions made by European cardholders over a period of two days. Each transaction is described by 31 columns: 30 input features and one target label. Most of the input features have been transformed through Principal Component Analysis (PCA) to preserve data privacy and confidentiality.
The features labeled V1 to V28 are these PCA-transformed components; they capture underlying transaction patterns without exposing sensitive financial information. The two untransformed features are Time, the number of seconds elapsed between each transaction and the first transaction in the dataset, and Amount, the monetary value of each transaction.
This dataset contains real-world credit card transaction data collected for the purpose of detecting fraudulent activities. It is a large-scale, clean, and highly imbalanced dataset commonly used for binary classification problems in fraud detection. The main objective is to accurately distinguish between legitimate and fraudulent transactions using anonymized numerical features.
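
As a rough illustration of the column layout described above, the sketch below checks a CSV header against the expected 31 columns and counts rows with missing or non-numeric values. The helper name `validate_rows` and the sample values are hypothetical, made up for this example, and not part of the dataset or any library.

```python
import csv
import io
import math

# Column layout as described above: Time, V1..V28, Amount, and the Class target.
EXPECTED_COLUMNS = ["Time"] + [f"V{i}" for i in range(1, 29)] + ["Amount", "Class"]


def validate_rows(csv_text):
    """Check the header and count rows with missing or non-numeric values.

    Returns (row_count, bad_row_count). Raises ValueError on a header mismatch.
    Hypothetical helper written for this illustration.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    if header != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {header}")
    rows = bad = 0
    for row in reader:
        rows += 1
        try:
            if len(row) != len(EXPECTED_COLUMNS):
                raise ValueError("wrong field count")
            values = [float(x) for x in row]  # empty strings raise ValueError
            if any(math.isnan(v) for v in values):
                raise ValueError("NaN value")
        except ValueError:
            bad += 1
    return rows, bad


# Tiny synthetic sample (made-up values, not real transactions).
sample = ",".join(EXPECTED_COLUMNS) + "\n"
sample += ",".join(["0"] + ["0.1"] * 28 + ["149.62", "0"]) + "\n"
sample += ",".join(["1"] + ["-0.2"] * 28 + ["", "0"]) + "\n"  # missing Amount

print(validate_rows(sample))
```

On a dataset with no missing values, the second number returned should be zero.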

Real-world credit card transaction data
Designed for fraud detection (binary classification)
Highly imbalanced target variable
Clean dataset with no missing values
Widely used for machine learning benchmarking

The target column, Class, identifies whether a transaction is legitimate (0) or fraudulent (1). Fraudulent transactions are extremely rare: 492 of the 284,807 transactions, or about 0.17%. This makes the dataset particularly valuable for studying class imbalance, anomaly detection, and precision-recall optimization in machine learning models.
Because of its realistic structure, privacy-preserving features, and severe class imbalance, this dataset is widely used in academic research, industry projects, and data science competitions.
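
To see why precision and recall matter more than accuracy on data this imbalanced, the sketch below scores a trivial baseline that labels every transaction as legitimate. It assumes the commonly reported count of 492 fraudulent transactions; the function names are hypothetical and the labels are synthetic, generated only to match that class ratio.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 means fraud."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def basic_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall); undefined ratios are reported as 0.0."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


N_TOTAL = 284_807  # transaction count stated in the description
N_FRAUD = 492      # assumption: commonly reported number of fraud cases

# Synthetic labels with the same class ratio (not the real rows).
y_true = [1] * N_FRAUD + [0] * (N_TOTAL - N_FRAUD)
# Baseline "model" that never flags fraud.
y_pred = [0] * N_TOTAL

accuracy, precision, recall = basic_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```

The baseline scores an accuracy above 99.8% while catching no fraud at all (recall of zero), which is why models on this dataset are usually compared on precision, recall, or the area under the precision-recall curve rather than on accuracy.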
