Dataset: Fraudulent Transaction Detection

About this Dataset

The dataset consists of about 1.75 million transactions (1,754,155 rows) made by simulated users through various terminals from January 2023 to June 2023. The data is highly imbalanced: only 0.1345% of transactions are classified as fraudulent.

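Because the classes are so unevenly distributed, it is more appropriate to evaluate a model's performance using the area under the precision-recall curve (AUPRC) than the accuracy derived from a confusion matrix. Accuracy can be misleading under class imbalance: with only 0.1345% fraud, a model that labels every transaction as legitimate would still reach about 99.87% accuracy while detecting no fraud at all.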
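
To make the contrast between accuracy and AUPRC concrete, here is a minimal sketch in plain Python. The labels are synthetic (13 fraud cases in 10,000 rows, chosen only to roughly mirror the imbalance described above), and the helper names `accuracy` and `average_precision` are hypothetical, not part of the dataset or of any library. Average precision is used here as one common step-wise estimate of AUPRC; it is an assumption of this sketch, not necessarily the estimator the dataset author had in mind.

```python
from itertools import groupby


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def average_precision(y_true, y_score):
    """Step-wise AUPRC: sum over distinct score thresholds of
    (increase in recall) * (precision at that threshold)."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank transactions from most to least suspicious.
    ranked = sorted(zip(y_score, y_true), key=lambda p: p[0], reverse=True)
    true_pos = 0
    seen = 0
    ap = 0.0
    # Tied scores share one threshold, so they are processed as one group.
    for _, group in groupby(ranked, key=lambda p: p[0]):
        labels = [label for _, label in group]
        group_pos = sum(labels)
        true_pos += group_pos
        seen += len(labels)
        ap += (group_pos / total_pos) * (true_pos / seen)
    return ap


# Synthetic labels (illustrative only): 13 frauds among 10,000 transactions.
y_true = [1] * 13 + [0] * 9987

# A useless model that calls every transaction legitimate.
all_legit_pred = [0] * len(y_true)
all_legit_score = [0.0] * len(y_true)
acc_all_legit = accuracy(y_true, all_legit_pred)
ap_all_legit = average_precision(y_true, all_legit_score)

# A perfect scorer ranks every fraud above every legitimate transaction.
ap_perfect = average_precision(y_true, [float(t) for t in y_true])

print(f"all-legit accuracy: {acc_all_legit:.4f}")  # 0.9987
print(f"all-legit AUPRC:    {ap_all_legit:.4f}")   # 0.0013
print(f"perfect AUPRC:      {ap_perfect:.4f}")     # 1.0000
```

The useless model scores near-perfect accuracy, yet its AUPRC collapses to the fraud prevalence, which is why AUPRC is the more informative metric here. In practice, scikit-learn's `average_precision_score(y_true, y_score)` provides this kind of summary without hand-written code.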

Tables

Final Transactions

@kaggle.sanskar457_fraud_transaction_detection.final_transactions

46.78 MB
1754155 rows
10 columns


CREATE TABLE final_transactions (
  "unnamed_0" BIGINT,
  "transaction_id" BIGINT,
  "tx_datetime" TIMESTAMP,
  "customer_id" BIGINT,
  "terminal_id" BIGINT,
  "tx_amount" DOUBLE,
  "tx_time_seconds" BIGINT,
  "tx_time_days" BIGINT,
  "tx_fraud" BIGINT,
  "tx_fraud_scenario" BIGINT
);
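
As a rough illustration of how this schema might be queried, the sketch below loads it into an in-memory SQLite database from Python and computes the fraud rate and the row count per fraud scenario. Every row value is invented for the example. It also assumes, without confirmation from the dataset description, that `tx_fraud` is a 0/1 flag and that `tx_fraud_scenario` is 0 for legitimate transactions.

```python
import sqlite3

# In-memory database; SQLite accepts the declared type names as-is.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE final_transactions (
  "unnamed_0" BIGINT,
  "transaction_id" BIGINT,
  "tx_datetime" TIMESTAMP,
  "customer_id" BIGINT,
  "terminal_id" BIGINT,
  "tx_amount" DOUBLE,
  "tx_time_seconds" BIGINT,
  "tx_time_days" BIGINT,
  "tx_fraud" BIGINT,
  "tx_fraud_scenario" BIGINT
)
""")

# Hypothetical rows, NOT taken from the real dataset.
# Assumption: tx_fraud is 0/1 and scenario 0 means "not fraud".
rows = [
    (0, 0, "2023-01-01 00:00:31", 596, 3156, 57.16, 31, 0, 0, 0),
    (1, 1, "2023-01-01 00:02:10", 4961, 3412, 81.51, 130, 0, 0, 0),
    (2, 2, "2023-01-01 00:07:56", 2, 1365, 246.00, 476, 0, 1, 1),
    (3, 3, "2023-01-02 00:09:29", 4128, 8737, 64.49, 86969, 1, 0, 0),
]
conn.executemany(
    "INSERT INTO final_transactions VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    rows,
)

# Share of transactions flagged as fraud (AVG over a 0/1 column).
fraud_rate = conn.execute(
    "SELECT AVG(tx_fraud) FROM final_transactions"
).fetchone()[0]

# Row counts per fraud scenario.
by_scenario = conn.execute(
    "SELECT tx_fraud_scenario, COUNT(*) FROM final_transactions "
    "GROUP BY tx_fraud_scenario ORDER BY tx_fraud_scenario"
).fetchall()

print(fraud_rate)   # 0.25 for the four toy rows above
print(by_scenario)  # [(0, 3), (1, 1)]
conn.close()
```

On the full table the same `AVG(tx_fraud)` query would be expected to return a value close to the 0.1345% fraud share stated in the description, provided the 0/1 assumption holds.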