Credit Card Fraud

Python analysis applied to fraud detection

@kaggle.oscaryezfeijo_credit_card_fraud

About this Dataset

Credit Card Fraud Detection
Introduction
Credit card fraud detection is a critical challenge in the financial sector. This project builds a machine learning model that identifies fraudulent credit card transactions in a dataset of transactions made by European cardholders.

Dataset Overview
The dataset contains credit card transactions made by European cardholders in September 2013. It is heavily imbalanced: the large majority of transactions are non-fraudulent.

Features:

Time: Seconds elapsed between this transaction and the first transaction in the dataset.
V1 to V28: Anonymized features resulting from a PCA transformation.
Amount: Transaction amount.
Class: Target variable (1 for fraud, 0 for non-fraud).
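
As a rough illustration of the column layout above, the sketch below checks that a CSV has the expected columns and counts the two classes. The helper name class_balance and the four sample rows are made up for illustration; the real file name and values are not given here, so an in-memory sample stands in for the actual data.

```python
import csv
import io
from collections import Counter

# Column layout described above: Time, V1..V28, Amount, Class.
COLUMNS = ["Time"] + [f"V{i}" for i in range(1, 29)] + ["Amount", "Class"]


def class_balance(csv_file):
    """Count rows per class in an open CSV file (hypothetical helper)."""
    reader = csv.DictReader(csv_file)
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # int(float(...)) accepts both "1" and "1.0" as class labels.
    return Counter(int(float(row["Class"])) for row in reader)


# Tiny made-up sample standing in for the real file; values are illustrative only.
sample = io.StringIO()
writer = csv.writer(sample)
writer.writerow(COLUMNS)
for time_s, amount, label in [(0, 149.62, 0), (1, 2.69, 0), (2, 378.66, 0), (3, 1.00, 1)]:
    writer.writerow([time_s] + [0.0] * 28 + [amount, label])
sample.seek(0)

counts = class_balance(sample)
fraud_share = counts[1] / sum(counts.values())
print(dict(counts), f"fraud share: {fraud_share:.2%}")
```

The same function could be called on a real file with `with open(path, newline="") as f: class_balance(f)`, where `path` points to the downloaded CSV.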
Steps Taken

  1. Data Preprocessing
    Standardization: Standardized numeric features to improve model performance.
    Handling Imbalance: Applied SMOTE (Synthetic Minority Over-sampling Technique) to balance the dataset and ensure the model is well-trained on both classes.
  2. Exploratory Data Analysis
    Correlation Analysis: Examined correlations between features to understand relationships and their potential impact on the model.
  3. Model Building
    Algorithm Used: Random Forest Classifier, chosen for its robustness and high performance.
    Hyperparameter Tuning: Employed RandomizedSearchCV to find the best hyperparameters and enhance model accuracy.
  4. Model Evaluation
    Confusion Matrix & Classification Report: Evaluated the model’s performance using key metrics such as precision, recall, F1-score, and overall accuracy.
    Feature Importance: Analyzed feature importances to identify which features contribute most to detecting fraud.
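
The steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the project's code: it uses a synthetic dataset from make_classification in place of the real transactions, a simplified hand-written SMOTE-style helper (the hypothetical smote_like_oversample) instead of a library implementation, and an arbitrary small hyperparameter grid. It also assumes the train/test split happens before oversampling, so the test set contains only real samples; the description above does not state which order was used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

RNG = np.random.default_rng(42)


def smote_like_oversample(X, y, minority=1, k=5):
    """Simplified SMOTE-style oversampling (illustrative, binary labels only).

    Each synthetic point lies on the segment between a minority sample and
    one of its k nearest minority neighbours; points are added until both
    classes have the same number of rows.
    """
    X_min = X[y == minority]
    n_new = int((y != minority).sum() - len(X_min))
    if n_new <= 0 or len(X_min) < 2:
        return X, y
    k = min(k, len(X_min) - 1)
    # Ask for k + 1 neighbours: the first hit is normally the point itself.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = RNG.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, RNG.integers(0, k, size=n_new)]
    gap = RNG.random((n_new, 1))  # interpolation factor in [0, 1)
    synthetic = X_min[base] + gap * (X_min[picked] - X_min[base])
    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.full(n_new, minority, dtype=y.dtype)])
    return X_out, y_out


# Synthetic stand-in for the real data: roughly 5% positives, 10 features.
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# 1. Preprocessing: fit the scaler on the training split only, then oversample.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)
X_bal, y_bal = smote_like_oversample(X_train_s, y_train)

# 3. Model building: random forest tuned with a small randomized search.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={
        "n_estimators": [50, 100, 200],
        "max_depth": [None, 5, 10],
        "min_samples_split": [2, 5, 10],
    },
    n_iter=5, cv=3, scoring="f1", random_state=42,
)
search.fit(X_bal, y_bal)

# 4. Evaluation on the untouched test split, plus feature importances.
y_pred = search.predict(X_test_s)
cm = confusion_matrix(y_test, y_pred)
importances = search.best_estimator_.feature_importances_
print(search.best_params_)
print(cm)
print(classification_report(y_test, y_pred, digits=4))
print(np.argsort(importances)[::-1][:5])  # indices of the five most important features
```

A stricter setup would resample inside each cross-validation fold, because synthetic points interpolated from a validation-fold sample can otherwise end up in the training folds and inflate the search scores.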

Results
The model achieved the following performance metrics:

Accuracy: 99.91% (reported as 100% after rounding)
Precision, Recall, F1-score: 1.00 for both classes (rounded to two decimals)
Confusion Matrix:
True Negatives (TN): 9906
False Positives (FP): 8
False Negatives (FN): 9
True Positives (TP): 9757
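
The headline metrics can be recomputed from the four confusion-matrix counts above. The short check below does this for the fraud class and shows that the figures are slightly below 1.00 before rounding; it is only an arithmetic illustration and uses no data beyond the reported counts.

```python
# Counts taken from the reported confusion matrix.
tn, fp, fn, tp = 9906, 8, 9, 9757

total = tn + fp + fn + tp
accuracy = (tp + tn) / total          # share of all predictions that are correct
precision = tp / (tp + fp)            # share of predicted frauds that are real
recall = tp / (tp + fn)               # share of real frauds that are caught
f1 = 2 * precision * recall / (precision + recall)

print(f"samples   : {total}")          # 19680
print(f"accuracy  : {accuracy:.4f}")   # 0.9991
print(f"precision : {precision:.4f}")  # 0.9992
print(f"recall    : {recall:.4f}")     # 0.9991
print(f"f1-score  : {f1:.4f}")         # 0.9991
```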
Conclusion
This project demonstrates the effectiveness of machine learning in detecting fraudulent credit card transactions. The key steps, including data preprocessing, handling class imbalance, and hyperparameter tuning, were crucial in achieving high model performance. The feature importance analysis provided valuable insights into the key indicators of fraudulent activity.

The full code and detailed analysis are available in the GitHub repository.