Loan Approval Dataset

Kaggle • @kaggle.rohitgrewal_loan_approval_dataset

Building a Random Forest ML Model to Predict Loan Approval

Dataset Description

📹 ML Project 2 - Loan Approval Prediction: End-to-End Machine Learning Project with Python, on YouTube - https://youtu.be/pPZ3G0vz23o

🖇️ Enroll in our Udemy course "Python Data Analytics Projects" - https://www.udemy.com/course/bigdata-analysis-python/?referralCode=F75B5F25D61BD4E5F161


Loan Approval Dataset of Applicants

This dataset represents real-world loan application data used to analyze customer profiles and predict whether a loan will be approved or rejected.
Each row represents one applicant, and the columns include profile, financial, and asset-related information.

The dataset is suitable for exploratory data analysis (EDA), data cleaning, visualization, and business insights generation using Python.

The data is available as a CSV file. We analyze it in Python using a pandas DataFrame.
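
As a rough sketch, loading the CSV into a DataFrame could look like the following. The inline sample rows are made up for illustration only, and `load_loans` is a hypothetical helper name; with the real file, you would pass its path instead of the in-memory text.

```python
import io

import pandas as pd

# Made-up rows for illustration only; they are NOT taken from the real dataset.
SAMPLE_CSV = """Loan_ID,No_of_dependents,Education,Self_employed,Income_annum,Loan_amount,Loan_term,Cibil_score,Loan_status
1,2,Graduate,No,9600000,29900000,12,778,1
2,0,Not Graduate,Yes,4100000,12200000,8,417,0
3,3,Graduate,No,9100000,29700000,20,506,0
"""


def load_loans(source):
    """Read the CSV (path or file-like object) and tidy the column names."""
    df = pd.read_csv(source, skipinitialspace=True)
    # Strip stray whitespace from headers so columns can be addressed by name.
    df.columns = df.columns.str.strip()
    return df


df = load_loans(io.StringIO(SAMPLE_CSV))
print(df.shape)   # rows, columns
print(df.dtypes)  # which columns are numeric and which are text
print(df.head())
```

Checking `shape` and `dtypes` right after loading is a quick way to confirm that every column was parsed as expected before any analysis.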


These are the main features (columns) available in the dataset:

  1. No_of_dependents – Number of dependents the applicant has

  2. Education – Graduate / Not Graduate

  3. Self_employed – Whether the applicant is self-employed

  4. Income_annum – Annual income of the applicant

  5. Loan_amount – Loan amount requested

  6. Loan_term – Loan repayment time (years)

  7. Cibil_score – Credit score indicating creditworthiness

  8. Residential_assets_value – Value of residential assets

  9. Commercial_assets_value – Value of commercial assets

  10. Luxury_assets_value – Value of luxury assets

  11. Bank_asset_value – Bank balance / financial assets

  12. Loan_status (Target variable) – Approved (1) or Rejected (0)


Using this dataset, we answered the questions below and built a Random Forest model in Python as part of our project.

🔥 Section 1: Data Understanding (Warm-up)

📊 Section 2: Exploratory Data Analysis (EDA)

1. Does Employment Type affect loan approval?

2. Do graduates get higher approval rates than non-graduates?

3. Does having more dependents reduce approval chances?

4. What is the relationship between applicant income and loan approval?

5. How does loan amount vary between approved and rejected loans?

6. Does Cibil Score strongly influence loan approval?
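
Most of the questions above come down to comparing approval rates across groups. The sketch below shows one way to do that with `groupby`, assuming `Loan_status` is coded 1 for approved and 0 for rejected as described in the column list. The rows are made up for illustration, and `approval_rate_by` is a hypothetical helper name.

```python
import pandas as pd

# Made-up rows for illustration only; they are NOT taken from the real dataset.
df = pd.DataFrame({
    "Education": ["Graduate", "Graduate", "Not Graduate",
                  "Not Graduate", "Graduate", "Not Graduate"],
    "Self_employed": ["No", "Yes", "No", "Yes", "No", "No"],
    "Cibil_score": [778, 417, 506, 690, 820, 350],
    "Loan_status": [1, 0, 0, 1, 1, 0],
})


def approval_rate_by(frame, column):
    """Share of approved loans (Loan_status == 1) within each category."""
    return frame.groupby(column)["Loan_status"].mean().sort_values(ascending=False)


education_rates = approval_rate_by(df, "Education")
employment_rates = approval_rate_by(df, "Self_employed")
print(education_rates)
print(employment_rates)

# For a numeric feature, compare its distribution across the two outcomes.
cibil_by_status = df.groupby("Loan_status")["Cibil_score"].median()
print(cibil_by_status)
```

Because the target is 0/1, the mean of `Loan_status` within a group is exactly that group's approval rate.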

📈 Section 3: Data Visualization

A. Compare Applicant Income vs Loan Amount (scatter plot)

B. Show correlation heatmap of numerical features
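
Both plots can be sketched with plain matplotlib, as below. This is an assumption about tooling, since the plotting library used in the project is not stated here. The rows are made up for illustration, and the output filename is hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration only; they are NOT taken from the real dataset.
df = pd.DataFrame({
    "Income_annum": [9600000, 4100000, 9100000, 8200000, 9800000, 4800000],
    "Loan_amount": [29900000, 12200000, 29700000, 30700000, 24200000, 13500000],
    "Cibil_score": [778, 417, 706, 467, 682, 319],
    "Loan_status": [1, 0, 1, 0, 1, 0],
})

fig, (ax_scatter, ax_heat) = plt.subplots(1, 2, figsize=(11, 4))

# A. Applicant income vs loan amount, colored by outcome.
point_colors = df["Loan_status"].map({1: "tab:green", 0: "tab:red"}).tolist()
ax_scatter.scatter(df["Income_annum"], df["Loan_amount"], c=point_colors)
ax_scatter.set_xlabel("Income_annum")
ax_scatter.set_ylabel("Loan_amount")
ax_scatter.set_title("Income vs loan amount")

# B. Correlation heatmap of the numerical features.
corr = df.select_dtypes("number").corr()
image = ax_heat.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax_heat.set_xticks(range(len(corr.columns)))
ax_heat.set_xticklabels(corr.columns, rotation=45, ha="right")
ax_heat.set_yticks(range(len(corr.columns)))
ax_heat.set_yticklabels(corr.columns)
ax_heat.set_title("Correlation heatmap")
fig.colorbar(image, ax=ax_heat)

fig.tight_layout()
fig.savefig("loan_eda_plots.png")  # hypothetical output filename
```

Fixing the color scale to the range -1 to 1 keeps the heatmap comparable across different subsets of the data.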

🧹 Section 4: Data Cleaning & Feature Engineering

Detect Outliers

Remove Outliers
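
One common way to detect and remove outliers is the interquartile range (IQR) rule, sketched below. The IQR rule and the 1.5 multiplier are assumptions chosen for illustration; the method used in the project is not stated here. The values are made up, and `iqr_bounds` is a hypothetical helper name.

```python
import pandas as pd

# Made-up values for illustration only; the last one is deliberately extreme.
df = pd.DataFrame({
    "Income_annum": [4100000, 5200000, 4800000, 5600000,
                     4500000, 5000000, 60000000],
})


def iqr_bounds(series, k=1.5):
    """Return (low, high) limits: k * IQR below Q1 and above Q3."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


low, high = iqr_bounds(df["Income_annum"])

# Detect: flag rows that fall outside the limits.
is_outlier = ~df["Income_annum"].between(low, high)
print(df[is_outlier])

# Remove: keep only the rows inside the limits.
cleaned = df[~is_outlier].reset_index(drop=True)
print(len(df), "->", len(cleaned), "rows")
```

Keeping detection and removal as separate steps makes it easy to inspect the flagged rows before deciding to drop them.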

Which features should be dropped (like Loan_ID)?

Convert categorical variables into numeric form using Mapping, Label Encoding & One-Hot Encoding

Mapping
Label Encoding
One-Hot Encoding
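
The three encoding approaches can be compared side by side on a small example, as sketched below. The rows are made up for illustration, and the new column names (`Education_mapped`, `Self_employed_label`) are hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up rows for illustration only; they are NOT taken from the real dataset.
df = pd.DataFrame({
    "Education": ["Graduate", "Not Graduate", "Graduate"],
    "Self_employed": ["No", "Yes", "No"],
})

# 1. Mapping: we choose the numeric code for each category ourselves.
df["Education_mapped"] = df["Education"].map({"Not Graduate": 0, "Graduate": 1})

# 2. Label Encoding: codes are assigned in sorted order of the labels.
encoder = LabelEncoder()
df["Self_employed_label"] = encoder.fit_transform(df["Self_employed"])
print(list(encoder.classes_))

# 3. One-Hot Encoding: one 0/1 column per category.
one_hot = pd.get_dummies(df["Education"], prefix="Education", dtype=int)
print(one_hot)

print(df)
```

Mapping gives full control over the codes, Label Encoding is convenient when the order of the codes does not matter, and One-Hot Encoding avoids implying any order between categories at the cost of extra columns.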

Scale numerical features before model training
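
A minimal sketch of scaling with scikit-learn's `StandardScaler` follows. Standardization is an assumption here, since the scaler used in the project is not stated. The rows are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Made-up rows for illustration only; they are NOT taken from the real dataset.
X = pd.DataFrame({
    "Income_annum": [9600000, 4100000, 9100000, 8200000,
                     9800000, 4800000, 8700000, 5700000],
    "Loan_amount": [29900000, 12200000, 29700000, 30700000,
                    24200000, 13500000, 33000000, 15000000],
    "Cibil_score": [778, 417, 506, 467, 382, 319, 678, 382],
})

X_train, X_test = train_test_split(X, test_size=0.25, random_state=42)

# Fit the scaler on the training rows only, then reuse it for the test rows,
# so no information from the test set leaks into the preprocessing step.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# After scaling, each training column has mean 0 and standard deviation 1.
print(np.round(X_train_scaled.mean(axis=0), 6))
print(np.round(X_train_scaled.std(axis=0), 6))
```

Tree-based models such as Random Forest are generally insensitive to feature scale, so this step matters most if other model types are also being compared.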

Model Building

Model Prediction

Save Model
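
The last three steps can be sketched together as below. The data is synthetic and the approval rule is invented purely so the example runs on its own; the hyperparameters, the use of `joblib` for saving, and the filename are all assumptions rather than the project's actual choices.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; NOT the real dataset.
rng = np.random.default_rng(0)
cibil_score = rng.integers(300, 900, size=200)
income_annum = rng.integers(200_000, 10_000_000, size=200)
X = np.column_stack([cibil_score, income_annum])
y = (cibil_score >= 550).astype(int)  # toy approval rule, invented for this demo

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# Model Building
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Model Prediction
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print("test accuracy:", accuracy)

# Save Model (and load it back to check that it still predicts the same)
model_path = os.path.join(tempfile.gettempdir(), "loan_rf_model.joblib")
joblib.dump(model, model_path)
restored = joblib.load(model_path)
print("restored model matches:", (restored.predict(X_test) == predictions).all())
```

Reloading the saved file and comparing predictions is a cheap sanity check that the model was stored correctly.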


Enroll in our Udemy courses:

  1. Python Data Analytics Projects - https://www.udemy.com/course/bigdata-analysis-python/?referralCode=F75B5F25D61BD4E5F161
  2. Python For Data Science - https://www.udemy.com/course/python-for-data-science-real-time-exercises/?referralCode=9C91F0B8A3F0EB67FE67
  3. Numpy For Data Science - https://www.udemy.com/course/python-numpy-exercises/?referralCode=FF9EDB87794FED46CBDF
