Human Resource (HR) Data of a Multi-national Corporation (MNC)
This dataset contains HR information for employees of a multinational corporation (MNC). It includes 2 million (20 lakh) employee records with details about personal identifiers, job-related attributes, performance, employment status, and salary information.
The dataset can be used for HR analytics, including workforce distribution, attrition analysis, salary trends, and performance evaluation.
The data is available as a CSV file. We are going to analyse this dataset using pandas.
This analysis will be helpful for those working in the HR domain.
Using this dataset, we answered the following questions with Python in our project.
Q.1) What is the distribution of employee status (Active, Resigned, Retired, Terminated)?
Q.2) What is the distribution of work modes (On-site, Remote)?
Q.3) How many employees are there in each department?
Q.4) What is the average salary by department?
Q.5) Which job title has the highest average salary?
Q.6) What is the average salary in each department, broken down by job title?
Q.7) How many employees resigned or were terminated in each department?
Q.8) How does salary vary with years of experience?
Q.9) What is the average performance rating by department?
Q.10) Which country has the highest concentration of employees?
Q.11) Is there a correlation between performance rating and salary?
Q.12) How has the number of hires changed over time (per year)?
Q.13) Compare salaries of Remote vs. On-site employees: is there a significant difference?
Q.14) Find the top 10 employees with the highest salary in each department.
Q.15) Identify departments with the highest attrition rate (Resigned %).
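As a rough illustration of how a few of these questions (Q.1, Q.4, Q.12, Q.14 and Q.15) might be approached with pandas, here is a minimal sketch. The six rows are made up for the example and are not taken from the dataset; only the column names follow the dataset's columns. It is one possible approach, not necessarily the one used in the project.

```python
import pandas as pd

# Made-up rows for illustration only; the real dataset has about 2 million rows.
df = pd.DataFrame({
    "Department": ["IT", "IT", "IT", "HR", "HR", "HR"],
    "Status": ["Active", "Resigned", "Active", "Resigned", "Resigned", "Active"],
    "Hire_Date": pd.to_datetime([
        "2015-01-10", "2015-06-01", "2018-03-05",
        "2018-09-09", "2020-02-02", "2020-11-11",
    ]),
    "Salary_INR": [900000, 800000, 1000000, 600000, 500000, 700000],
})

# Q.1: share of employees in each employment status
status_share = df["Status"].value_counts(normalize=True)

# Q.4: average salary by department
avg_salary = df.groupby("Department")["Salary_INR"].mean()

# Q.12: number of hires per calendar year
hires_per_year = df["Hire_Date"].dt.year.value_counts().sort_index()

# Q.14: highest-paid employees per department
# (top 2 here because the sample is tiny; use head(10) for the real question)
top_paid = (
    df.sort_values("Salary_INR", ascending=False)
      .groupby("Department")
      .head(2)
)

# Q.15: attrition rate per department, as the percentage of rows marked "Resigned"
attrition_pct = (
    df["Status"].eq("Resigned")
      .groupby(df["Department"])
      .mean()
      .mul(100)
      .sort_values(ascending=False)
)

print(status_share, avg_salary, hires_per_year, top_paid, attrition_pct, sep="\n\n")
```

The remaining questions follow the same pattern: `value_counts()` for distributions, `groupby(...).mean()` for averages, and a boolean mask averaged per group for rates.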
Enrol in our Udemy courses:
- Python Data Analytics Projects - https://www.udemy.com/course/bigdata-analysis-python/?referralCode=F75B5F25D61BD4E5F161
- Python For Data Science - https://www.udemy.com/course/python-for-data-science-real-time-exercises/?referralCode=9C91F0B8A3F0EB67FE67
- Numpy For Data Science - https://www.udemy.com/course/python-numpy-exercises/?referralCode=FF9EDB87794FED46CBDF
These are the main features (columns) available in the dataset:
- Unnamed: 0 – Index column (auto-generated, not useful for analysis, will be deleted).
- Employee_ID – Unique identifier assigned to each employee (e.g., EMP0000001).
- Full_Name – Full name of the employee.
- Department – Department in which the employee works (e.g., IT, HR, Marketing, Operations).
- Job_Title – Designation or role of the employee (e.g., Software Engineer, HR Manager).
- Hire_Date – The date when the employee was hired by the company.
- Location – Geographical location of the employee (city, country).
- Performance_Rating – Performance evaluation score (numeric scale, higher is better).
- Experience_Years – Number of years of professional experience the employee has.
- Status – Current employment status (e.g., Active, Resigned).
- Work_Mode – Mode of working (e.g., On-site, Hybrid, Remote).
- Salary_INR – Annual salary of the employee in Indian Rupees.
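A minimal sketch of loading and tidying data with these columns could look like the following. To keep it self-contained, it reads a two-row, made-up sample from an in-memory buffer; with the real file, pass the CSV path to `pd.read_csv` instead. The exact date format and the "city, country" layout of `Location` are assumptions based on the column descriptions, and the derived `Country` column is a hypothetical helper, not part of the dataset.

```python
import io
import pandas as pd

# Made-up sample with the same column layout as described above.
# The empty first header cell is what pandas names "Unnamed: 0".
sample_csv = io.StringIO(
    """,Employee_ID,Full_Name,Department,Job_Title,Hire_Date,Location,Performance_Rating,Experience_Years,Status,Work_Mode,Salary_INR
0,EMP0000001,Person A,IT,Software Engineer,2015-03-01,"Pune, India",4,9,Active,On-site,900000
1,EMP0000002,Person B,HR,HR Manager,2019-07-15,"Austin, USA",3,5,Resigned,Remote,700000
"""
)

# Parse Hire_Date as a datetime while reading (assumes an ISO-like date format).
df = pd.read_csv(sample_csv, parse_dates=["Hire_Date"])

# Drop the auto-generated index column, which carries no information.
df = df.drop(columns=["Unnamed: 0"])

# Hypothetical helper column: take the text after the last comma in Location
# as the country (assumes values look like "city, country").
df["Country"] = df["Location"].str.rsplit(",", n=1).str[-1].str.strip()

print(df.dtypes)
print(df[["Employee_ID", "Location", "Country"]])
```

Splitting on the last comma rather than the first keeps the country intact even if a city name itself contains a comma.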