Employee Data Simulation: IT Industry

Simulated Employee Data for the IT Industry

@kaggle.abhayayare_employee_data_simulation_it_industry

About this Dataset

The dataset, employee_data.csv, contains simulated records for 400 employees working in various IT-related positions. Each record gives the employee's ID, gender, years of experience, position, and salary. The data is meant to reflect realistic distributions and variations within the IT industry, in particular the way salaries tend to rise with experience and differ by job role. It was generated with the Faker library for Python, which creates realistic fake data for a range of applications.

  1. ID: A unique identifier for each employee (1 to 400).
  2. Gender: The gender of the employee. The values are either 'M' (Male) or 'F' (Female).
  3. Experience (Years): The number of years of professional experience the employee has, ranging from 0 to 20 years.
  4. Position: The job title of the employee. The positions included in the dataset are:
     • IT Manager
     • Software Engineer
     • Network Administrator
     • Systems Administrator
     • Database Administrator (DBA)
     • Web Developer
     • IT Support Specialist
     • Systems Analyst
     • IT Security Analyst
     • DevOps Engineer
     • Cloud Solutions Architect
  5. Salary: The annual salary of the employee in USD. The salary is generated to reflect realistic compensation within the IT industry and increases with both the position and years of experience.
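The column rules above can be illustrated with a small generator. This is only a sketch, not the script that produced employee_data.csv: the original used Faker, while this version uses Python's standard `random` module, and the base salaries, the per-year raise, and the noise range (`BASE_SALARY`, `RAISE_PER_YEAR`, `NOISE`) are invented for illustration.

```python
import random

# Hypothetical base salaries in USD per position. The values used by the
# real generator are not documented; these are placeholders.
BASE_SALARY = {
    "IT Manager": 100_000,
    "Software Engineer": 70_000,
    "Network Administrator": 60_000,
    "Systems Administrator": 60_000,
    "Database Administrator (DBA)": 70_000,
    "Web Developer": 55_000,
    "IT Support Specialist": 45_000,
    "Systems Analyst": 65_000,
    "IT Security Analyst": 75_000,
    "DevOps Engineer": 80_000,
    "Cloud Solutions Architect": 95_000,
}
RAISE_PER_YEAR = 3_000  # assumed flat raise per year of experience
NOISE = 5_000           # assumed maximum random deviation, in either direction


def simulate_employees(n=400, seed=0):
    """Return n simulated employee records with the five documented columns."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    positions = list(BASE_SALARY)
    rows = []
    for emp_id in range(1, n + 1):
        position = rng.choice(positions)
        experience = rng.randint(0, 20)  # inclusive on both ends
        salary = (
            BASE_SALARY[position]
            + RAISE_PER_YEAR * experience
            + rng.randint(-NOISE, NOISE)
        )
        rows.append({
            "ID": emp_id,
            "Gender": rng.choice(["M", "F"]),
            "Experience (Years)": experience,
            "Position": position,
            "Salary": salary,
        })
    return rows


employees = simulate_employees()
```

Salary here is the position's base plus a fixed raise per year plus bounded noise, which is one simple way to make pay rise with both role and experience.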

Sample Data:

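| ID | Gender | Experience (Years) | Position                  | Salary  |
|----|--------|--------------------|---------------------------|---------|
| 1  | M      | 5                  | Software Engineer         | 84,000  |
| 2  | F      | 10                 | IT Manager                | 135,000 |
| 3  | M      | 7                  | Network Administrator     | 85,000  |
| 4  | F      | 15                 | Cloud Solutions Architect | 147,000 |
| 5  | M      | 2                  | Web Developer             | 60,000  |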
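As a rough sketch of how rows like these might be read into Python, the example below parses the five sample rows with the standard `csv` module. The header names are taken from the sample, but the exact layout of employee_data.csv is an assumption: the real file may store salaries as plain integers rather than with thousands separators. `load_employees` is a hypothetical helper name.

```python
import csv
import io

# The five sample rows written as CSV text. Quoting the salaries is an
# assumption; the real file may store them without separators.
SAMPLE_CSV = '''ID,Gender,Experience (Years),Position,Salary
1,M,5,Software Engineer,"84,000"
2,F,10,IT Manager,"135,000"
3,M,7,Network Administrator,"85,000"
4,F,15,Cloud Solutions Architect,"147,000"
5,M,2,Web Developer,"60,000"
'''


def load_employees(text_stream):
    """Parse employee rows and convert the numeric columns to int."""
    records = []
    for row in csv.DictReader(text_stream):
        records.append({
            "ID": int(row["ID"]),
            "Gender": row["Gender"],
            "Experience (Years)": int(row["Experience (Years)"]),
            "Position": row["Position"],
            # Drop thousands separators such as "84,000" before converting.
            "Salary": int(row["Salary"].replace(",", "")),
        })
    return records


sample = load_employees(io.StringIO(SAMPLE_CSV))
```

To read the actual file, the same function could be called on `open("employee_data.csv", newline="")`, provided the headers match the ones assumed here.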

Applications

The dataset can be used for various purposes, including:

  • Data Analysis: Analyzing salary trends based on position and experience.
  • Machine Learning: Training models for salary prediction.
  • Human Resources: Understanding compensation structures in the IT industry.
  • Education: Teaching data science and data analysis courses.
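To make the salary-trend and salary-prediction uses concrete, here is a minimal sketch that fits a straight line of salary against years of experience by ordinary least squares. It uses only the five sample rows shown earlier, ignores position entirely, and is far too small to support real conclusions. `fit_line` and `predict_salary` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Experience (years) and salary (USD) from the five sample rows.
experience = [5, 10, 7, 15, 2]
salary = [84_000, 135_000, 85_000, 147_000, 60_000]

slope, intercept = fit_line(experience, salary)


def predict_salary(years):
    """Predicted salary in USD for a given number of years of experience."""
    return intercept + slope * years


print(f"Estimated raise per year of experience: {slope:,.0f} USD")
print(f"Predicted salary at 8 years: {predict_salary(8):,.0f} USD")
```

On the full 400-row file, a model that also includes position (for example, one fitted line per job title) would likely describe the data better, since salary depends on both.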