Simulated Employee Data for the IT Industry
Dataset Description
The dataset, employee_data.csv, contains simulated records for 400 employees working in various IT-related positions. Each record gives the employee's gender, years of experience, position, and salary. The data is meant to reflect realistic distributions within the IT industry, in particular how salaries tend to rise with experience and vary by job role. It was generated with the Faker library in Python, which creates realistic fake data.
The columns are:
- ID: A unique identifier for each employee (1 to 400).
- Gender: The gender of the employee. The values are either 'M' (Male) or 'F' (Female).
- Experience (Years): The number of years of professional experience the employee has, ranging from 0 to 20 years.
- Position: The job title of the employee. The positions included in the dataset are:
  - IT Manager
  - Software Engineer
  - Network Administrator
  - Systems Administrator
  - Database Administrator (DBA)
  - Web Developer
  - IT Support Specialist
  - Systems Analyst
  - IT Security Analyst
  - DevOps Engineer
  - Cloud Solutions Architect
- Salary: The employee's annual salary in USD. Salaries are generated to reflect realistic compensation within the IT industry: they vary by position and increase with years of experience.
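To make the salary rule concrete, here is a minimal sketch of how rows with these columns could be simulated. It is not the script that produced employee_data.csv: it uses Python's standard `random` module as a stand-in for Faker, covers only five of the positions for brevity, and the base salaries, the per-year raise, and the noise range are all hypothetical values chosen for illustration.

```python
import random

# Hypothetical base salaries (USD) per position; the actual figures used to
# generate employee_data.csv are not published in this description.
BASE_SALARY = {
    "IT Support Specialist": 45_000,
    "Web Developer": 55_000,
    "Software Engineer": 70_000,
    "IT Manager": 95_000,
    "Cloud Solutions Architect": 100_000,
}
RAISE_PER_YEAR = 3_000  # assumed flat raise per year of experience


def simulate_employee(employee_id, rng):
    """Return one simulated row keyed by the column names listed above."""
    position = rng.choice(list(BASE_SALARY))
    experience = rng.randint(0, 20)  # inclusive on both ends
    noise = rng.randint(-5, 5) * 1_000  # assumed jitter of up to 5,000 USD
    salary = BASE_SALARY[position] + RAISE_PER_YEAR * experience + noise
    return {
        "ID": employee_id,
        "Gender": rng.choice(["M", "F"]),
        "Experience (Years)": experience,
        "Position": position,
        "Salary": salary,
    }


rng = random.Random(42)  # fixed seed so the sketch is reproducible
employees = [simulate_employee(i, rng) for i in range(1, 401)]
```

Under these assumptions, salary depends on the position through the base figure and grows linearly with experience, which is one simple way to obtain the pattern described for the Salary column.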
Sample Data:
| ID | Gender | Experience (Years) | Position | Salary |
|---|---|---|---|---|
| 1 | M | 5 | Software Engineer | 84,000 |
| 2 | F | 10 | IT Manager | 135,000 |
| 3 | M | 7 | Network Administrator | 85,000 |
| 4 | F | 15 | Cloud Solutions Architect | 147,000 |
| 5 | M | 2 | Web Developer | 60,000 |
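The sample rows above can be read into typed records with the standard `csv` module, as in the sketch below. The header names follow the table, but the exact layout of employee_data.csv is an assumption here, in particular whether salaries are stored with thousands separators (as displayed) or as plain integers. The parser strips commas so that either form works.

```python
import csv
import io

# The five sample rows written as CSV. Quoting the comma-formatted salary is
# an assumption about how such a value would be stored in the file.
SAMPLE_CSV = """ID,Gender,Experience (Years),Position,Salary
1,M,5,Software Engineer,"84,000"
2,F,10,IT Manager,"135,000"
3,M,7,Network Administrator,"85,000"
4,F,15,Cloud Solutions Architect,"147,000"
5,M,2,Web Developer,"60,000"
"""


def parse_row(row):
    """Convert one csv.DictReader row of strings into typed values."""
    return {
        "ID": int(row["ID"]),
        "Gender": row["Gender"],
        "Experience (Years)": int(row["Experience (Years)"]),
        "Position": row["Position"],
        # Remove thousands separators before converting to an integer.
        "Salary": int(row["Salary"].replace(",", "")),
    }


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

To read the real file instead, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, for example `open("employee_data.csv", newline="")`.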
Applications
The dataset can be used for various purposes, including:
- Data Analysis: Analyzing salary trends based on position and experience.
- Machine Learning: Training models for salary prediction.
- Human Resources: Understanding compensation structures in the IT industry.
- Education: Teaching material for data science and data analysis courses.
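As an illustration of the data analysis and salary prediction uses listed above, the following sketch fits a least-squares line of salary against years of experience and averages salary by position. It runs on the five sample rows only, which mix different positions, so the fitted numbers are illustrative and say nothing reliable about the full dataset. The function names are hypothetical helpers, not part of any library.

```python
from collections import defaultdict
from statistics import fmean

# (experience in years, position, salary in USD) from the sample table.
sample = [
    (5, "Software Engineer", 84_000),
    (10, "IT Manager", 135_000),
    (7, "Network Administrator", 85_000),
    (15, "Cloud Solutions Architect", 147_000),
    (2, "Web Developer", 60_000),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mean_salary_by_position(records):
    """Group salaries by position and return the mean of each group."""
    groups = defaultdict(list)
    for _, position, salary in records:
        groups[position].append(salary)
    return {position: fmean(values) for position, values in groups.items()}


slope, intercept = fit_line([r[0] for r in sample], [r[2] for r in sample])
by_position = mean_salary_by_position(sample)


def predict_salary(experience_years):
    """Predict salary from experience alone, ignoring position."""
    return slope * experience_years + intercept
```

A model meant for real use would also include position as a feature, since the dataset description states that salary depends on both position and experience.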
Related Datasets
- Employee Attrition Data, @kaggle
- AI Index Report (2022), @owid