Baselight

Salary Prediction

Tech job positions and salaries from glassdoor.com

@kaggle.thedevastator_jobs_dataset_from_glassdoor

About this Dataset

Jobs Dataset from Glassdoor

This dataset contains job postings from Glassdoor.com from 2017. It can be used to analyze trends by job position, company size, and other features.

How to use the dataset

The job postings can be used to analyze salaries by company size and other attributes, such as location, industry, and seniority.

Research Ideas

  • Identify which factors most affect data science salaries
  • Determine which states and cities offer the highest paying data science jobs
  • Predict what a data science job posting will pay based on the job description
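
As a rough starting point for the second idea, the sketch below ranks states by mean `avg_salary`. It uses Python's standard `sqlite3` module and a handful of made-up placeholder rows rather than the real data. Only the table name `eda_data` and the column names `job_state` and `avg_salary` come from the schemas on this page; the helper name `top_states` is hypothetical.

```python
import sqlite3

def top_states(conn, limit=3):
    """Return (job_state, mean avg_salary, posting count), highest mean first."""
    query = """
        SELECT job_state, AVG(avg_salary) AS mean_salary, COUNT(*) AS postings
        FROM eda_data
        GROUP BY job_state
        ORDER BY mean_salary DESC
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()

# Placeholder rows for illustration only; these are NOT values from the dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eda_data (job_state TEXT, avg_salary REAL)")
conn.executemany(
    "INSERT INTO eda_data VALUES (?, ?)",
    [("CA", 120.0), ("CA", 100.0), ("NY", 90.0), ("TX", 80.0), ("TX", 70.0)],
)

for state, mean_salary, postings in top_states(conn):
    print(f"{state}: {mean_salary:.1f} ({postings} postings)")
```

Returning the posting count next to the mean makes it easier to spot states whose ranking rests on only one or two postings.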

Acknowledgements

This dataset was scraped from Glassdoor.com by Ramiro Gomez.

License

> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: eda_data.csv

Column name Description
job_id The unique identifier for the job posting (Numeric)
job_state The state where the job is located (String)
same_state A binary (0/1) indicator of whether the job is located in the same state as the company headquarters (Numeric)
age The age of the company in years, based on the year it was founded (Numeric)
python_yn A binary (0/1) indicator of whether the job description mentions Python (Numeric)
R_yn A binary (0/1) indicator of whether the job description mentions R (Numeric)
spark A binary (0/1) indicator of whether the job description mentions Spark (Numeric)
aws A binary (0/1) indicator of whether the job description mentions AWS (Numeric)
excel A binary (0/1) indicator of whether the job description mentions Excel (Numeric)
job_simp A simplified job title (String)
seniority The seniority of the job (String)
desc_len The length of the job description (Numeric)
num_comp The number of competitors listed for the company (Numeric)

File: glassdoor_jobs.csv

Column name Description
job_id The unique identifier for the job posting (Numeric)

File: salary_data_cleaned.csv

Column name Description
job_state The state where the job is located (String)
same_state A binary (0/1) indicator of whether the job is located in the same state as the company headquarters (Numeric)
age The age of the company in years, based on the year it was founded (Numeric)
python_yn A binary (0/1) indicator of whether the job description mentions Python (Numeric)
R_yn A binary (0/1) indicator of whether the job description mentions R (Numeric)
spark A binary (0/1) indicator of whether the job description mentions Spark (Numeric)
aws A binary (0/1) indicator of whether the job description mentions AWS (Numeric)
excel A binary (0/1) indicator of whether the job description mentions Excel (Numeric)

Tables

Eda Data

@kaggle.thedevastator_jobs_dataset_from_glassdoor.eda_data
  • 1023.3 KB
  • 742 rows
  • 33 columns

CREATE TABLE eda_data (
  "unnamed_0" BIGINT,
  "job_title" VARCHAR,
  "salary_estimate" VARCHAR,
  "job_description" VARCHAR,
  "rating" DOUBLE,
  "company_name" VARCHAR,
  "location" VARCHAR,
  "headquarters" VARCHAR,
  "size" VARCHAR,
  "founded" BIGINT,
  "type_of_ownership" VARCHAR,
  "industry" VARCHAR,
  "sector" VARCHAR,
  "revenue" VARCHAR,
  "competitors" VARCHAR,
  "hourly" BIGINT,
  "employer_provided" BIGINT,
  "min_salary" BIGINT,
  "max_salary" BIGINT,
  "avg_salary" DOUBLE,
  "company_txt" VARCHAR,
  "job_state" VARCHAR,
  "same_state" BIGINT,
  "age" BIGINT,
  "python_yn" BIGINT,
  "r_yn" BIGINT,
  "spark" BIGINT,
  "aws" BIGINT,
  "excel" BIGINT,
  "job_simp" VARCHAR,
  "seniority" VARCHAR,
  "desc_len" BIGINT,
  "num_comp" BIGINT
);
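
The 0/1 skill columns in this table lend themselves to simple group comparisons. The sketch below computes the mean `avg_salary` for each value of one flag, using Python's standard `sqlite3` module. The rows are made-up placeholders, not dataset values, and the helper name `salary_by_flag` is hypothetical; only the table and column names are taken from the schema above.

```python
import sqlite3

# Skill flag columns as named in the eda_data schema above.
SKILL_FLAGS = ("python_yn", "r_yn", "spark", "aws", "excel")

def salary_by_flag(conn, flag):
    """Return {flag value: mean avg_salary} for one 0/1 skill column."""
    # Column names cannot be bound as SQL parameters, so check against a whitelist.
    if flag not in SKILL_FLAGS:
        raise ValueError(f"unknown flag column: {flag}")
    rows = conn.execute(
        f'SELECT "{flag}", AVG(avg_salary) FROM eda_data GROUP BY "{flag}"'
    ).fetchall()
    return dict(rows)

# Placeholder rows for illustration only; these are NOT values from the dataset.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eda_data (avg_salary REAL, python_yn INTEGER, r_yn INTEGER,"
    " spark INTEGER, aws INTEGER, excel INTEGER)"
)
conn.executemany(
    "INSERT INTO eda_data VALUES (?, ?, ?, ?, ?, ?)",
    [
        (120.0, 1, 0, 1, 1, 0),
        (100.0, 1, 0, 0, 0, 1),
        (60.0, 0, 0, 0, 0, 1),
        (80.0, 0, 1, 0, 0, 1),
    ],
)

for flag in SKILL_FLAGS:
    print(flag, salary_by_flag(conn, flag))
```

A gap between the two group means is only a descriptive comparison; it does not by itself show that a skill causes higher pay, since job title, seniority, and location may differ between the groups.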

Glassdoor Jobs

@kaggle.thedevastator_jobs_dataset_from_glassdoor.glassdoor_jobs
  • 1.21 MB
  • 956 rows
  • 15 columns

CREATE TABLE glassdoor_jobs (
  "unnamed_0" BIGINT,
  "job_title" VARCHAR,
  "salary_estimate" VARCHAR,
  "job_description" VARCHAR,
  "rating" DOUBLE,
  "company_name" VARCHAR,
  "location" VARCHAR,
  "headquarters" VARCHAR,
  "size" VARCHAR,
  "founded" BIGINT,
  "type_of_ownership" VARCHAR,
  "industry" VARCHAR,
  "sector" VARCHAR,
  "revenue" VARCHAR,
  "competitors" VARCHAR
);
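
This raw table stores `salary_estimate` as text, while the other two tables carry numeric `min_salary`, `max_salary`, and `avg_salary` columns. The page does not document how those were derived, so the following is only a sketch under two assumptions: that the text contains a range shaped like `$53K-$91K`, and that the average is the midpoint of that range. The function name `parse_salary_estimate` is hypothetical.

```python
import re

# Assumed format: a dollar range in thousands, e.g. "$53K-$91K (Glassdoor est.)".
_RANGE = re.compile(r"\$(\d+)K\s*-\s*\$(\d+)K")

def parse_salary_estimate(text):
    """Return (min, max, midpoint) in thousands, or None if no range is found."""
    match = _RANGE.search(text)
    if match is None:
        return None
    low, high = int(match.group(1)), int(match.group(2))
    # Midpoint as the average is an assumption, not documented on this page.
    return low, high, (low + high) / 2

print(parse_salary_estimate("$53K-$91K (Glassdoor est.)"))
```

Returning `None` for strings without a recognizable range leaves it to the caller to decide whether to drop or flag those rows.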

Salary Data Cleaned

@kaggle.thedevastator_jobs_dataset_from_glassdoor.salary_data_cleaned
  • 1012.98 KB
  • 742 rows
  • 28 columns

CREATE TABLE salary_data_cleaned (
  "job_title" VARCHAR,
  "salary_estimate" VARCHAR,
  "job_description" VARCHAR,
  "rating" DOUBLE,
  "company_name" VARCHAR,
  "location" VARCHAR,
  "headquarters" VARCHAR,
  "size" VARCHAR,
  "founded" BIGINT,
  "type_of_ownership" VARCHAR,
  "industry" VARCHAR,
  "sector" VARCHAR,
  "revenue" VARCHAR,
  "competitors" VARCHAR,
  "hourly" BIGINT,
  "employer_provided" BIGINT,
  "min_salary" BIGINT,
  "max_salary" BIGINT,
  "avg_salary" DOUBLE,
  "company_txt" VARCHAR,
  "job_state" VARCHAR,
  "same_state" BIGINT,
  "age" BIGINT,
  "python_yn" BIGINT,
  "r_yn" BIGINT,
  "spark" BIGINT,
  "aws" BIGINT,
  "excel" BIGINT
);
