Baselight

Silicon Valley Diversity Data

What’s diversity like for 23 top tech companies?

@kaggle.rtatman_silicon_valley_diversity_data


About this Dataset


Context

There has been much discussion of how the workforce of Silicon Valley tech companies differs from that of the United States as a whole. In particular, considerable evidence suggests that tech workers (who tend to be more highly paid than workers in many other professions) are more likely to be white and male. This dataset lets you investigate the demographics of 23 Silicon Valley tech companies for yourself.

Updates!

NEW June 2018:
The spreadsheet Distributions_data_2016.csv contains workforce distributions by job category and race for 177 of the largest tech companies headquartered in Silicon Valley.

Each figure in the dataset represents the percentage of a job category that is made up of employees with a given race/gender combination. The figures are based on each company's EEO-1 report.
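To make the definition above concrete, here is a minimal sketch of how a share of a job category could be derived from raw head counts. The counts are made up for illustration, the function name category_shares is hypothetical, and the 0-100 scale is an assumption; this is not necessarily how Reveal or the Center for Employment Equity computed the published figures.

```python
from collections import defaultdict

# Hypothetical head counts for one company, keyed by
# (job_category, race, gender). These numbers are invented for illustration.
counts = {
    ("Professionals", "Asian", "male"): 50,
    ("Professionals", "White", "male"): 30,
    ("Professionals", "White", "female"): 20,
    ("Technicians", "Latino", "female"): 5,
    ("Technicians", "White", "male"): 15,
}


def category_shares(counts):
    """Return each group's percentage share of its own job category."""
    # First pass: total head count per job category.
    totals = defaultdict(int)
    for (job, _race, _gender), n in counts.items():
        totals[job] += n
    # Second pass: divide each group's count by its category total.
    return {
        key: 100.0 * n / totals[key[0]]
        for key, n in counts.items()
        if totals[key[0]] > 0
    }


shares = category_shares(counts)
print(shares[("Professionals", "Asian", "male")])   # 50.0
print(shares[("Technicians", "Latino", "female")])  # 25.0
```

Under this definition, the shares within a single job category add up to 100, so figures are comparable across companies of different sizes.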

This dataset was created through a unique collaboration with the Center for Employment Equity and Reveal. The equity center provided Reveal with anonymized data for 177 large companies, and Reveal identified companies that have publicly released their data in this anonymized dataset. The equity center and Reveal analyzed the data independently.

For more information on the data, read our post here.

The spreadsheet Reveal_EEO1_for_2016.csv has been updated to include the 2016 EEO-1 reports from PayPal, NetApp and Sanmina. The race and job categories have been modified to ensure consistency across all the datasets.

NEW April 2018: The spreadsheet Tech_sector_diversity_demographics_2016.csv contains aggregated diversity data for 177 large Silicon Valley tech companies. We calculated averages for the largest race and gender groups across job categories. For information on the aggregated data, read our post here.

This repository also contains EEO-1 reports filed by Silicon Valley tech companies. Please read our complete methodology for details on this data.

The data was compiled by Reveal from The Center for Investigative Reporting.

Contents

This database contains EEO-1 reports filed by Silicon Valley tech companies. It was compiled by Reveal from The Center for Investigative Reporting.

There are six columns in this dataset:

  • company: Company name
  • year: For now, 2016 only
  • race: Possible values: "American_Indian_Alaskan_Native", "Asian", "Black_or_African_American", "Latino", "Native_Hawaiian_or_Pacific_Islander", "Two_or_more_races", "White", "Overall_totals"
  • gender: Possible values: "male", "female". Non-binary gender is not counted in EEO-1 reports.
  • job_category: Possible values: "Administrative support", "Craft workers", "Executive/Senior officials & Mgrs", "First/Mid officials & Mgrs", "laborers and helpers", "operatives", "Professionals", "Sales workers", "Service workers", "Technicians", "Previous_totals", "Totals"
  • count: Mostly integer values, with "na" marking entries for which no data is available.
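Because count mixes integers with the "na" marker, it usually needs converting before any arithmetic. The following is a minimal sketch using Python's standard csv module. The two sample rows, the company name ExampleCo, and the helper parse_count are placeholders introduced for illustration, and mapping "na" to None is one possible choice rather than a documented convention.

```python
import csv
import io

# Two made-up rows in the documented six-column layout; the company name
# and counts are placeholders, not real data.
SAMPLE = """company,year,race,gender,job_category,count
ExampleCo,2016,Asian,female,Professionals,120
ExampleCo,2016,Latino,male,Technicians,na
"""


def parse_count(raw):
    """Convert the count field to int, mapping the "na" marker to None."""
    raw = raw.strip()
    return None if raw.lower() == "na" else int(raw)


rows = []
for record in csv.DictReader(io.StringIO(SAMPLE)):
    record["year"] = int(record["year"])
    record["count"] = parse_count(record["count"])
    rows.append(record)

print([r["count"] for r in rows])  # [120, None]
```

To read the real file, the io.StringIO(SAMPLE) argument could be swapped for an open file handle on Reveal_EEO1_for_2016.csv.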

Acknowledgements:

The EEO-1 database is licensed under the Open Database License (ODbL) by Reveal from The Center for Investigative Reporting.

You are free to copy, distribute, transmit and adapt the spreadsheet, so long as you:

  • credit Reveal (including this link if it’s distributed online);
  • inform Reveal that you are using the data in your work by emailing Sinduja Rangarajan at srangarajan@revealnews.org; and
  • offer any new work under the same license.

Inspiration:

Tables

Distributions Data 2016

@kaggle.rtatman_silicon_valley_diversity_data.distributions_data_2016
  • 29.41 KB
  • 16042 rows
  • 4 columns

CREATE TABLE distributions_data_2016 (
  "company" VARCHAR,
  "percentage" DOUBLE,
  "demographics" VARCHAR,
  "job_category" VARCHAR
);

Reveal Eeo1 For 2016

@kaggle.rtatman_silicon_valley_diversity_data.reveal_eeo1_for_2016
  • 13.06 KB
  • 4500 rows
  • 6 columns

CREATE TABLE reveal_eeo1_for_2016 (
  "company" VARCHAR,
  "year" BIGINT,
  "race" VARCHAR,
  "gender" VARCHAR,
  "job_category" VARCHAR,
  "count" VARCHAR
);
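Since "count" is stored as VARCHAR in this table, aggregate queries need to cast it first. The sketch below runs the schema above in an in-memory SQLite database from Python, as a stand-in for whatever SQL engine hosts the real table (other dialects may need different casting syntax). The inserted rows are invented, and excluding the "Overall_totals", "Totals" and "Previous_totals" rows rests on the assumption that they aggregate the other rows and would otherwise be double counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE reveal_eeo1_for_2016 (
      "company" VARCHAR, "year" BIGINT, "race" VARCHAR,
      "gender" VARCHAR, "job_category" VARCHAR, "count" VARCHAR
    )
""")

# Invented rows for illustration only; ExampleCo is not a real company.
conn.executemany(
    "INSERT INTO reveal_eeo1_for_2016 VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("ExampleCo", 2016, "Asian", "female", "Professionals", "120"),
        ("ExampleCo", 2016, "White", "male", "Professionals", "80"),
        ("ExampleCo", 2016, "Latino", "male", "Technicians", "na"),
        ("ExampleCo", 2016, "Overall_totals", "male", "Totals", "80"),
    ],
)

# NULLIF turns 'na' into NULL, which SUM then ignores.
# Assumption: totals rows are aggregates, so they are filtered out.
query = """
    SELECT "gender", SUM(CAST(NULLIF("count", 'na') AS INTEGER)) AS headcount
    FROM reveal_eeo1_for_2016
    WHERE "race" <> 'Overall_totals'
      AND "job_category" NOT IN ('Totals', 'Previous_totals')
    GROUP BY "gender"
    ORDER BY "gender"
"""
result = conn.execute(query).fetchall()
print(result)  # [('female', 120), ('male', 80)]
```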

Tech Sector Diversity Demographics 2016

@kaggle.rtatman_silicon_valley_diversity_data.tech_sector_diversity_demographics_2016
  • 4.72 KB
  • 44 rows
  • 5 columns

CREATE TABLE tech_sector_diversity_demographics_2016 (
  "job_category" VARCHAR,
  "race_ethnicity" VARCHAR,
  "gender" VARCHAR,
  "count" BIGINT,
  "percentage" DOUBLE
);
