Claude.ai Usage Data
Anthropic Economic Index: Understanding AI’s effects on the economy
@kaggle.yashdogra_anthropic
This document describes the data sources and variables used in the third Anthropic Economic Index (AEI) report.
The core dataset contains Claude AI usage metrics aggregated by geography and analysis dimensions (facets).
Source files:
- aei_raw_claude_ai_2025-08-04_to_2025-08-11.csv (pre-enrichment data in data/intermediate/)
- aei_enriched_claude_ai_2025-08-04_to_2025-08-11.csv (enriched data in data/output/)

Note on data sources: The AEI raw file contains raw counts and percentages. Derived metrics (indices, tiers, per capita calculations, automation/augmentation percentages) are calculated during the enrichment process in aei_report_v3_preprocessing_claude_ai.ipynb.

Each row represents one metric value for a specific geography and facet combination:
Column | Type | Description |
---|---|---|
geo_id | string | Geographic identifier (ISO-2 country code for countries, US state code, or "GLOBAL"; ISO-3 country codes in enriched data) |
geography | string | Geographic level: "country", "state_us", or "global" |
date_start | date | Start of data collection period |
date_end | date | End of data collection period |
platform_and_product | string | "Claude AI (Free and Pro)" |
facet | string | Analysis dimension (see Facets below) |
level | integer | Sub-level within facet (0-2) |
variable | string | Metric name (see Variables below) |
cluster_name | string | Specific entity within facet (task, pattern, etc.). For intersections, format is "base::category" |
value | float | Numeric metric value |
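
Because the file is in long format (one metric value per row), a common first step is to gather the metrics for one facet into a per-geography lookup. The following is a minimal sketch, assuming the CSV header matches the column names in the table above. The sample rows, the facet name "country", and the variable names usage_count and usage_pct are illustrative placeholders, not values taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the documented long format; all values are made up.
SAMPLE_CSV = """geo_id,geography,date_start,date_end,platform_and_product,facet,level,variable,cluster_name,value
US,country,2025-08-04,2025-08-11,Claude AI (Free and Pro),country,0,usage_count,,1000
US,country,2025-08-04,2025-08-11,Claude AI (Free and Pro),country,0,usage_pct,,25.0
GB,country,2025-08-04,2025-08-11,Claude AI (Free and Pro),country,0,usage_count,,200
"""


def pivot_metrics(rows, facet):
    """Return {geo_id: {(variable, cluster_name): value}} for one facet."""
    wide = defaultdict(dict)
    for row in rows:
        if row["facet"] != facet:
            continue
        key = (row["variable"], row["cluster_name"])
        wide[row["geo_id"]][key] = float(row["value"])
    return dict(wide)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
wide = pivot_metrics(rows, facet="country")
print(wide["US"])
```

Keying on the (variable, cluster_name) pair keeps rows apart when a facet reports the same variable for several clusters.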
Variables follow the pattern {prefix}_{suffix} with specific meanings:
- From AEI processing: *_count, *_pct
- From enrichment: *_per_capita, *_per_capita_index, *_pct_index, *_tier, automation_pct, augmentation_pct, soc_pct
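
The naming pattern above can be used to tell raw metrics from enriched ones. The sketch below is one possible way to do that, not the notebook's own logic: the function name classify_variable and the origin labels are hypothetical, and the example names such as usage_per_capita_index are illustrative only.

```python
# Suffixes and exact names listed in the text above.
ENRICHMENT_SUFFIXES = ("_per_capita_index", "_per_capita", "_pct_index", "_tier")
ENRICHMENT_NAMES = {"automation_pct", "augmentation_pct", "soc_pct"}
RAW_SUFFIXES = ("_count", "_pct")


def classify_variable(name):
    """Return (prefix, suffix, origin) for a variable name.

    origin is "enrichment", "aei_processing", or "unknown" (hypothetical labels).
    """
    # These three end in _pct but are documented as enrichment outputs.
    if name in ENRICHMENT_NAMES:
        prefix, _, suffix = name.rpartition("_")
        return prefix, suffix, "enrichment"
    for suffix in ENRICHMENT_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)], suffix[1:], "enrichment"
    for suffix in RAW_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)], suffix[1:], "aei_processing"
    return name, "", "unknown"


for example in ("usage_count", "usage_per_capita_index", "automation_pct"):
    print(example, classify_variable(example))
```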
O*NET Task Metrics:
Request Metrics:
Collaboration Pattern Metrics:
The not_classified and none categories are excluded from index calculations as they are not meaningful.

This dataset contains first-party (1P) API usage metrics along various dimensions, based on a sample of 1P API traffic and analyzed using privacy-preserving methods.
Source file: aei_raw_1p_api_2025-08-04_to_2025-08-11.csv (in data/intermediate/)

Each row represents one metric value for a specific facet combination at global level:
Column | Type | Description |
---|---|---|
geo_id | string | Geographic identifier (always "GLOBAL" for API data) |
geography | string | Geographic level (always "global" for API data) |
date_start | date | Start of data collection period |
date_end | date | End of data collection period |
platform_and_product | string | "1P API" |
facet | string | Analysis dimension (see Facets below) |
level | integer | Sub-level within facet (0-2) |
variable | string | Metric name (see Variables below) |
cluster_name | string | Specific entity within facet. For intersections, format is "base::category" or "base::index"/"base::count" for mean value metrics |
value | float | Numeric metric value |
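
The cluster_name formats described in the table can be separated with a small helper. This is a sketch under two assumptions that the source does not confirm: that the last "::" is the separator (in case a base name itself contains "::"), and that a trailing "index" or "count" always marks a mean value metric. The function names and the example strings are hypothetical.

```python
def split_cluster_name(cluster_name):
    """Split "base::category" into (base, category).

    Names without "::" are returned as (name, None).
    """
    base, sep, tail = cluster_name.rpartition("::")
    if not sep:
        return cluster_name, None
    return base, tail


def is_mean_value_metric(cluster_name):
    """True when the name has the "base::index" or "base::count" form."""
    _, tail = split_cluster_name(cluster_name)
    return tail in ("index", "count")


for name in ("some task", "some task::some pattern", "some task::index"):
    print(name, split_cluster_name(name), is_mean_value_metric(name))
```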
O*NET Task Metrics:
Mean Value Intersection Metrics (unique to API data):
Request Metrics:
We enrich Claude usage data with external economic and demographic sources.

ISO 3166 Country Codes
International standard codes for representing countries and territories, used for mapping IP-based geolocation data to standardized country identifiers.
- geonames_countryInfo.txt (raw GeoNames data in data/input/)
- iso_country_codes.csv (processed country codes with some changes in data/intermediate/)

- iso_alpha_2: Two-letter country code (e.g., "US", "GB", "FR")
- iso_alpha_3: Three-letter country code (e.g., "USA", "GBR", "FRA")
- country_name: Country name from GeoNames

State FIPS Codes and USPS Abbreviations
Official state and territory codes including FIPS codes and two-letter USPS abbreviations for all U.S. states, territories, and the District of Columbia.
- census_state_codes.txt (raw pipe-delimited text file in data/input/)

State Characteristics Estimates - Age and Sex - Civilian Population
Annual estimates of the civilian population by single year of age, sex, race, and Hispanic origin for states and the District of Columbia.
- sc-est2024-agesex-civ.csv (raw Census data in data/input/)
- working_age_pop_2024_us_state.csv (processed data summed for ages 15-64 by state in data/intermediate/)

Population ages 15-64, total
Total population between the ages 15 to 64. Population is based on the de facto definition of population, which counts all residents regardless of legal status or citizenship.
- working_age_pop_2024_country_raw.csv (raw World Bank data in data/input/)
- working_age_pop_2024_country.csv (processed country-level data including Taiwan in data/intermediate/)

Population by single age
Population projections by single year of age for Taiwan (Republic of China). This data supplements the World Bank country data which excludes Taiwan.
- df_taiwan (raw data), added to df_working_age_pop_country
- Population by single age _20250802235608.csv (raw data in data/input/, pre-filtered to ages 15-64)
- working_age_pop_2024_country.csv (processed country-level data in data/intermediate/)

Gross Domestic Product, Current Prices (Billions of U.S. Dollars)
Total gross domestic product at current market prices for all countries and territories.
- imf_gdp_raw_2024.json (raw API response in data/input/)
- gdp_2024_country.csv (processed country GDP data in data/intermediate/)

SASUMMARY State Annual Summary Statistics: Personal Income, GDP, Consumer Spending, Price Indexes, and Employment
Gross domestic product by state in millions of current U.S. dollars.
- bea_us_state_gdp_2024.csv (raw data in data/input/, manually downloaded from BEA)
- gdp_2024_us_state.csv (processed state GDP data in data/intermediate/)

O*NET Task Statements Dataset
Comprehensive database of task statements associated with occupations in the O*NET-SOC taxonomy, providing detailed work activities for each occupation.
- onet_task_statements_raw.xlsx (raw Excel file in data/input/)
- onet_task_statements.csv (processed data with soc_major_group in data/intermediate/)

- O*NET-SOC Code: Full occupation code (e.g., "11-1011.00")
- Title: Occupation title
- Task ID: Unique task identifier
- Task: Description of work task
- Task Type: Core or Supplemental
- soc_major_group: First 2 digits of SOC code (e.g., "11" for Management)

Standard Occupational Classification (SOC) Structure
Hierarchical classification system for occupations, providing standardized occupation titles and codes.
- df_soc (SOC structure dataframe)
- soc_structure_raw.csv (raw data in data/input/)
- soc_structure.csv (processed SOC structure in data/intermediate/)

- Major Group: SOC major group code (e.g., "11-0000")
- Minor Group: SOC minor group code
- Broad Occupation: Broad occupation code
- Detailed Occupation: Detailed occupation code
- soc_major_group: 2-digit major group code (e.g., "11")
- SOC or O*NET-SOC 2019 Title: Occupation group title

Core questions, National.
- BTOS_National.xlsx
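
The soc_major_group column described in the O*NET and SOC sections above is the first two digits of the occupation code. A minimal sketch of that derivation follows. The function name and the validation step are assumptions for illustration, and the preprocessing notebook may compute the column differently.

```python
def soc_major_group(code):
    """Return the first two digits of a SOC or O*NET-SOC code.

    Works for both "11-1011.00" (O*NET-SOC) and "11-0000" (SOC major group).
    """
    code = code.strip()
    major = code[:2]
    # Guard against malformed codes; this check is an added assumption.
    if len(major) < 2 or not major.isdigit():
        raise ValueError(f"unexpected SOC code: {code!r}")
    return major


print(soc_major_group("11-1011.00"))
print(soc_major_group("11-0000"))
```

Because both code formats reduce to the same two-digit key, the derived column can serve as a join key between the task statements and the SOC structure.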
CREATE TABLE aei_raw_1p_api_2025_08_04_to_2025_08_11 (
"geo_id" VARCHAR,
"geography" VARCHAR,
"date_start" TIMESTAMP,
"date_end" TIMESTAMP,
"platform_and_product" VARCHAR,
"facet" VARCHAR,
"level" BIGINT,
"variable" VARCHAR,
"cluster_name" VARCHAR,
"value" DOUBLE
);
CREATE TABLE aei_raw_claude_ai_2025_08_04_to_2025_08_11 (
"geo_id" VARCHAR,
"geography" VARCHAR,
"date_start" TIMESTAMP,
"date_end" TIMESTAMP,
"platform_and_product" VARCHAR,
"facet" VARCHAR,
"level" BIGINT,
"variable" VARCHAR,
"cluster_name" VARCHAR,
"value" DOUBLE
);
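
The schema above can be exercised locally to check a query before running it against the real tables. The sketch below uses Python's built-in sqlite3 module as a stand-in engine, which is an assumption: the source does not say which database hosts these tables. The inserted rows, the facet name "country", and the variable name usage_count are made-up placeholders.

```python
import sqlite3

# Same column layout as the schema above; SQLite accepts these type names.
DDL = """
CREATE TABLE aei_raw_claude_ai_2025_08_04_to_2025_08_11 (
    "geo_id" VARCHAR,
    "geography" VARCHAR,
    "date_start" TIMESTAMP,
    "date_end" TIMESTAMP,
    "platform_and_product" VARCHAR,
    "facet" VARCHAR,
    "level" BIGINT,
    "variable" VARCHAR,
    "cluster_name" VARCHAR,
    "value" DOUBLE
);
"""

# Hypothetical rows for illustration only.
SAMPLE_ROWS = [
    ("GB", "country", "2025-08-04", "2025-08-11", "Claude AI (Free and Pro)",
     "country", 0, "usage_count", "", 200.0),
    ("US", "country", "2025-08-04", "2025-08-11", "Claude AI (Free and Pro)",
     "country", 0, "usage_count", "", 1000.0),
    ("US", "country", "2025-08-04", "2025-08-11", "Claude AI (Free and Pro)",
     "country", 0, "usage_pct", "", 25.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(DDL)
conn.executemany(
    "INSERT INTO aei_raw_claude_ai_2025_08_04_to_2025_08_11 "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    SAMPLE_ROWS,
)

# One metric for one facet, largest value first.
result = conn.execute(
    "SELECT geo_id, value "
    "FROM aei_raw_claude_ai_2025_08_04_to_2025_08_11 "
    "WHERE facet = ? AND variable = ? "
    "ORDER BY value DESC",
    ("country", "usage_count"),
).fetchall()
print(result)
conn.close()
```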