
Starbucks Recommendation Engine Data

Kaggle

@kaggle.shiratoriseto_starbucks_recommendation_engine


Menu, macro indicators, weather & 100K synthetic transactions

Dataset Description

What's inside

Four analysis-ready files for building a Starbucks personalized recommendation engine:

  1. starbucks_menu.csv — 242 Starbucks US menu items with nutrition and pricing
  2. fred_macro.csv — CPI, average hourly earnings, and real wage index (monthly, 2021-2026)
  3. weather_daily.csv — Daily temperatures for 5 US cities (2024-2026)
  4. synthetic_transactions.csv — 100,000 synthetic purchase transactions constrained by real data
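A minimal sketch of loading the four files with the Python standard library. The file names come from the list above; the `load_tables` helper and the assumption that all four files sit in one directory are illustrative, and no column names are assumed.

```python
import csv
from pathlib import Path

# File names are taken from the list above; the flat directory layout is an assumption.
FILES = [
    "starbucks_menu.csv",
    "fred_macro.csv",
    "weather_daily.csv",
    "synthetic_transactions.csv",
]


def load_tables(data_dir):
    """Read each CSV into a list of dict rows, keyed by file name without extension."""
    tables = {}
    for name in FILES:
        path = Path(data_dir) / name
        with path.open(newline="", encoding="utf-8") as handle:
            tables[Path(name).stem] = list(csv.DictReader(handle))
    return tables
```

Calling `load_tables("path/to/dataset")` returns a dictionary such as `tables["weather_daily"]`, where every value is still a string; type conversion is left to the caller because the column schema is not stated here.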

Synthetic Data Transparency

The transaction data is synthetic — generated from probability distributions anchored to real-world constraints:

Component               | Source                         | Status
Menu items & nutrition  | Starbucks published data       | Real
Prices                  | Starbucks menu + estimation    | Semi-real
CPI & wages             | FRED (Federal Reserve)         | Real
Weather                 | Open-Meteo historical API      | Real
Purchase patterns       | Industry reports + assumptions | Synthetic
Individual transactions | Generated from distributions   | Synthetic

Every assumption is documented in the generation notebook (Notebook 3), which contains the full generation pipeline. Notebook 6 covers validation.
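To illustrate what "generated from probability distributions anchored to real-world constraints" can look like, here is a small sketch that samples purchases from a categorical distribution whose weights shift with the day's temperature. It is not the pipeline from Notebook 3: the item names, base shares, the 25 °C threshold, and the 1.5x boost are all hypothetical values chosen for the example.

```python
import random

# Hypothetical inputs: item names, base shares, and the temperature rule
# are illustrative assumptions, not values from the dataset or notebooks.
BASE_SHARE = {"iced_drink": 0.4, "hot_drink": 0.4, "blended_drink": 0.2}
COLD_ITEMS = {"iced_drink", "blended_drink"}


def item_weights(temp_c):
    """Shift purchase probability toward cold drinks on warm days."""
    boost = 1.5 if temp_c >= 25 else 1.0
    raw = {
        item: share * (boost if item in COLD_ITEMS else 1.0)
        for item, share in BASE_SHARE.items()
    }
    total = sum(raw.values())
    return {item: weight / total for item, weight in raw.items()}


def generate_transactions(daily_temps, n_per_day, seed=0):
    """Sample n_per_day items per day from the temperature-adjusted distribution."""
    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    rows = []
    for day, temp_c in daily_temps:
        weights = item_weights(temp_c)
        items = rng.choices(list(weights), weights=list(weights.values()), k=n_per_day)
        rows.extend({"date": day, "temp_c": temp_c, "item": item} for item in items)
    return rows
```

The real constraint (observed temperature) enters only through the weights, so the individual rows stay synthetic while their aggregate pattern follows the real series.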

Use cases

  • Content-based recommendation systems
  • Product design optimization
  • Price elasticity analysis
  • Customer segmentation
  • Seasonal demand modeling
  • Synthetic data generation methodology
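As a sketch of the first use case, a content-based recommender can rank menu items by cosine similarity over scaled nutrition features. The item names, feature names, and numbers below are made up for illustration and are not rows from starbucks_menu.csv; min-max scaling is one reasonable choice among several.

```python
import math

# Hypothetical menu rows: names, features, and values are illustrative only.
MENU = {
    "milk_espresso": {"calories": 190, "sugar_g": 18, "caffeine_mg": 150},
    "sweet_blended_a": {"calories": 380, "sugar_g": 55, "caffeine_mg": 90},
    "black_coffee": {"calories": 5, "sugar_g": 0, "caffeine_mg": 205},
    "sweet_blended_b": {"calories": 370, "sugar_g": 51, "caffeine_mg": 100},
}


def scale_features(menu):
    """Min-max scale each feature to [0, 1] so no single unit dominates."""
    names = sorted(next(iter(menu.values())))
    lo = {f: min(row[f] for row in menu.values()) for f in names}
    hi = {f: max(row[f] for row in menu.values()) for f in names}
    return {
        item: [
            (row[f] - lo[f]) / (hi[f] - lo[f]) if hi[f] > lo[f] else 0.0
            for f in names
        ]
        for item, row in menu.items()
    }


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(menu, liked_item, k=2):
    """Return the k items most similar to liked_item, most similar first."""
    vectors = scale_features(menu)
    target = vectors[liked_item]
    scores = [
        (cosine(target, vec), item)
        for item, vec in vectors.items()
        if item != liked_item
    ]
    return [item for _, item in sorted(scores, reverse=True)[:k]]
```

With these made-up rows, `recommend(MENU, "sweet_blended_a", k=1)` returns `["sweet_blended_b"]`, since the two blended items have nearly identical scaled profiles.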

Data sources & licenses

File           | Source                             | License
Menu nutrition | Starbucks / Kaggle public datasets | Public
CPI & Wages    | FRED (Federal Reserve)             | Public domain
Weather        | Open-Meteo API                     | CC-BY 4.0
Transactions   | Synthetic (this project)           | ODbL 1.0

Related notebooks (6-notebook series)

# | Notebook                  | Description
1 | Menu EDA                  | Category distribution, price/calorie analysis
2 | Macro Elasticity          | CPI trends, wage analysis, price sensitivity
3 | Synthetic Data Generation | 100K transaction generation with real constraints
4 | New Product Optimizer     | Frappuccino design optimization algorithm
5 | Recommendation Engine     | Personalized drink + customization recommendations
6 | Validation                | Benchmark comparison, perturbation tests, stability
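For the wage analysis in Notebook 2, a real wage index is conventionally built by deflating nominal earnings by CPI and rebasing to 100. The sketch below shows that standard calculation; whether fred_macro.csv was constructed exactly this way is an assumption, and the function name and base-period choice are hypothetical.

```python
def real_wage_index(nominal_wages, cpi, base=0):
    """Deflate nominal wages by CPI and rebase so the base period equals 100."""
    if len(nominal_wages) != len(cpi):
        raise ValueError("wage and CPI series must have the same length")
    real = [wage / price for wage, price in zip(nominal_wages, cpi)]
    return [100.0 * value / real[base] for value in real]
```

If wages and CPI both rise by the same percentage, the index stays at 100; it rises only when earnings grow faster than prices.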
