Menu, macro indicators, weather & 100K synthetic transactions
Dataset Description
What's inside
Four analysis-ready files for building a Starbucks personalized recommendation engine:
- starbucks_menu.csv — 242 Starbucks US menu items with nutrition and pricing
- fred_macro.csv — CPI, average hourly earnings, and real wage index (monthly, 2021-2026)
- weather_daily.csv — Daily temperatures for 5 US cities (2024-2026)
- synthetic_transactions.csv — 100,000 synthetic purchase transactions constrained by real data
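A quick way to check the four files after download is to read each header and count its rows. The sketch below uses only the Python standard library. The file names come from the list above, but the directory layout and the helper names `summarize_csv` and `summarize_dataset` are assumptions made for illustration. No column names are assumed, since the description does not list them.

```python
import csv
from pathlib import Path

# File names as listed in this description; the containing directory is an assumption.
DATASET_FILES = [
    "starbucks_menu.csv",
    "fred_macro.csv",
    "weather_daily.csv",
    "synthetic_transactions.csv",
]


def summarize_csv(path):
    """Return (column_names, data_row_count) for one CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])          # first row is treated as the header
        row_count = sum(1 for _ in reader)  # remaining rows are data
    return header, row_count


def summarize_dataset(data_dir="."):
    """Summarize every expected file found in data_dir; missing files are skipped."""
    summary = {}
    for name in DATASET_FILES:
        path = Path(data_dir) / name
        if path.exists():
            summary[name] = summarize_csv(path)
    return summary


if __name__ == "__main__":
    for name, (columns, rows) in summarize_dataset().items():
        print(f"{name}: {rows} rows, columns = {columns}")
```

If the row counts differ from the figures quoted above (242 menu items, 100,000 transactions), the download is probably incomplete or from a different version.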
Synthetic Data Transparency
The transaction data is synthetic — generated from probability distributions anchored to real-world constraints:
| Component | Source | Status |
|---|---|---|
| Menu items & nutrition | Starbucks published data | Real |
| Prices | Starbucks menu + estimation | Semi-real |
| CPI & wages | FRED (Federal Reserve) | Real |
| Weather | Open-Meteo historical API | Real |
| Purchase patterns | Industry reports + assumptions | Synthetic |
| Individual transactions | Generated from distributions | Synthetic |
Every assumption is documented in the generation notebook (Notebook 3), which contains the full generation pipeline. Notebook 6 covers validation.
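To make "generated from probability distributions anchored to real-world constraints" concrete, here is a minimal sketch of the general idea, not the pipeline in Notebook 3. Everything in it is an assumption for illustration: the placeholder menu rows and prices, the linear temperature rule in `cold_drink_probability`, and the function names. The actual distributions and constraints are the ones documented in the notebooks.

```python
import random

# Placeholder menu rows: (item, category, price_usd). Illustrative values, not dataset rows.
MENU = [
    ("hot_item_1", "hot", 3.00),
    ("hot_item_2", "hot", 5.00),
    ("cold_item_1", "cold", 4.00),
    ("cold_item_2", "cold", 6.00),
]


def cold_drink_probability(temp_c):
    """Assumed rule: cold-drink share rises linearly from 0.2 at 0 C to 0.8 at 30 C."""
    clamped = max(0.0, min(30.0, temp_c))
    return 0.2 + 0.6 * clamped / 30.0


def generate_transaction(rng, temp_c):
    """Draw one synthetic purchase conditioned on the day's temperature."""
    category = "cold" if rng.random() < cold_drink_probability(temp_c) else "hot"
    candidates = [row for row in MENU if row[1] == category]
    item, _, price = rng.choice(candidates)
    return {"item": item, "category": category, "price": price, "temp_c": temp_c}


def generate_transactions(n, temps_c, seed=0):
    """Generate n transactions, each on a day sampled from the observed temperatures."""
    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    return [generate_transaction(rng, rng.choice(temps_c)) for _ in range(n)]


if __name__ == "__main__":
    for row in generate_transactions(5, temps_c=[5.0, 28.0], seed=1):
        print(row)
```

The real inputs (menu, weather) constrain what can be drawn, while the purchase behavior itself comes from an assumed distribution. That split mirrors the Real and Synthetic labels in the table above.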
Use cases
- Content-based recommendation systems
- Product design optimization
- Price elasticity analysis
- Customer segmentation
- Seasonal demand modeling
- Synthetic data generation methodology
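As an illustration of the first use case, a content-based recommender can rank menu items by how similar their nutrition profiles are to an item the customer already likes. The sketch below is one possible approach (z-scored features compared by cosine similarity), not the method used in Notebook 5. The item names, the feature choice (calories, sugar, caffeine), and all values are hypothetical.

```python
import math

# Hypothetical feature vectors: (calories, sugar_g, caffeine_mg). Illustrative values only.
ITEMS = {
    "item_a": (190.0, 18.0, 150.0),
    "item_b": (200.0, 20.0, 140.0),
    "item_c": (380.0, 55.0, 90.0),
    "item_d": (5.0, 0.0, 260.0),
}


def standardize(items):
    """Z-score each feature so that no single unit dominates the similarity."""
    columns = list(zip(*items.values()))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0  # guard constant columns
        for col, m in zip(columns, means)
    ]
    return {
        name: tuple((x - m) / s for x, m, s in zip(vec, means, stds))
        for name, vec in items.items()
    }


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(liked, items, top_k=2):
    """Return the top_k items most similar to `liked`, excluding `liked` itself."""
    scaled = standardize(items)
    scores = [
        (cosine(scaled[liked], vec), name)
        for name, vec in scaled.items()
        if name != liked
    ]
    return [name for _, name in sorted(scores, reverse=True)[:top_k]]


if __name__ == "__main__":
    print(recommend("item_a", ITEMS, top_k=2))
```

With real data, the feature vectors would come from the nutrition columns of starbucks_menu.csv, and the transaction file could supply each customer's liked items.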
Data sources & licenses
| Component | Source | License |
|---|---|---|
| Menu nutrition | Starbucks / Kaggle public datasets | Public |
| CPI & wages | FRED (Federal Reserve) | Public domain |
| Weather | Open-Meteo API | CC-BY 4.0 |
| Transactions | Synthetic (this project) | ODbL 1.0 |
Related notebooks (6-notebook series)
| # | Notebook | Description |
|---|---|---|
| 1 | Menu EDA | Category distribution, price/calorie analysis |
| 2 | Macro Elasticity | CPI trends, wage analysis, price sensitivity |
| 3 | Synthetic Data Generation | 100K transaction generation with real constraints |
| 4 | New Product Optimizer | Frappuccino design optimization algorithm |
| 5 | Recommendation Engine | Personalized drink + customization recommendations |
| 6 | Validation | Benchmark comparison, perturbation tests, stability |