Global Food Nutrition Database (10k Products)
9,804 Global Products with Nutri-Score, Complete Macronutrients, and Ingredients
@kaggle.kanchana1990_global_food_nutrition_database10k_products
This dataset contains a curated snapshot of 9,804 global food products, aggregated from the Open Food Facts database as of December 2025. It describes each product through three scoring systems: Nutri-Score (nutritional quality), Eco-Score (environmental impact), and NOVA (degree of processing). Potential use cases include:
Nutri-Score Prediction (Multi-Class Classification)
Predict nutritional quality grades (A-E) based on macronutrient composition (fat_100g, sugars_100g, saturated-fat_100g, salt_100g).
Ultra-Processed Food Detection (Binary Classification)
Identify NOVA Group 4 products (53.5% of dataset) using ingredient lists and nutritional patterns.
Ingredient NLP & Allergen Detection
Extract allergens, E-numbers, and food additives from ingredients_text using Named Entity Recognition.
Environmental Impact Clustering
Group products by ecoscore_grade and nova_group to identify sustainable food categories.
Calorie Regression
Predict energy-kcal_100g from macronutrient breakdown.
Brand Health Profiling
Rank 3,501 brands by average Nutri-Score and percentage of ultra-processed products.
Category-Based Benchmarking
Analyze nutritional performance across 7,408 unique food categories.
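As an illustration of the brand health profiling use case, the sketch below ranks brands by mean Nutri-Score and by the share of NOVA Group 4 products. It is a minimal sketch, not a reference implementation: the sample rows, the `profile_brands` helper, and the 1 to 5 grade scale are assumptions made for the example, and grades are lower-cased first because the casing used in the file is not stated here.

```python
from collections import defaultdict

# Hypothetical sample rows; real rows would be read from the dataset file.
rows = [
    {"brands": "BrandA", "nutriscore_grade": "a", "nova_group": 1.0},
    {"brands": "BrandA", "nutriscore_grade": "c", "nova_group": 4.0},
    {"brands": "BrandB", "nutriscore_grade": "e", "nova_group": 4.0},
    {"brands": "BrandB", "nutriscore_grade": "unknown", "nova_group": None},
]

# Assumed ordinal scale: 1 = A (healthiest) ... 5 = E (least healthy).
GRADE_POINTS = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}


def profile_brands(rows):
    """Return per-brand mean grade and % of NOVA 4 products, best brand first."""
    grades = defaultdict(list)
    nova_flags = defaultdict(list)
    for row in rows:
        brand = row.get("brands")
        if not brand:
            continue  # skip products without a brand
        grade = str(row.get("nutriscore_grade") or "").strip().lower()
        if grade in GRADE_POINTS:  # drops "unknown" and "not-applicable"
            grades[brand].append(GRADE_POINTS[grade])
        if row.get("nova_group") is not None:
            nova_flags[brand].append(row["nova_group"] == 4)

    profiles = []
    for brand in sorted(set(grades) | set(nova_flags)):
        g = grades.get(brand, [])
        n = nova_flags.get(brand, [])
        profiles.append({
            "brand": brand,
            "mean_grade": sum(g) / len(g) if g else None,
            "pct_ultra_processed": 100 * sum(n) / len(n) if n else None,
        })
    # Lowest mean grade first; brands with no valid grade go last.
    profiles.sort(key=lambda p: (p["mean_grade"] is None, p["mean_grade"] or 0))
    return profiles


profiles = profile_brands(rows)
```

Brands with only a handful of products will produce noisy averages, so a minimum product count per brand is probably worth adding before ranking all 3,501 brands.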
| Column | Type | Coverage | Description |
|---|---|---|---|
| code | int64 | 100.00% | Unique barcode/EAN identifier (e.g., 6111035000430). Primary key. |
| product_name | object | 97.57% | Commercial product name as displayed on packaging. |
| brands | object | 96.19% | Manufacturer/Brand name (e.g., Nestlé, Danone, Heinz, Hacendado, Tesco). |
| countries | object | 99.98% | Countries where product is sold (may include multiple, e.g., "France, United Kingdom"). |
| quantity | object | 89.92% | Package size/weight (e.g., "500 ml", "100 g", "2 L"). Mixed formatting. |
| Column | Type | Coverage | Description |
|---|---|---|---|
| categories | object | 98.77% | Hierarchical product categories (comma-separated), e.g., "Beverages,Waters,Mineral waters". 7,408 unique combinations. |
| labels | object | 77.03% | Certification/Quality labels (e.g., "Organic", "Fair Trade", "Green Dot"). Multiple labels per product. 5,687 unique combinations. |
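Because `categories` and `labels` hold several values in one string, most analyses need to split them first. The sketch below shows one way to do that; `split_tags` is a hypothetical helper, and reading the first and last entries as the broadest and most specific category is an assumption based on the example value given for `categories`.

```python
def split_tags(value):
    """Split a comma-separated field into a list of trimmed, non-empty tags."""
    if not value:
        return []
    return [part.strip() for part in value.split(",") if part.strip()]


categories = split_tags("Beverages,Waters,Mineral waters")

# Assumption: entries run from the broadest to the most specific category.
top_level = categories[0] if categories else None
most_specific = categories[-1] if categories else None
```

Grouping by `top_level` rather than by the full string avoids treating each of the 7,408 unique combinations as its own category.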
| Column | Type | Coverage | Description |
|---|---|---|---|
| nutriscore_grade | object | 99.97% | Official Nutri-Score (A=Healthiest to E=Least Healthy). Includes "not-applicable" (2.98%) and "unknown" (8.20%). Distribution: A=16.11%, B=12.70%, C=22.48%, D=18.23%, E=19.28%. |
| ecoscore_grade | object | 99.98% | Environmental impact score (A+=Best to F=Worst). Considers carbon footprint, water use, packaging recyclability. Distribution: A+=5.41%, A=15.46%, B=18.59%, C=13.39%, D=14.29%, E=10.18%, F=2.52%. |
| nova_group | float64 | 91.38% | NOVA processing classification (1=Unprocessed or minimally processed, 2=Processed culinary ingredients, 3=Processed foods, 4=Ultra-processed). Distribution: Group 1=11.02%, Group 2=4.15%, Group 3=22.69%, Group 4=53.52%. |
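The score columns need some cleaning before they can serve as classification targets: `nutriscore_grade` mixes real grades with "not-applicable" and "unknown", and `nova_group` is a float with missing values. The following is a minimal sketch of that preparation. The helper names and the 0 to 4 encoding are assumptions for illustration, and grades are lower-cased because the casing in the file is not stated here.

```python
import math

VALID_GRADES = ("a", "b", "c", "d", "e")


def nutriscore_label(grade):
    """Return 0..4 for grades A..E, or None for missing or non-grade values."""
    if not isinstance(grade, str):
        return None
    g = grade.strip().lower()
    return VALID_GRADES.index(g) if g in VALID_GRADES else None


def is_ultra_processed(nova_group):
    """Return True/False for NOVA Group 4, or None when the value is missing."""
    if nova_group is None:
        return None
    if isinstance(nova_group, float) and math.isnan(nova_group):
        return None
    return int(nova_group) == 4


labels = [nutriscore_label(g) for g in ["A", "e", "unknown", "not-applicable", None]]
flags = [is_ultra_processed(n) for n in [4.0, 1.0, float("nan")]]
```

Rows where either helper returns `None` would typically be dropped from the training set for the corresponding task rather than imputed.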
| Column | Type | Coverage | Description |
|---|---|---|---|
| ingredients_text | object | 95.94% | Full ingredient list (unstructured text). Multi-language (English, French, Spanish, Arabic). Contains allergens, E-numbers, additives. Perfect for NLP/NER tasks. |
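Before reaching for a Named Entity Recognition model, a regular expression can serve as a rough baseline for pulling E-numbers out of `ingredients_text`. The pattern and the `extract_e_numbers` helper below are assumptions for illustration, not a method shipped with the dataset, and the sample ingredient string is made up.

```python
import re

# "E" followed by 3-4 digits and an optional letter suffix, allowing the
# spellings "E322", "E 150d" and "e-202". This is a heuristic: it will also
# match text such as "vitamin E 100", so results need a manual review.
E_NUMBER = re.compile(r"\bE\s?-?\s?(\d{3,4}[a-z]?)\b", re.IGNORECASE)


def extract_e_numbers(text):
    """Return the distinct E-numbers found in text, in order of appearance."""
    if not text:
        return []
    found = []
    for match in E_NUMBER.finditer(text):
        code = "E" + match.group(1).lower()  # normalise to e.g. "E150d"
        if code not in found:
            found.append(code)
    return found


sample = "Sugar, emulsifier (E322), colour E 150d, preservative: e-202, wheat flour"
codes = extract_e_numbers(sample)
```

The pattern is language-agnostic for E-numbers themselves, but allergen names differ across the languages present in the column, so allergen detection would need per-language vocabularies or a trained model.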
| Column | Type | Coverage | Mean | Median | Description |
|---|---|---|---|---|---|
| energy-kcal_100g | float64 | 94.66% | 292.75 | 280.0 | Energy in kilocalories per 100g. Waters = 0 kcal, oils ≈900 kcal. |
| fat_100g | float64 | 94.95% | 15.32 | 7.02 | Total fat content per 100g. Low-fat: <3g, high-fat: >17.5g (EU thresholds). |
| saturated-fat_100g | float64 | 93.23% | 5.25 | 1.50 | Saturated fatty acids per 100g. Key penalty factor in Nutri-Score calculation. |
| carbohydrates_100g | float64 | 94.70% | 30.19 | 18.0 | Total carbohydrates per 100g (sugars + fiber + starch). |
| sugars_100g | float64 | 93.51% | 11.98 | 4.20 | Total sugars per 100g (glucose, fructose, lactose, sucrose). High-sugar threshold: >22.5g. |
| fiber_100g | float64 | 69.74% | 4.09* | 2.90 | Dietary fiber per 100g. *Note: 2 outlier products exceed 100g/100g (data quality issue). Mean calculated after excluding outliers >50g. |
| proteins_100g | float64 | 94.90% | 7.33 | 6.20 | Protein content per 100g. High-protein products: >20g/100g. |
| salt_100g | float64 | 93.24% | 1.24 | 0.39 | Sodium chloride (salt) per 100g. WHO guideline: <5g/day total intake. |
| sodium_100g | float64 | 93.24% | 0.50 | 0.16 | Elemental sodium per 100g. Conversion: Salt (g) = Sodium (g) × 2.5. |
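The conversions and data-quality caveats in this table can be turned into a few sanity checks. The sketch below applies the salt/sodium factor from the table, estimates energy with the general Atwater factors (9 kcal/g for fat, 4 kcal/g for carbohydrates and protein), and flags per-100g values outside 0 to 100 g. The helper names are hypothetical, and the Atwater estimate is an approximation introduced here, not the method used to compute `energy-kcal_100g`; it ignores fiber, alcohol, and polyols.

```python
def salt_from_sodium(sodium_g):
    """Convert sodium (g) to salt (g) using the factor given in the table."""
    return sodium_g * 2.5


def atwater_kcal(fat_g, carbohydrates_g, proteins_g):
    """Rough energy estimate per 100g from the general Atwater factors."""
    return 9.0 * fat_g + 4.0 * carbohydrates_g + 4.0 * proteins_g


def implausible_fields(row, limit=100.0):
    """Return the gram-valued *_100g fields that fall outside 0..limit."""
    flagged = []
    for name, value in row.items():
        if not name.endswith("_100g") or name.startswith("energy"):
            continue  # energy is in kcal, so the 100 g bound does not apply
        if not isinstance(value, (int, float)) or value != value:
            continue  # skip missing values (None or NaN)
        if not 0 <= value <= limit:
            flagged.append(name)
    return flagged


# Hypothetical product row, including a fiber outlier like the two noted above.
row = {"energy-kcal_100g": 350.0, "fat_100g": 10.0,
       "fiber_100g": 120.0, "salt_100g": None}
flagged = implausible_fields(row)
oil_kcal = atwater_kcal(fat_g=100.0, carbohydrates_g=0.0, proteins_g=0.0)
```

A pure oil comes out at 900 kcal under this estimate, which agrees with the "oils ≈900 kcal" note in the table. Large gaps between the estimate and the recorded energy value can point to data-entry errors.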
CREATE TABLE openfoodfacts_nutrition_final_2025_12_10 (
"code" BIGINT,
"product_name" VARCHAR,
"brands" VARCHAR,
"countries" VARCHAR,
"quantity" VARCHAR,
"categories" VARCHAR,
"labels" VARCHAR,
"nutriscore_grade" VARCHAR,
"ecoscore_grade" VARCHAR,
"nova_group" DOUBLE,
"ingredients_text" VARCHAR,
"energy_kcal_100g" DOUBLE,
"fat_100g" DOUBLE,
"saturated_fat_100g" DOUBLE,
"carbohydrates_100g" DOUBLE,
"sugars_100g" DOUBLE,
"fiber_100g" DOUBLE,
"proteins_100g" DOUBLE,
"salt_100g" DOUBLE,
"sodium_100g" DOUBLE
);
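To show how the schema might be queried, the sketch below recreates a four-column subset of the table in an in-memory SQLite database and averages energy per NOVA group. SQLite and the inserted rows are assumptions chosen to keep the example self-contained; the product names, barcodes, and values are made up. Note that the SQL column names use underscores (`energy_kcal_100g`) where the column tables use hyphens (`energy-kcal_100g`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Subset of the schema above; the column names and types are copied from it.
conn.execute("""
    CREATE TABLE openfoodfacts_nutrition_final_2025_12_10 (
        "code" BIGINT,
        "product_name" VARCHAR,
        "nova_group" DOUBLE,
        "energy_kcal_100g" DOUBLE
    )
""")

# Hypothetical rows for illustration only.
sample_rows = [
    (1, "Sample water", 1.0, 0.0),
    (2, "Sample cola", 4.0, 42.0),
    (3, "Sample crisps", 4.0, 530.0),
    (4, "Sample item without NOVA", None, 120.0),
]
conn.executemany(
    "INSERT INTO openfoodfacts_nutrition_final_2025_12_10 VALUES (?, ?, ?, ?)",
    sample_rows,
)

# Product count and mean energy per NOVA group, ignoring rows without a group.
result = conn.execute("""
    SELECT "nova_group", COUNT(*), AVG("energy_kcal_100g")
    FROM openfoodfacts_nutrition_final_2025_12_10
    WHERE "nova_group" IS NOT NULL
    GROUP BY "nova_group"
    ORDER BY "nova_group"
""").fetchall()

conn.close()
```

The `WHERE "nova_group" IS NOT NULL` filter matters on the real data, since `nova_group` coverage is 91.38%.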