Data on over 500,000 recipes and 1,400,000 reviews from Food.com
Dataset Description
Context
The recipes dataset contains 522,517 recipes from 312 different categories. It provides information about each recipe, such as cooking times, servings, ingredients, nutrition, instructions, and more.
The reviews dataset contains 1,401,982 reviews from 271,907 different users. This dataset provides information about the author, rating, review text, and more.
Content
Both datasets are provided in two formats, Parquet and CSV:
- recipes.parquet and reviews.parquet are recommended as they preserve the schema of the original data.
- recipes.csv is designed to be parsed in R, while reviews.csv does not contain any list-columns, so it can be parsed easily.
Parsing
To read recipes.csv and parse the list-column values (Images, Keywords, RecipeIngredientQuantities, RecipeIngredientParts, RecipeInstructions) in R:
library(readr)
recipes <- read_csv("recipes.csv")
print(recipes$Images[3])
## "c(\"https://img.sndimg.com/food/image/upload/w_555,h_416,c_fit,fl_progressive,q_95/v1/img/recipes/40/picJ4Sz3N.jpg\", \"https://img.sndimg.com/food/image/upload/w_555,h_416,c_fit,fl_progressive,q_95/v1/img/recipes/40/pic23FWio.jpg\")"
# Each cell holds the R source of a character vector; evaluating it recovers the vector.
print(eval(parse(text = recipes$Images[3])))
## "https://img.sndimg.com/food/image/upload/w_555,h_416,c_fit,fl_progressive,q_95/v1/img/recipes/40/picJ4Sz3N.jpg"
## "https://img.sndimg.com/food/image/upload/w_555,h_416,c_fit,fl_progressive,q_95/v1/img/recipes/40/pic23FWio.jpg"
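Because eval(parse(...)) runs whatever R code a cell contains, a regex-based extraction may be preferable when the CSV comes from a source you do not fully trust. The following is a minimal sketch in base R, not part of the dataset's own tooling: parse_list_column is a hypothetical helper name, and it assumes every element is written as a double-quoted string, as in the Images example above. Unquoted entries such as NA inside c(...) would be dropped.

```r
# Hypothetical helper: turn a cell like 'c("a", "b")' into c("a", "b")
# without evaluating it as R code.
parse_list_column <- function(x) {
  if (is.na(x)) return(character(0))
  # Find every double-quoted substring, allowing backslash-escaped characters.
  hits <- regmatches(x, gregexpr('"(?:[^"\\\\]|\\\\.)*"', x, perl = TRUE))[[1]]
  # Drop the surrounding quotes.
  inner <- substr(hits, 2, nchar(hits) - 1)
  # Unescape \" and \\ only; other escapes (e.g. \n) are left untouched.
  gsub('\\\\(["\\\\])', '\\1', inner, perl = TRUE)
}

# Example cell in the same shape as recipes$Images (URLs shortened for illustration).
cell <- 'c("https://example.com/w_555,h_416/a.jpg", "https://example.com/w_555,h_416/b.jpg")'
images <- parse_list_column(cell)
print(images)
```

Applied to a whole column, lapply(recipes$Images, parse_list_column) would give a list with one character vector per recipe; cells containing character(0) come back as empty vectors.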
To parse ISO 8601 duration format values (CookTime, PrepTime, and TotalTime) in R:
library(lubridate)
duration("PT24H45M")
## "89100s (~1.03 days)"
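For filtering or plotting, it can be convenient to hold the durations as plain numbers of minutes. The sketch below does this in base R rather than with lubridate. The function name iso8601_to_minutes is hypothetical, and the code assumes the values only use the PT..H..M..S form shown above (no day, month, or year components); anything else is returned as NA.

```r
# Hypothetical helper: convert strings such as "PT24H45M" to minutes.
iso8601_to_minutes <- function(x) {
  # Groups 3, 5 and 7 of the match hold the hour, minute and second digits.
  pattern <- "^PT(([0-9]+)H)?(([0-9]+)M)?(([0-9]+)S)?$"
  vapply(x, function(s) {
    if (is.na(s)) return(NA_real_)
    m <- regmatches(s, regexec(pattern, s))[[1]]
    if (length(m) == 0) return(NA_real_)  # not in the assumed form
    parts <- m[c(3, 5, 7)]
    parts[parts == ""] <- "0"             # absent components count as zero
    parts <- as.numeric(parts)
    parts[1] * 60 + parts[2] + parts[3] / 60
  }, numeric(1), USE.NAMES = FALSE)
}

minutes <- iso8601_to_minutes(c("PT24H45M", "PT30M", "PT1H", NA))
print(minutes)
```

The first value is 1485 minutes, which matches the 89100 seconds reported by lubridate for the same input.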
Related Datasets
- Food Composition (@kaggle)