TripAdvisor Restaurant Survival

How many restaurants from 2021 are still open?

@kaggle.artemfedorov_restraunt_survival

About this Dataset

Context

TripAdvisor is one of the most popular travel websites. It stores data for almost all restaurants, including locations (with latitude and longitude coordinates), restaurant descriptions, user ratings and reviews, and many other attributes.

The website is well known for displaying user ratings and reviews for restaurants, hotels, B&Bs, tourist attractions, and other places, with about a billion reviews in total.

Content

The dataset includes all restaurants from four European countries (Northern Ireland (UK), Slovakia, Bulgaria and Finland) present in the TripAdvisor European restaurants dataset.

So many things have changed since 2021, and it is interesting to see how many of those restaurants are still open, and also to check the differences between countries.
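As a rough illustration of that comparison, here is a minimal sketch that computes the share of restaurants still open per country. The field names "country" and "still_open" are assumptions made for this example, not confirmed column names of the dataset, and the sample rows are placeholders rather than real records.

```python
from collections import defaultdict


def survival_rate_by_country(rows, country_key="country", open_key="still_open"):
    """Return {country: share of restaurants flagged as still open}.

    The key names are hypothetical; adjust them to the actual columns.
    """
    totals = defaultdict(int)
    open_counts = defaultdict(int)
    for row in rows:
        country = row[country_key]
        totals[country] += 1
        if row[open_key]:
            open_counts[country] += 1
    # Every country in totals has at least one row, so no division by zero.
    return {country: open_counts[country] / totals[country] for country in totals}


# Placeholder rows for illustration only; these are not values from the dataset.
sample_rows = [
    {"country": "Finland", "still_open": True},
    {"country": "Finland", "still_open": False},
    {"country": "Slovakia", "still_open": True},
]

rates = survival_rate_by_country(sample_rows)
print(rates)
```

With the real data, the rows could come from csv.DictReader, after converting whatever column marks a restaurant as open or closed into a boolean.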

This dataset is a great way to see how European restaurants change. It also contains some additional data that is not present in the original dataset, including URLs of restaurant websites, information on the languages of all the reviews for each restaurant, and the date of the last review published by users for each restaurant. Additionally, some information was gathered using the restaurant's website URL. A significant number of restaurants list their Facebook accounts as their websites, and information from those Facebook accounts has also been gathered.

Acknowledgements

Data has been retrieved from the publicly available website https://tripadvisor.com/.

The TripAdvisor pages were scraped in late October 2023 using restaurant URLs found in the column "restaurant_link" of the original TripAdvisor European Restaurants dataset.
Data from restaurants' websites was gathered about a week later, in the last week of October 2023.

Inspiration

Inspired by the TripAdvisor European restaurants dataset.

P.S.

I've removed the original dataset, which was scraped in late August/early September 2023, and replaced it with a dataset scraped in late October 2023. The earlier dataset had a few problems: the columns "service_rating", "value_rating" and "atmosphere_rating" contained wrong information in almost all cases when they were not empty. The new dataset has no "default language" column, because it has been checked that on every page the script's press of the "all languages" button worked correctly and that the selected language is always the same.
