Important note: starting from October 2nd, 2019, published datasets will consist only of merged .csv files and will no longer be separated by category.
Context
I am currently working on my undergraduate thesis on sentiment analysis, using my first published Kaggle dataset, Amazon Cell Phones Reviews. Before I decided to use that dataset, I had planned to use Indonesian product reviews from online stores, one of which is Lazada Indonesia. Since there are not many product reviews available in Indonesian, I decided to start this dataset collection using Lazada Indonesia's wide range of product categories.
Content
yyyymmdd-items.csv
contains item entries from the categories listed in categories.txt
yyyymmdd-reviews.csv
contains reviews for the items listed in yyyymmdd-items.csv
categories.txt
contains the list of categories covered by the currently available files
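The file layout above can be loaded with a few lines of Python. This is only a minimal sketch: the helper names (dataset_paths, read_rows, read_categories) are hypothetical, and it assumes that the .csv files are UTF-8 encoded with a header row and that categories.txt holds one category per line. None of these details are confirmed by the description above, so check them against the actual files.

```python
import csv
from datetime import date
from pathlib import Path


def dataset_paths(snapshot: date, root: Path = Path(".")) -> dict:
    """Build the expected file paths for one snapshot (yyyymmdd prefix)."""
    prefix = snapshot.strftime("%Y%m%d")
    return {
        "items": root / f"{prefix}-items.csv",
        "reviews": root / f"{prefix}-reviews.csv",
        "categories": root / "categories.txt",
    }


def read_rows(path: Path) -> list:
    """Read a CSV file into a list of dicts keyed by its header row.

    Assumption: the file is UTF-8 encoded and its first row is a header.
    """
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def read_categories(path: Path) -> list:
    """Read categories.txt, skipping blank lines.

    Assumption: the file holds one category per line.
    """
    with path.open(encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

For example, dataset_paths(date(2019, 10, 2)) returns paths ending in 20191002-items.csv and 20191002-reviews.csv; the date here is only an illustration, so substitute the prefix of the files you actually downloaded. Because read_rows keys each row by the header, no column names need to be hard-coded.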
Acknowledgements
The datasets were retrieved using Puppeteer. I have also published the scraping project for this dataset on GitHub.