Wikipedia Edits

Dataset containing a list of English Wikipedia edits collected over a period of 20 minutes

@kaggle.shradhapj_wikipedia_edits

About this Dataset

Context

Hey everyone out there! Wikipedia is a publicly available encyclopedia that anyone can modify. Some of these modifications are useful, whereas some are not. This data set captures the edits made to English Wikipedia by anyone across the globe. Because there are about two edits per second, the data I have collected covers just 20 minutes.

Content

I have revised the original data set, removed the duplicates, and kept only the relevant and useful columns. The data set has the following columns:
a) action : Only the edit action is captured. Other actions may be Talk, etc.
b) change_size : The number of characters added or deleted. A positive value means content was added; a negative value means content was deleted.
c) geo_ip : This is null if the user is registered on Wikipedia; otherwise it is a JSON object containing city, latitude, country_name, region_name and longitude.
d) is_anonymous : This flag/boolean value (true/false) indicates whether the user is registered or unregistered (anonymous).
e) is_bot : This flag/boolean value (true/false) indicates whether the user is a bot (robot) or a human.
f) is_minor : This flag/boolean value (true/false) identifies whether the change made to the Wikipedia article was a minor or a major one.
g) page_title : This is the title of the Wikipedia article edited by the user.
h) url : This field has the URL (link) of the page that compares the Wikipedia article before and after the change.
i) user : If the user is unregistered, this field holds the IP address in either IPv4 or IPv6 format; if the user is registered, it contains the username used when registering on Wikipedia.
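
To show how these columns fit together, here is a minimal Python sketch that interprets a single row. The sample row is made up purely for illustration, and describe_edit is a hypothetical helper, not part of the data set. The sketch assumes geo_ip is stored as a JSON string (or null) and that the flags are already parsed as booleans; adjust it to however you load the data.

```python
import ipaddress
import json


def describe_edit(row):
    """Summarise one edit row using the columns described above."""
    # geo_ip is null for registered users, otherwise a JSON object
    # (assumed here to be stored as a JSON string).
    geo = json.loads(row["geo_ip"]) if row["geo_ip"] else None

    # For unregistered users the user field holds an IPv4 or IPv6 address.
    try:
        ip_version = ipaddress.ip_address(row["user"]).version
    except ValueError:
        ip_version = None  # a registered username, not an IP address

    size = row["change_size"]
    return {
        "page_title": row["page_title"],
        "chars_added": max(size, 0),     # positive change_size
        "chars_deleted": max(-size, 0),  # negative change_size
        "country": geo["country_name"] if geo else None,
        "ip_version": ip_version,
    }


# Hypothetical sample row, invented for illustration only.
sample = {
    "action": "edit",
    "change_size": -42,
    "geo_ip": '{"city": "Pune", "latitude": 18.52, "country_name": "India", '
              '"region_name": "Maharashtra", "longitude": 73.86}',
    "is_anonymous": True,
    "is_bot": False,
    "is_minor": False,
    "page_title": "Example article",
    "url": "https://en.wikipedia.org/w/index.php?diff=2&oldid=1",
    "user": "203.0.113.7",
}

summary = describe_edit(sample)
```

Trying to parse the user field as an IP address is one way to cross-check is_anonymous, since a registered username should not normally parse as one.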

Acknowledgements

I would like to thank hatnote.com, from which I got this data. If you need the original data, you may visit www.hatnote.com or connect directly to this WebSocket: ws://wikimon.hatnote.com/en/
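
If you want to try collecting a similar sample yourself, the following is a rough Python sketch, not the method actually used to build this data set. It assumes the third-party websockets package is installed, that the feed is still reachable, and that each message is a JSON object whose keys match the column names listed under Content. The helper names to_row and collect are hypothetical.

```python
import asyncio
import json

# Columns kept in this data set, as listed under Content.
COLUMNS = ["action", "change_size", "geo_ip", "is_anonymous", "is_bot",
           "is_minor", "page_title", "url", "user"]


def to_row(message):
    """Keep only the data set's columns from one raw JSON message."""
    event = json.loads(message)
    # Assumption: the raw message uses the same key names as the columns.
    return {name: event.get(name) for name in COLUMNS}


async def collect(seconds, uri="ws://wikimon.hatnote.com/en/"):
    """Collect edit rows from the feed for roughly `seconds` seconds."""
    import websockets  # third-party package: pip install websockets

    rows = []
    loop = asyncio.get_running_loop()
    deadline = loop.time() + seconds
    async with websockets.connect(uri) as ws:
        while (remaining := deadline - loop.time()) > 0:
            try:
                message = await asyncio.wait_for(ws.recv(), timeout=remaining)
            except asyncio.TimeoutError:
                break  # time window is over
            row = to_row(message)
            if row["action"] == "edit":  # only the edit action is kept
                rows.append(row)
    return rows


# Example (needs network access and a live feed):
# rows = asyncio.run(collect(20 * 60))
```

Removing duplicates, as described under Content, would still be a separate step after collection.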
