I wrote a simple Python script that uses the Wikipedia API (https://en.wikipedia.org/w/api.php) to collect all the articles on English Wikipedia (for another project). One of the properties you can request with ‘query’ is called coordinates. It returns lat, lon and globe for articles about places, events, people, etc. Most of the articles are about locations on Earth, but I’ve also seen a few thousand locations on the Moon, Mars, Io, etc.
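For anyone who wants to try something similar, here is a minimal sketch of how such a query could look. It is not my original script: the function names, the User-Agent string and the choice of `generator=allpages` with `formatversion=2` are assumptions made for the example.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def extract_coordinates(response):
    """Return (title, lat, lon, globe) tuples from one decoded API response.

    Assumes the response was requested with formatversion=2, where
    query.pages is a list. Pages without coordinates are skipped.
    """
    rows = []
    for page in response.get("query", {}).get("pages", []):
        for coord in page.get("coordinates", []):
            rows.append(
                (page["title"], coord["lat"], coord["lon"], coord.get("globe", "earth"))
            )
    return rows


def iter_coordinates(api_url=API_URL, user_agent="coords-sketch/0.1 (example)"):
    """Walk all main-namespace articles and yield their coordinates.

    The user_agent value is a placeholder; replace it with something
    that identifies you before running this against the live API.
    """
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "generator": "allpages",        # iterate over every page
        "gapnamespace": "0",            # articles only
        "gapfilterredir": "nonredirects",
        "gaplimit": "max",
        "prop": "coordinates",
        "coprop": "globe",              # also return the globe name
        "colimit": "max",
    }
    cont = {}
    while True:
        url = api_url + "?" + urllib.parse.urlencode({**params, **cont})
        request = urllib.request.Request(url, headers={"User-Agent": user_agent})
        with urllib.request.urlopen(request) as reply:
            data = json.load(reply)
        yield from extract_coordinates(data)
        if "continue" not in data:
            break
        # Pass the continuation values back to fetch the next batch.
        cont = data["continue"]
```

Looping over `iter_coordinates()` and writing each tuple to a CSV file gives you a table with title, lat, lon and globe. Expect a full crawl to take a long time.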
You can use datashader (https://datashader.org) to plot the coordinates as a map. It’s a great library if you want to plot large(r) datasets.
The way I see it, there are multiple factors that explain why some countries are disproportionately represented on the map:
- Scholarly interest
- Scholarly level
- Language
- Density of population
- Anglosphere of (former) British colonies
- Travel interest
- History itself but also the level of documentation on it
I also find it interesting to consider that Wikipedia is often used as a resource at many levels of education. It may be an exaggeration, but you could see this picture as a visualisation of the language barrier and, in some cases, of a loss of knowledge.
The next step is to collect all coordinates on the “other” Wikipedias and see if that “fills in the blanks”, and if so, whether there’s a way to combine those languages/versions.
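One possible way to combine them, which I have not tried yet: each language edition has its own endpoint at `https://<lang>.wikipedia.org/w/api.php`, and articles about the same subject usually share a Wikidata item ID (available through `prop=pageprops` with `ppprop=wikibase_item`). A minimal sketch of merging on that ID follows. The function names and the item IDs in the usage note are placeholders.

```python
def api_url(lang):
    """Build the API endpoint for one language edition, e.g. 'de' or 'nl'."""
    return f"https://{lang}.wikipedia.org/w/api.php"


def merge_by_item(per_language):
    """Merge coordinates from several Wikipedias, one entry per Wikidata item.

    per_language maps a language code to an iterable of
    (item_id, lat, lon) tuples. When several languages describe the same
    item, the first language in the mapping wins.
    """
    merged = {}
    for lang, rows in per_language.items():
        for item_id, lat, lon in rows:
            # setdefault keeps the entry from the earliest language seen.
            merged.setdefault(item_id, (lat, lon, lang))
    return merged
```

With English listed first, `merge_by_item({"en": [("Q_A", 52.52, 13.405)], "de": [("Q_A", 52.5, 13.4), ("Q_B", 48.78, 9.18)]})` keeps the English coordinates for `Q_A` and adds `Q_B` from the German edition. Articles without a Wikidata item would need a different key.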