GSMArena Phone Dataset

Specifications for 8000+ phones scraped from the GSMArena website

@kaggle.arwinneil_gsmarena_phone_dataset

About this Dataset

Disclaimer: This dataset is no longer being updated because scraping seemingly puts a lot of load on GSMArena's servers and can even result in IP address bans in some cases. The source code is, however, available on GitHub; run it at your own discretion. Thanks!

Context

GSMArena Phones Dataset is a labeled dataset extracted from GSMArena, one of the most popular online providers of phone information, and holds a large collection of phone specifications. The original purpose of this dataset was data exploration and potentially some machine learning.

The Dataset

There are 108 unique phone brands with 39 variables:
network_technology, 2G_bands, 3G_bands, 4G_bands, network_speed, GPRS, EDGE, announced, status, dimentions, weight_g, weight_oz, SIM, display_type, display_resolution, display_size, OS, CPU, Chipset, GPU, memory_card, internal_memory, RAM, primary_camera, secondary_camera, loud_speaker, audio_jack, WLAN, bluetooth, GPS, NFC, radio, USB, sensors, battery, colors, approx_price_EUR, img_url

Notes

  • Multivalued columns use "|" or "/" as delimiters.
  • The time period when the data was scraped is mentioned in the dataset description below. The number of devices, prices, and other variables are bound to change over time.
  • More details about the variables are available in the column description below.
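
Since multivalued columns use "|" or "/" as delimiters, a cell usually needs to be split before analysis. The following is a minimal sketch using only the Python standard library. The helper name `split_multivalued` and the sample cell values are hypothetical and not taken from the dataset. Note that "/" can also appear inside a single value, so the set of delimiters is left configurable per column.

```python
import re


def split_multivalued(cell, delimiters="|/"):
    """Split a multivalued cell into a list of trimmed, non-empty values.

    `delimiters` is a string of single-character separators; the default
    covers both "|" and "/" as described in the notes above.
    """
    if cell is None:
        return []
    # Build a character class such as [\|/] from the delimiter characters.
    pattern = "[" + re.escape(delimiters) + "]"
    return [part.strip() for part in re.split(pattern, cell) if part.strip()]


# Hypothetical cell values, for illustration only.
print(split_multivalued("Black | White | Gold"))
print(split_multivalued("GSM / HSPA / LTE"))
# Restrict the delimiters when "/" is part of a value.
print(split_multivalued("Wi-Fi 802.11 a/b/g/n | hotspot", delimiters="|"))
```

Which delimiter applies to which column is best checked against the actual data before splitting, because a blanket split on "/" may break values that legitimately contain it.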

The Scraper

The dataset was scraped using a small CLI I wrote in C#. Check it out on my GitHub:
https://github.com/arwinneil/phone-dataset
