Retrieve data on mobile phone use from Wikipedia's List of countries by number of mobile phones in use, clean it up, and save it in a format suitable for rendering a map with d3.geomap.
import requests
import io
import re
import pandas as pd
import geonamescache
from geonamescache import mappings
gc = geonamescache.GeonamesCache()
cnames = gc.get_countries_by_names()
url = 'http://wikitables.geeksta.net/dl/?url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FList_of_countries_by_number_of_mobile_phones_in_use&idx=0'
re_num = re.compile(r'^[\d,.]+$')
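def fix_num(x):
    # Convert numeric strings such as '1,234' or '12.5' to int or float;
    # any other value is returned unchanged.
    if isinstance(x, str) and re_num.search(x):
        x = x.replace(',', '')
        if '.' in x:
            x = float(x)
        else:
            x = int(x)
    return x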
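As a quick check of the conversion rules, the sketch below repeats the function in self-contained form and applies it to a few sample values. The samples are made up for illustration and are not taken from the Wikipedia table.

```python
import re

re_num = re.compile(r'^[\d,.]+$')


def fix_num(x):
    """Convert strings such as '1,234' or '97.5' to int or float."""
    if isinstance(x, str) and re_num.search(x):
        x = x.replace(',', '')
        x = float(x) if '.' in x else int(x)
    return x


# Made-up sample values, for illustration only.
samples = ['1,276,660,000', '97.5', 'China', 42]
converted = [fix_num(s) for s in samples]
print(converted)  # [1276660000, 97.5, 'China', 42]
```

Strings that do not consist solely of digits, commas, and dots, as well as non-string values, pass through untouched.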
Download the data as CSV, read it into a pandas DataFrame, and convert numeric strings to integers and floats.
csv = requests.get(url).text
df = pd.read_csv(io.StringIO(csv))
df = df.applymap(fix_num)
df.head()
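The same read-and-convert step can be tried offline. The following is a minimal sketch that assumes a small inline CSV in place of the download; the rows and the "Number of mobile phones" column name are hypothetical. It uses a column-wise `Series.map`, which should behave like the cell-wise `applymap` call here and also avoids the deprecation warning that recent pandas releases emit for `applymap` (renamed to `DataFrame.map`).

```python
import io
import re

import pandas as pd

re_num = re.compile(r'^[\d,.]+$')


def fix_num(x):
    if isinstance(x, str) and re_num.search(x):
        x = x.replace(',', '')
        x = float(x) if '.' in x else int(x)
    return x


# Hypothetical sample standing in for the downloaded CSV text.
sample_csv = (
    'Rank,Country or region,Number of mobile phones\n'
    '1,Examplia,"1,200,000"\n'
    '2,Testland,"350,500"\n'
)

df = pd.read_csv(io.StringIO(sample_csv))
# Apply fix_num to every cell, one column at a time.
df = df.apply(lambda col: col.map(fix_num))
print(df['Number of mobile phones'].tolist())
```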
Remove the World row so that its totals are not included in any calculations.
df = df[df['Country or region'] != 'World']
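To illustrate why the aggregate row matters, here is a small sketch with made-up figures; the `Phones` column name is an assumption. Leaving the World row in place would double-count every country when summing.

```python
import pandas as pd

# Made-up figures; the 'Phones' column name is an assumption.
df = pd.DataFrame({
    'Country or region': ['World', 'Examplia', 'Testland'],
    'Phones': [1500, 1000, 500],
})

total_with_world = df['Phones'].sum()

# Keep only rows whose name is not 'World' (boolean mask filtering).
df = df[df['Country or region'] != 'World']
total_without_world = df['Phones'].sum()

print(total_with_world, total_without_world)  # 3000 1500
```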
Map country names to ISO 3166-1 alpha-3 (iso3) codes.
def get_iso3(name):
    # Translate the name via geonamescache's mapping table when an entry exists.
    if name in mappings.country_names:
        name = mappings.country_names[name]
    return cnames[name]['iso3']
df['iso3'] = df['Country or region'].apply(get_iso3)
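Note that a dictionary lookup like this raises a `KeyError` for any name missing from the lookup table. One possible variation, sketched below, returns `None` instead so that unmatched rows can be inspected afterwards. The two dictionaries are hypothetical stand-ins for `mappings.country_names` and `gc.get_countries_by_names()`, and `get_iso3_safe` is a name introduced here for illustration.

```python
# Stand-ins for the geonamescache data; the entries are illustrative assumptions.
country_names = {'Russian Federation': 'Russia'}  # alternative name -> canonical name
cnames = {
    'Russia': {'iso3': 'RUS'},
    'Germany': {'iso3': 'DEU'},
}


def get_iso3_safe(name):
    """Return the iso3 code for a country name, or None if it is unknown."""
    name = country_names.get(name, name)
    country = cnames.get(name)
    return country['iso3'] if country else None


for candidate in ['Germany', 'Russian Federation', 'Atlantis']:
    print(candidate, get_iso3_safe(candidate))
```

Applied to the DataFrame column, rows left without a code could then be listed with `df[df['iso3'].isna()]` and fixed by hand.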
Delete the Rank column and save the data as a CSV file.
del df['Rank']
df.to_csv('../static/data/csv/mobile-phones-in-use.csv', encoding='utf-8', index=False)
This post was written by Ramiro Gómez (@yaph) and published on July 28, 2014.