Mobile Phones in Use

Retrieve data on mobile phone use from Wikipedia's List of countries by number of mobile phones in use, clean it up, and save it in a format suitable for rendering a map with d3.geomap.

In [1]:
import requests
import io
import re

import pandas as pd
import geonamescache

from geonamescache import mappings

gc = geonamescache.GeonamesCache()
cnames = gc.get_countries_by_names()

url = 'http://wikitables.geeksta.net/dl/?url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FList_of_countries_by_number_of_mobile_phones_in_use&idx=0'

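# Matches strings made up only of digits, thousands separators and decimal points.
re_num = re.compile(r'^[\d,.]+$')

def fix_num(x):
    """Convert a numeric string to int or float; return anything else unchanged."""
    if isinstance(x, str) and re_num.search(x):
        x = x.replace(',', '')
        if '.' in x:
            x = float(x)
        else:
            x = int(x)
    return x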
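
To make the behavior of fix_num concrete, here is a small standalone sketch that applies the same logic to a few values taken from the table in this post. Note that a value with a trailing plus sign, such as the one in the World row, does not match the pattern and is left as a string.

```python
import re

# Same pattern and logic as fix_num above, repeated so the sketch runs on its own.
re_num = re.compile(r'^[\d,.]+$')

def fix_num(x):
    if isinstance(x, str) and re_num.search(x):
        x = x.replace(',', '')
        x = float(x) if '.' in x else int(x)
    return x

as_int = fix_num('1,227,360,000')      # thousands separators are removed
as_float = fix_num('89.20')            # a decimal point yields a float
unchanged = fix_num('6,800,000,000+')  # the '+' prevents a match
passthrough = fix_num(42)              # non-strings are returned as-is

print(as_int, as_float, unchanged, passthrough)
```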

Download the data as CSV, read it into a pandas DataFrame, and convert numeric strings to integers and floats.

In [2]:
csv = requests.get(url).text
df = pd.read_csv(io.StringIO(csv))
df = df.applymap(fix_num)
df.head()
Out[2]:
   Rank  Country or region  Number of mobile phones  Population  Connections/100 citizens  Data evaluation date
0  -     World              6,800,000,000+           7012000000  97.00                     2013
1  1     China              1227360000               1349585838  89.20                     December 2013
2  2     India              904510000                1220800359  74.09                     31 March 2014
3  3     United States      327577529                317874628   103.10                    April 2014
4  4     Brazil             273583000                201032714   136.45                    March 2014

5 rows × 6 columns
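
A note for readers on newer pandas versions: as far as I know, DataFrame.applymap is deprecated since pandas 2.1 in favor of DataFrame.map. The following sketch picks whichever method is available. The tiny DataFrame and its column names are made up for illustration, and fix_num is repeated in compact form so the block runs on its own.

```python
import re
import pandas as pd

re_num = re.compile(r'^[\d,.]+$')

def fix_num(x):
    # Compact copy of the conversion function used in this post.
    if isinstance(x, str) and re_num.search(x):
        x = x.replace(',', '')
        x = float(x) if '.' in x else int(x)
    return x

# Illustrative data only; the column names are hypothetical.
raw = pd.DataFrame({
    'phones': ['1,227,360,000', '6,800,000,000+'],
    'rate': ['89.20', '97.00'],
})

# Use DataFrame.map where it exists, otherwise fall back to applymap.
elementwise = raw.map if hasattr(raw, 'map') else raw.applymap
converted = elementwise(fix_num)
print(converted)
```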

Remove the World row so that its values are not included in any calculations.

In [3]:
# Take a copy so that adding columns later does not trigger a SettingWithCopyWarning.
df = df[df['Country or region'] != 'World'].copy()
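
Because the World row carries a value like 6,800,000,000+ that cannot be converted, the phone count column may still hold mixed types after filtering. A minimal sketch of one way to get a numeric column for calculations, using a three-row excerpt of the data shown above:

```python
import pandas as pd

# Excerpt of the table above; the World value is a string, the others are integers.
df = pd.DataFrame({
    'Country or region': ['World', 'China', 'India'],
    'Number of mobile phones': ['6,800,000,000+', 1227360000, 904510000],
})

# Filter out the aggregate row and copy, so the result can be modified safely.
countries = df[df['Country or region'] != 'World'].copy()

# With the string gone, the column can be converted to a numeric dtype.
countries['Number of mobile phones'] = pd.to_numeric(countries['Number of mobile phones'])

total = countries['Number of mobile phones'].sum()
print(total)
```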

Map country names to ISO 3166-1 alpha-3 (iso3) codes.

In [4]:
def get_iso3(name):
    if name in mappings.country_names:
        name = mappings.country_names[name]
    return cnames[name]['iso3']

df['iso3'] = df['Country or region'].apply(get_iso3)
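
The lookup above raises a KeyError if a name from the table is neither in the mappings nor in the geonamescache country names. If that happens, a more forgiving variant could return None instead. The sketch below uses small hypothetical dictionaries as stand-ins for cnames and mappings.country_names, so the entries shown are illustrative and not the actual library contents.

```python
# Hypothetical stand-in for gc.get_countries_by_names().
cnames = {
    'China': {'iso3': 'CHN'},
    'India': {'iso3': 'IND'},
    'Russia': {'iso3': 'RUS'},
}

# Hypothetical stand-in for mappings.country_names (alternative name -> canonical name).
country_names = {'Russian Federation': 'Russia'}

def get_iso3_safe(name):
    """Return the iso3 code for a country name, or None if it is unknown."""
    name = country_names.get(name, name)
    country = cnames.get(name)
    return country['iso3'] if country else None

print(get_iso3_safe('China'))               # direct lookup
print(get_iso3_safe('Russian Federation'))  # resolved through the name mapping
print(get_iso3_safe('Atlantis'))            # unknown name
```

Rows with a missing code can then be inspected or dropped before saving, rather than stopping the whole run.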

Delete the Rank column and save the data as a CSV file.

In [5]:
del df['Rank']
df.to_csv('../static/data/csv/mobile-phones-in-use.csv', encoding='utf-8', index=False)
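
As a quick sanity check, the saved file can be read back to confirm that the Rank column is gone and the iso3 column is present. This sketch writes to a temporary directory instead of the path used above and works on a two-row stand-in DataFrame.

```python
import os
import tempfile

import pandas as pd

# Two-row stand-in for the cleaned DataFrame.
df = pd.DataFrame({
    'Rank': [1, 2],
    'Country or region': ['China', 'India'],
    'iso3': ['CHN', 'IND'],
})

out = df.drop(columns=['Rank'])

# Temporary location, used here only so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), 'mobile-phones-in-use.csv')
out.to_csv(path, encoding='utf-8', index=False)

# Read the file back and inspect the columns.
check = pd.read_csv(path)
print(list(check.columns))
```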

Map Preview


Ramiro Gómez

About this post

This post was written by Ramiro Gómez (@yaph) and published on July 28, 2014.

