Venture Capital US States

This notebook shows how to pre-process the Venture capital disbursed per $1,000 of gross domestic product dataset from the National Science Foundation to create a D3 based choropleth map for states in the US.

I exported the CSV manually from the XLS file, this could also be automated using pandas.io.excel.read_excel.

The next steps are:

  • Read the CSV as a Pandas DataFrame.
  • Add a fips column used as the ID column in JavaScript.
  • Write the DataFrame to a new CSV file.
In [4]:
import pandas as pd
import geonamescache

gc = geonamescache.GeonamesCache()
df = pd.read_csv('data/venture-capital-us-states-1998-2012-1-per-1000-dollar-gdp.csv')
print(len(df))
df.head(5)
51

Out[4]:
State 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012
0 Alabama 0.77 0.59 2.30 0.67 0.46 0.23 0.27 0.13 0.12 0.19 0.14 0.26 0.00 0.02 0.13
1 Alaska 0.00 0.00 0.14 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
2 Arizona 1.51 2.14 0.21 1.16 1.11 0.39 0.51 0.06 1.10 0.78 0.80 0.44 0.34 0.89 0.83
3 Arkansas 0.11 0.39 9.15 0.15 0.13 0.01 0.04 1.39 0.42 0.00 0.00 0.00 0.05 0.00 0.05
4 California 7.14 19.14 32.61 12.46 6.87 5.86 5.95 6.52 6.99 7.88 7.51 5.06 5.95 7.92 7.08

Below I use the geonamescache package to map state names to fips codes. Note that I prepend the string US to the fips code, as this is included in the fips property of the Admin 1 shapefile from Natural Earth Data.

In [5]:
states = gc.get_dataset_by_key(gc.get_us_states(), 'name')
df['fips'] = df['State'].apply(lambda x: 'US' + states[x]['fips'])
df.head(5)
Out[5]:
State 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 fips
0 Alabama 0.77 0.59 2.30 0.67 0.46 0.23 0.27 0.13 0.12 0.19 0.14 0.26 0.00 0.02 0.13 US01
1 Alaska 0.00 0.00 0.14 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 US02
2 Arizona 1.51 2.14 0.21 1.16 1.11 0.39 0.51 0.06 1.10 0.78 0.80 0.44 0.34 0.89 0.83 US04
3 Arkansas 0.11 0.39 9.15 0.15 0.13 0.01 0.04 1.39 0.42 0.00 0.00 0.00 0.05 0.00 0.05 US05
4 California 7.14 19.14 32.61 12.46 6.87 5.86 5.95 6.52 6.99 7.88 7.51 5.06 5.95 7.92 7.08 US06
In [6]:
df.to_csv('../static/data/csv/venture-capital-us-states.csv')

Map Preview


Ramiro Gómez

About this post

This post was written by Ramiro Gómez (@yaph) and published on May 22, 2014.


blog comments powered by Disqus