I’ve been looking for a way to apply some of my data skills to better understanding what is going on in the world of politics lately. Like many people, I was surprised by the outcome of the last election, and I wanted to see if I could work with some data to help me better understand it.
Under a new administration, the data has also taken on new meaning for me: there may be hints or clues in the results of the last presidential election that bear on next year's congressional races.
Not to mention that President Trump has claimed voter fraud is going on, so this data is as important as ever.
A lot has been made of the popular-vote totals, but of course elections in America are won and lost in the Electoral College. To really understand the results, we'll have to look at how individual counties voted and how those votes added up to each state's allocation of electoral votes.
With that in mind, I went looking for data and found a helpful compilation online. It isn't just raw data: it is set up as a Jupyter notebook, so the process of analyzing it is open for others to observe and reproduce.
In this case, the notebook uses the Python language to work with the data.
The data that is compiled includes:
- County-level results from the 2016 election, compiled by TownHall.com.
- County naming data from the US Census Bureau.
- County-level results from the 2012 election, compiled by the Guardian.
The notebook is neat because you can run every step yourself, from downloading the raw data, to loading it into data structures, to running comparisons.
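To give a flavor of the kind of comparison that workflow enables, here is a minimal sketch in pandas. The county figures and column names below are invented for illustration (they are not the actual TownHall.com or Guardian numbers); the idea is simply joining two elections' results on a shared county identifier and computing the swing.

```python
import pandas as pd

# Hypothetical county-level results keyed by FIPS code (illustrative
# numbers only, not the real compiled data).
results_2016 = pd.DataFrame({
    "fips": [1001, 1003, 1005],
    "dem_2016": [5936, 18458, 4871],
    "gop_2016": [18172, 72883, 5454],
})
results_2012 = pd.DataFrame({
    "fips": [1001, 1003, 1005],
    "dem_2012": [6363, 18424, 5912],
    "gop_2012": [17379, 66016, 5550],
})

# Join the two elections on the county FIPS code.
merged = results_2016.merge(results_2012, on="fips")

# Two-party Democratic vote share in each year, and the swing between them.
merged["dem_share_2016"] = merged["dem_2016"] / (merged["dem_2016"] + merged["gop_2016"])
merged["dem_share_2012"] = merged["dem_2012"] / (merged["dem_2012"] + merged["gop_2012"])
merged["swing"] = merged["dem_share_2016"] - merged["dem_share_2012"]

print(merged[["fips", "swing"]])
```

With real data, the same merge-then-compare pattern scales to all ~3,000 US counties, which is exactly the sort of step the notebook walks through.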
In the next part, I’ll actually use these foundations to start asking some interesting questions.