Background

In 2020, Minnesota’s Department of Transportation put out a great visualization of Minneapolis’ surge in bicycle and pedestrian usage soon after Minnesota Governor Tim Walz declared a ‘Peacetime Emergency’ on March 13, 2020 and a ‘Stay at Home’ order twelve days later.

Goal

We will similarly explore Arlington, Virginia’s bicycle ridership numbers around Virginia Governor Northam’s State of Emergency Declaration on March 12, 2020 and Stay at Home order on March 30, 2020.

Below is the plot you will create using Arlington’s bike data.

Follow along or explore on your own

Subsequent posts for this project will outline step by step how I went about creating visualizations similar to the Minnesota DOT’s. I believe this is a great project for those new to Python, R, or databases. Each Part of this project will have a Goal section. You have the option to give the goals a shot on your own before looking at how I accomplished them. If you have any questions, comments, or are working on your own analysis, send me an email!

Requirements

The requirements for this project are listed below. I’ll document as best I can how to set up the MySQL database and connect everything together in Part 3.

  1. Python 3.6 or later
  2. R/RStudio
  3. Very basic Python and R knowledge

Note: To make my life easier, I installed the full Anaconda suite and use the Spyder IDE when working in Python. I played around with installing Miniconda, but managing the conda environments in R using the reticulate library gave me a headache.

The Datasets

One of the best parts of living in Arlington, Virginia is its beautiful paved bike paths, particularly the Arlington Loop. Riders of the loop probably have noticed the Bikeometer near the Francis Scott Key Memorial Bridge displaying the daily, monthly, and yearly bicycle counts.

To my delight, these numbers are not only publicly available but also easy to request through an API with detailed documentation. Additionally, Covid-19 case and vaccine data from the Virginia Health Department will be used to visualize how trail usage is affected by major events like a global pandemic.

Python

Python will be used to automate a number of actions:

  1. Request ridership counts from the BikeArlington API
  2. Request Covid-19 data from the Virginia Health Department on Total Number of Cases per Locality and Total Vaccines Administered
  3. Clean and format the data
  4. Store the data in a database
  5. Automate steps 1-3 for new data
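To give a feel for the "clean and format" step, here is a minimal sketch using only the standard library. The record layout (a `timestamp` string and a `count` string), the sample values, and the `daily_totals` helper are all assumptions made up for illustration; the actual BikeArlington response format is covered in Part 2 and may look quite different.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw records: field names, date format, and values are assumed
# for illustration and are not taken from the real API.
raw_records = [
    {"timestamp": "03/12/2020 08:00", "count": "14"},
    {"timestamp": "03/12/2020 09:00", "count": "21"},
    {"timestamp": "03/13/2020 08:00", "count": " 9 "},
    {"timestamp": "03/13/2020 09:00", "count": ""},  # missing reading
]


def daily_totals(records):
    """Sum counts per calendar day, skipping records with an empty count."""
    totals = defaultdict(int)
    for record in records:
        raw_count = record["count"].strip()
        if not raw_count:
            continue  # no reading for this hour, so leave it out of the total
        day = datetime.strptime(record["timestamp"], "%m/%d/%Y %H:%M").date()
        totals[day.isoformat()] += int(raw_count)
    # Return a plain dict ordered by date so it is easy to inspect or store
    return dict(sorted(totals.items()))


print(daily_totals(raw_records))
# {'2020-03-12': 35, '2020-03-13': 9}
```

Keying the totals on ISO-formatted dates keeps them sortable as plain strings, which is convenient once the data moves into a database.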

MySQL

One of my aims for this project was to create a pipeline that forgoes saving individual ‘data’ files (e.g. csv, xml, json, txt) to the computer and instead uses a database to store the data. After reading this great article that gives a good overview of SQLite, PostgreSQL, and MySQL, I decided on MySQL to handle my database. A second factor in my decision was that my favorite Udemy instructor, Colt Steele, had a MySQL course on sale: The Ultimate MySQL Bootcamp: Go from SQL Beginner to Expert. I highly recommend it to learn MySQL.
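The idea of writing straight to a database instead of to files can be sketched as follows. To keep the example self-contained and runnable anywhere, it swaps in Python's built-in `sqlite3` module with an in-memory database rather than MySQL; the table name `bike_counts`, its columns, and the `store_daily_counts` helper are hypothetical. With MySQL you would connect through a MySQL driver instead, and the upsert would be written with `INSERT ... ON DUPLICATE KEY UPDATE` rather than SQLite's `INSERT OR REPLACE`.

```python
import sqlite3


def store_daily_counts(conn, rows):
    """Insert (date, riders) rows, overwriting any date that already exists."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS bike_counts (
            count_date TEXT PRIMARY KEY,
            riders     INTEGER NOT NULL
        )
        """
    )
    # INSERT OR REPLACE is SQLite syntax: a row with an existing primary key
    # is replaced, so re-running the load for the same day is harmless.
    conn.executemany(
        "INSERT OR REPLACE INTO bike_counts (count_date, riders) VALUES (?, ?)",
        rows,
    )
    conn.commit()


# In-memory SQLite database standing in for the MySQL server (assumption)
conn = sqlite3.connect(":memory:")

# First load, then a later load that revises one day and adds a new one
store_daily_counts(conn, [("2020-03-12", 35), ("2020-03-13", 9)])
store_daily_counts(conn, [("2020-03-13", 12), ("2020-03-14", 40)])

stored = conn.execute(
    "SELECT count_date, riders FROM bike_counts ORDER BY count_date"
).fetchall()
print(stored)
# [('2020-03-12', 35), ('2020-03-13', 12), ('2020-03-14', 40)]
```

Using the date as the primary key means a scheduled job can pull overlapping date ranges without creating duplicate rows, which is one reason a database can be easier to keep current than a folder of dated files.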

R

R will be used to pull data out of the MySQL database and create visualizations. According to the Minnesota Department of Transportation’s report: Minnesota’s Walking and Bicycling Data Collection Report Update Annual Data from 2014 to 2019, “When irregularities are found, the data are run through a statistical model in ‘R Studio’ and cleaned up.” I hope to employ similar methods to analyze and clean the data in R.

Short Bike Story

Recently, after helping teach my friend how to ride a bike, we excitedly visited the local bike shop to hopefully make a purchase. Unfortunately, over a year into the pandemic, we were told if we ordered today, the bike would be delivered in 6 months.

Later that day, I visited the local used bike shop and was excitedly told that for $400, I could have a refurbished 1997 Bianchi Timber Wolf, a bike which, according to BikePedia.com, originally sold for $300. Either this bike had holographic Charizards in its spokes, or demand for bikes is still very high.

Check out Part 2, Query the Bike Arlington API

