---
title: "Visualizing Arlington Bikeometers"
subtitle: "Part 7: Investigating Bikeometer GPS Locations"
output:
  html_document: 
    toc: yes
    toc_depth: 2
    toc_float: yes
    highlight: zenburn
    code_download: true
    df_print: paged
    includes:
      in_header: header.html
---

   

# Goal

Compare the GPS locations of the chosen Bikeometers against a popular trail: the Arlington Loop.

# Mapping the Bikeometers

I'll map the locations of the Bikeometers to see exactly where they are in Arlington. To accomplish this, I found [this video](https://www.youtube.com/watch?v=2k8O-Y_uiRU&t=3s) by Professor Lisa Lendway and [her website](https://mapping-in-r.netlify.app/) very helpful.

![](images/Arlington%20Open%20Street%20Map.png "Arlington Open Street Map bbox")

The numbers on the left of the screenshot above are the values to enter into the bbox argument below in order to get the map from [Stamen Maps](http://maps.stamen.com/). I've chosen 'terrain' as my map type. The zoom level appears in the OpenStreetMap URL just after the text 'map=' (13 in my case), although the code below requests a 'zoom' of 12.

```{r message=FALSE}
library(ggmap)
library(ggthemes)
# Get the map information
arlington <- get_stamenmap(
    bbox = c(left = -77.2377, bottom = 38.8075, right = -76.9744, top = 38.9080), 
    maptype = "terrain",
    zoom = 12)
```
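
As a rough alternative to copying the bounding box by hand, the box could also be derived from a set of coordinates. This is only a minimal sketch and not what I did above: bbox_from_points() is a hypothetical helper, the coordinates are made up, and the 'pad' fraction is an arbitrary assumption.

```{r}
# Hypothetical helper: build a named bbox vector from longitude/latitude values,
# padded on each side by a fraction of the coordinate range.
bbox_from_points <- function(lon, lat, pad = 0.05) {
  lon_range <- range(lon, na.rm = TRUE)
  lat_range <- range(lat, na.rm = TRUE)
  lon_pad <- diff(lon_range) * pad
  lat_pad <- diff(lat_range) * pad
  c(left   = lon_range[1] - lon_pad,
    bottom = lat_range[1] - lat_pad,
    right  = lon_range[2] + lon_pad,
    top    = lat_range[2] + lat_pad)
}

# Made-up coordinates, roughly in the Arlington area
example_lon <- c(-77.20, -77.00, -77.10)
example_lat <- c(38.82, 38.90, 38.86)
example_bbox <- bbox_from_points(example_lon, example_lat, pad = 0.1)
example_bbox
```

The result has the same left/bottom/right/top names as the vector I typed in manually, so it could be passed to the bbox argument in the same way.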

Next, I'll import the Bikeometer data from my database, which contains the longitudes and latitudes needed to plot the Bikeometers.

```{r message=FALSE, error=FALSE}
library(DBI)
library(dplyr)
con <- dbConnect(odbc::odbc(), dsn = "Bike_MySQL")
sql_cmd <- "SELECT * FROM counts.bikeometer_details WHERE bikeometer_id in (14,15,16,18,22,31,39)"
# dbGetQuery() runs the query and returns the full result as a data frame
bikeometer_table <- dbGetQuery(con, sql_cmd)
bikeometer_table
```
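
If the list of chosen Bikeometers changes often, the query string could be assembled from a vector of IDs instead of being typed out. The following is a sketch under that assumption; build_bikeometer_query() is a hypothetical helper, and it only builds the string without touching the database.

```{r}
# Hypothetical helper: build the IN (...) query from a vector of IDs.
# as.integer() plus the NA check keeps non-numeric input out of the SQL string.
build_bikeometer_query <- function(ids) {
  ids <- suppressWarnings(as.integer(ids))
  stopifnot(length(ids) > 0, !anyNA(ids))
  sprintf(
    "SELECT * FROM counts.bikeometer_details WHERE bikeometer_id in (%s)",
    paste(ids, collapse = ",")
  )
}

chosen_ids <- c(14, 15, 16, 18, 22, 31, 39)
example_sql <- build_bikeometer_query(chosen_ids)
example_sql
```

The resulting string matches the hand-written query above and could be passed to dbGetQuery() in the same way.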

One Bikeometer stands out as not even being in Arlington... how did that get in there? Let's plot it anyway to see how close it is to Arlington. Realistically, I frequently bike into Alexandria on the weekends, but I'll probably end up excluding it nonetheless.

```{r}
# Convert latitude and longitude to numeric (decimal) values
bikeometer_table$longitude <- as.numeric(bikeometer_table$longitude)
bikeometer_table$latitude <- as.numeric(bikeometer_table$latitude)

# Plot the points on the map
ggmap(arlington) + # creates the map "background"
  geom_point(data = bikeometer_table, 
             aes(x = longitude, y = latitude, color = region), 
             alpha = 0.5, 
             size = 3) +
             theme_map()
```
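
A quick way to double-check which points fall outside the map is to test each coordinate against the bounding box. This is a minimal sketch with made-up points; points_in_bbox() is a hypothetical helper, and the bbox values are the ones I passed to get_stamenmap above.

```{r}
# Hypothetical helper: TRUE for coordinates that fall inside the bbox,
# FALSE for coordinates outside it or with a missing value.
points_in_bbox <- function(lon, lat, bbox) {
  !is.na(lon) & !is.na(lat) &
    lon >= bbox[["left"]] & lon <= bbox[["right"]] &
    lat >= bbox[["bottom"]] & lat <= bbox[["top"]]
}

map_bbox <- c(left = -77.2377, bottom = 38.8075, right = -76.9744, top = 38.9080)

# Made-up points: one inside the box, one south of it, one with a missing value
example_points <- data.frame(
  id        = c("a", "b", "c"),
  longitude = c(-77.10, -77.05, NA),
  latitude  = c(38.88, 38.79, 38.85)
)
example_points$in_map <- points_in_bbox(
  example_points$longitude, example_points$latitude, map_bbox
)
example_points
```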

In my experience, the most popular bike trails in Arlington are on the Arlington Loop. I want to get a sense of whether these Bikeometers are on the Arlington Loop and, if not, choose Bikeometers that are on the loop. My hypothesis is that bike trails that weren't popular before the pandemic won't see much change after the pandemic. I generally ride either the Arlington Loop or the W&OD Trails, as they are the most bike-friendly. A trail that was less bike-friendly before the pandemic won't become that much more bike-friendly during a pandemic, even if there is a reduction of traffic in the city.

I'll use the GPS coordinates from my Garmin GPS watch to create a layer for my plot that shows the Arlington Loop trail. I downloaded the GPX file from the Garmin Connect website.

![](images/GPX%20file%20from%20Garmin%20Connect%20Website.jpg "Garmin Connect GPX")

I used the websites [GPXStudio](https://gpxstudio.github.io/) and [MyGeodata Converter](https://mygeodata.cloud/converter/gpx-to-csv) to remove unnecessary data points and to convert from GPX to CSV, respectively.

I then read in the CSV, which I'll use to plot the Arlington Loop.

```{r}
arlington_loop_table <- read.csv('~/Github/Arlington_Bikeometer_Visualizations/Data/arlington_loop.csv')
arlington_loop_table
```
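
For reference, a track can also be thinned directly in R. The sketch below uses plain every-nth-point thinning, which is an assumption on my part and not necessarily what GPXStudio does; thin_track() is a hypothetical helper and the track is made up.

```{r}
# Hypothetical helper: keep every nth track point, always keeping the last one
thin_track <- function(track, every = 5) {
  keep <- seq(1, nrow(track), by = every)
  keep <- union(keep, nrow(track))
  track[keep, , drop = FALSE]
}

# Made-up track with 12 points
example_track <- data.frame(
  longitude = seq(-77.10, -77.05, length.out = 12),
  latitude  = seq(38.85, 38.90, length.out = 12)
)
thinned_track <- thin_track(example_track, every = 5)
nrow(thinned_track)
```

Keeping the last point means the thinned track still ends where the original one does.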

I passed the 'arlington' map object to ggmap to set the background image to Arlington, then plotted the Arlington Loop using geom_point. The 'longitude' column is the x-axis and the 'latitude' column is the y-axis; 'alpha' is the transparency of the points (0 to 1) and 'size' is the size of each point. I used 'theme_map', which removes the 'x' and 'y' axes.

```{r}
ggmap(arlington) + # creates the map "background"
  geom_point(data = arlington_loop_table, 
             aes(x = longitude, y = latitude), 
             alpha = 0.5, 
             size = .5) +
             theme_map()
```

Next, I'll create an Arlington Loop layer so I can overlay it on top of the Bikeometer location map. I found two options to create this layer: geom_polygon or geom_path. The biggest difference I see is that you can specify a fill for geom_polygon, but since I don't want to fill the layer in, I'll use geom_path.

```{r}
ggmap(arlington) + geom_polygon(aes(x = longitude, y = latitude),
                             data = arlington_loop_table, fill = 'blue', alpha = 0.1, size = 1, color = "red")
```

```{r}
ggmap(arlington) + geom_path(data = arlington_loop_table, aes(x = longitude, y = latitude), size = 1, color = 'red') + theme_map()

```
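
The difference between the two geoms is easier to see on a tiny example without a map background. This is a sketch with made-up coordinates: geom_path connects the points in row order and leaves the shape open, while geom_polygon joins the last point back to the first and can be filled.

```{r}
library(ggplot2)

# Made-up open track: four corners of a square, without returning to the start
open_track <- data.frame(
  longitude = c(0, 1, 1, 0),
  latitude  = c(0, 0, 1, 1)
)

# geom_path draws three sides of the square and leaves the fourth side open
path_plot <- ggplot(open_track, aes(x = longitude, y = latitude)) +
  geom_path(color = "red")

# geom_polygon closes the square and shades its interior
polygon_plot <- ggplot(open_track, aes(x = longitude, y = latitude)) +
  geom_polygon(fill = "blue", alpha = 0.1, color = "red")

path_plot
polygon_plot
```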

If I wanted the path to change color according to elevation, I could use the \<color = ele\> argument below. This will generate a legend on the map. Adding \<theme(legend.background = element_blank())\> removes the background from the legend. However, I won't be using this code.

```{r}
ggmap(arlington) + geom_path(data = arlington_loop_table, aes(x = longitude, y = latitude, color = ele), size = 1) + theme_map() + theme(legend.background = element_blank())
```

I'll assign the geom_path layer to the arlington_loop_layer variable so it can be drawn on top of the plot of the Bikeometer locations.

```{r}
arlington_loop_layer <- geom_path(data = arlington_loop_table, aes(x = longitude, y = latitude), size = 1, color = 'red')
```

```{r}
ggmap(arlington) + # creates the map "background"
  geom_point(data = bikeometer_table, 
             aes(x = longitude, y = latitude, color = region), 
             alpha = 0.5, 
             size = 3) +
              arlington_loop_layer +
             theme_map()
```

It looks like only 1 or 2 of the chosen Bikeometers are on the trail. Let's load every Bikeometer to see which ones are actually on the Arlington Loop. I can use that data to pick some popular trails to visualize.

```{r}
sql_cmd <- "SELECT * FROM counts.bikeometer_details"
# dbGetQuery() runs the query and returns the full result as a data frame
all_bikeometer_table <- dbGetQuery(con, sql_cmd)
all_bikeometer_table$longitude <- as.numeric(all_bikeometer_table$longitude)
all_bikeometer_table$latitude <- as.numeric(all_bikeometer_table$latitude)
all_bikeometer_table
```
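
Besides eyeballing the map, the question of which Bikeometers sit on the loop could be answered numerically by measuring how far each one is from the nearest track point. The sketch below is only an illustration with made-up data: haversine_m() and distance_to_track() are hypothetical helpers, the 100 m threshold is an arbitrary assumption, and it measures the distance to the nearest recorded point rather than to the nearest segment of the path.

```{r}
# Great-circle (haversine) distance in metres between points given in degrees
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: distance from each counter to its nearest track point
distance_to_track <- function(points, track) {
  vapply(seq_len(nrow(points)), function(i) {
    min(haversine_m(points$longitude[i], points$latitude[i],
                    track$longitude, track$latitude))
  }, numeric(1))
}

# Made-up data: a short north-south track and two counters
example_loop <- data.frame(
  longitude = rep(-77.10, 5),
  latitude  = seq(38.85, 38.89, length.out = 5)
)
example_counters <- data.frame(
  bikeometer_id = c(101, 102),
  longitude     = c(-77.1002, -77.05),
  latitude      = c(38.87, 38.87)
)

example_counters$dist_m <- distance_to_track(example_counters, example_loop)
example_counters$on_loop <- example_counters$dist_m <= 100  # assumed threshold
example_counters
```

Because the distance is measured to recorded points only, this works best when the track is densely sampled; a heavily thinned track would call for a larger threshold.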

To figure out labeling, I used [this site](https://rafalab.github.io/dsbook/ggplot2.html).

```{r message=FALSE}
ggmap(arlington) + # creates the map "background"
  geom_point(data = all_bikeometer_table, 
             aes(x = longitude, y = latitude, color = region), 
             alpha = 0.5, 
             size = 10) +
  arlington_loop_layer +
  geom_text(data = all_bikeometer_table,
            aes(x = longitude, y = latitude, label = bikeometer_id),
            nudge_x = 0.01, color = 'black', inherit.aes = FALSE) +
  theme_map()
```

## Digression

For some reason, bikeometer_id 6 isn't in my database. I'll quickly convert my list of Bikeometer IDs into integers so I can use Python's sorted() function to order them.

```{r include=FALSE, cache=TRUE}
library(reticulate)
```

```{python}
bikeometer_list = ['33','30','43','24','59','56','47','48','10','20',
                           '35','57','18','3','58','61','62','38','44','14',
                           '60','5','6','42','37','27','26','8','7','51','52',
                           '45','22','21','36','34','41','9','39','16','15',
                           '54','55','31','28','11','2','25','19']

# Convert to integers
bikeometer_list = [int(i) for i in bikeometer_list]
sorted_bikeometer_list = sorted(bikeometer_list)
sorted_bikeometer_list
```

With them ordered, I'll check to see which Bikeometers are in my database.

It looks like Bikeometers 6 and 52 are not in my database. When I requested data for them over a few different time ranges, neither returned any data, which explains why they aren't in my database. One mystery solved!
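
The comparison itself could be done in one step with a set difference. This is a minimal sketch using a made-up subset of IDs; in practice the database IDs would come from a query on the bikeometer_id column, as hinted in the comment.

```{r}
# Made-up subset of the IDs in the list above
example_api_ids <- c(7, 51, 52, 5, 6, 8)

# Made-up database IDs. In practice these could come from something like
# dbGetQuery(con, "SELECT bikeometer_id FROM counts.bikeometer_details")$bikeometer_id
example_db_ids <- c(5, 7, 8, 51)

# IDs that are in the list but not in the database, in ascending order
missing_ids <- sort(setdiff(example_api_ids, example_db_ids))
missing_ids
```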

## Digression Over

Right now, the bikeometer_id labels are ugly. Without the 'nudge_x' parameter, some labels become a little more readable and others a lot less readable. My next goal is to make the Bikeometer locations easier to read and see so I can pick the ones that are on the Arlington Loop. Ideally, each point would have an arrow pointing to its location with a label at the other end of the arrow. I wonder if there is a package that can make something like this...
