Crowd-Sourced Data and Cycling – Transcription

All right. Let’s get mine up. Oops, I think I took the last slide. Oh, it’s started. I think we just need to reopen it. You tell me when. I’m ready. Yeah? Okay. Hi, I’m Martha Morrissey. I’m a graduate student here at CU Boulder in the department of geography, and today I’ll be discussing my research about crowdsourced data and cycling. It’s really important to understand cycling because it has significance for health, environment and policy. A recent survey of U.S. cities came out, and there was a correlation found between the happiest cities and the cities with the most bike lanes. So cycling is important. Traditionally, cycling was understood through manual cycling counts, where people were sitting at key intersections throughout cities and counting during peak morning and afternoon commute hours. But this is not happening every day of the year. This is happening on a quarterly basis or sometimes a monthly basis. But now there are emerging data sources that allow for a lot more data. People can record any bike ride or run they do with a smartphone or a GPS-enabled device. So in 2015 alone there were over 170 million cycling and running trips recorded. This is a heat map for the city of Chicago showing those trips. So Strava is a way to augment the cycling data collected through manual counts. Specifically, I’m researching how cycling flows in specific city corridors can be understood and modeled through crowdsourced data and traditional data sources. Another data source I’m really excited to be working with is OpenStreetMap data. And specifically from OpenStreetMap data I’m working with the road speed limits and the cycling infrastructure. What I really like about OpenStreetMap is that it tells the user if the infrastructure is on the road or separated on a path. Other key factors include population density, topographical data, slope, the bike share stations and how much they’re used, bike infrastructure and, again, the count data.
So first I’m working with the city of Chicago. Biking is already pretty well established there. There are over 200 miles of bike paths, and by 2020 they’re set to expand that to 645 miles. This is an interesting city to look at. I started by replicating a study from 2013 to understand the relationship between crowdsourced count data and manual count data. With this model, our results were consistent with the other study’s results, which was good. We found a correlation of about 0.8 between the crowdsourced and the manual cycling data. And month and cycle type were significant predictors. But this model isn’t the best way to approach this problem. It’s a good start. It’s an oversimplification because it treats the streets that people are cycling on as separate, but they’re interconnected. And that makes sense when you look at this map. We see the colored pink lines. The darker the pink, the more cyclists that have been on the road. There’s a pattern, and the GLM is not capturing that pattern. It’s important to think spatially. When you’re thinking about cycling, it’s important to look at clustering. We see the clustering of counts. This is one neighborhood in Chicago, but we found statistically significant clustering when we looked at the entire city. And you need to think temporally. Cycling ebbs and flows throughout the day. We care about the peak morning and peak afternoon hours and also how to adjust seasonally. Most people don’t cycle in the winter like that brave man. The data allows us to combine space and time, and the model should address both. We should think about space and time together, with time as the third dimension. Our next steps are working with more advanced models such as conditional autoregressive models, where a spatial weight matrix can go in as a term. We want to work with recurrent neural networks because they have been successful in the field of traffic prediction.
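The clustering and weight matrix ideas mentioned above can be sketched in a few lines. This is a minimal illustration, not the method used in the talk: the talk does not name its clustering statistic, so Moran’s I is assumed here as a common choice, and the segment counts, the adjacency matrix and the function name `morans_i` are all hypothetical.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for values observed on spatial units.

    values  : 1-D array of counts, one per street segment (hypothetical data)
    weights : square matrix, weights[i, j] > 0 when segments i and j are neighbours
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                      # deviations from the mean count
    cross = (w * np.outer(z, z)).sum()    # weighted cross-products of neighbours
    return len(x) / w.sum() * cross / (z ** 2).sum()

# Hypothetical example: five street segments in a row, each touching the next.
counts = np.array([120, 110, 95, 20, 15])
adjacency = np.zeros((5, 5))
for i in range(4):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1

i_stat = morans_i(counts, adjacency)       # positive: busy segments sit next to busy ones
i_alt = morans_i([1, 0, 1, 0, 1], adjacency)  # negative: neighbours alternate
```

A value well above zero suggests that similar counts cluster on neighbouring segments, which is the kind of pattern a model treating streets as independent would miss. The same kind of adjacency matrix is what a conditional autoregressive model takes as its neighbourhood structure.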
Once we get more models working, we want to expand to smaller cities, such as Farmingdale, New York, and St. Petersburg, Florida, to look at cycling in a variety of places. I want to close by emphasizing how powerful crowdsourced data is and that it has a place in shaping future policy decisions, such as those about bike infrastructure. I’d like to acknowledge and thank Strava Metro for data access, PeopleForBikes for helping me get data access, the CU Earth Lab and my adviser, Dr. Carson Farmer. [ Applause ]