Community Lightning Talk, Brian DeRocher – Transcription

Brian is next. Talking about voting districts. Let’s see, your slides are... oh. Did you find it? Yes. It’s this one here. Full screen. Cool.
All right. So who is familiar with the concept of gerrymandering? All right. And who’s really pissed off about it? Thank you. Really chaps my ass. So it’s pretty bad in Virginia, where I’m from. And I was working... I’m going to tell a story about a side project I did. I came across two things. One is the census dot map. Anybody seen the dot map? The racial dot map. It’s cool. Check it out. And something I was doing at work is K-means clustering. So I took these two ideas and I thought, well, we can bring these together and we can generate congressional districts. And kind of naively, I took a stab at that. I’ll let you read this, because it’s just funny.
All right. So there’s two parts of the project. One was to get census data and scale it down, just for performance reasons. I did one in a thousand points. And then distribute people or households proportionately across the state. And then the second part is to use a K-means algorithm to build the districts. So here’s the census dot map. It’s pretty neat. This map has only people on it. And nothing else. You get to see some of the infrastructure just pop out. And the density.
Now, who is familiar with K-means? This is pretty cool. For those not familiar with K-means, “means” means average. K averages. So we see six averages. And the algorithm, it’s not fast, but not slow. It’s an iterative process where you put down the middles of these districts and they kind of migrate into position. Since so many people are familiar with this, I’m just going to skip over the details.
So what I came up with was this map. And it built 11 districts. These are U.S. congressional districts. It’s not very clean. I just put a convex hull around it. So the borders aren’t very well done. But it is possible.
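
For readers who have not seen K-means, the iterative process described above can be sketched roughly as follows. This is a minimal illustration and not the code from Brian's project: the function name `kmeans`, its parameters, and the use of plain `(x, y)` tuples with straight-line (Euclidean) distance are all assumptions made for the example.

```python
import random

def kmeans(points, k, iterations=50, seed=0):
    """Cluster 2D points into k groups (hypothetical helper, Euclidean distance)."""
    rng = random.Random(seed)
    # Start with k of the input points as the initial "middles".
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(
                range(k),
                key=lambda j: (p[0] - centers[j][0]) ** 2 + (p[1] - centers[j][1]) ** 2,
            )
            for p in points
        ]
        # Update step: each center moves to the average of its members.
        for j in range(k):
            members = [p for p, label in zip(points, new_labels) if label == j]
            if members:
                centers[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
        # Stop once no point changes district.
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels
```

Calling `kmeans(points, 11)` on the sampled population points would give 11 centers and a district label per point, under the assumptions above.
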
And one of the things... there’s a mistake in this map: this district right here goes across the water. That doesn’t make a lot of sense to me. The problem is that it was using the Euclidean distance. You can do better: use driving distance or public transportation. Or drop OpenTripPlanner in there and see multimodal transportation. So that’s what I started doing. I’m honestly not done with that yet, still making improvements. I’m using pgRouting, which is a routing library for Postgres. One of the nice things that pgRouting does is it solves the problem of generating routes from thousands and thousands of points to a smaller subset, say, 11 points. And it does that in a single pass. It doesn’t have to find the route for person A and person B and person C and person D. It reuses the knowledge that it gained from the first person in generating the routes for the successive people. So it’s pretty fast. And that’s all I have for now. But if you’d like to contribute, there’s my project. And thanks a lot.
[ Applause ]
That’s cool. And it’s promising. I’m curious to see the results of that when you do that routing stuff.
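
The idea of swapping Euclidean distance for travel distance, and of not routing each person separately, can be sketched like this. It is a simplified illustration and not pgRouting's implementation or Brian's code: it assumes an undirected road graph stored as a dictionary, and the names `dijkstra` and `assign_by_network_distance` are hypothetical. One shortest-path search per district center yields the distance to every person at once, so 11 searches stand in for thousands of individual routes.

```python
import heapq

def dijkstra(graph, source):
    """Shortest travel distance from source to every reachable node.

    graph maps node -> list of (neighbor, edge_length); assumed undirected.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, length in graph.get(node, []):
            nd = d + length
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

def assign_by_network_distance(graph, people, centers):
    """Assign each person node to the center with the shortest travel distance."""
    # One search per center covers all people at once.
    trees = {c: dijkstra(graph, c) for c in centers}
    return {
        p: min(centers, key=lambda c: trees[c].get(p, float("inf")))
        for p in people
    }
```

With a graph where a water crossing is a long edge (or missing entirely), people on either shore end up with the center they can actually travel to, which is the behavior the Euclidean version lacked.
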