Community Lightning Talk, Marc Farra – Transcription
So next up. I don’t think Robin is here for the presentation. So I think Marc is next. [ Applause ] I’ll let everyone just introduce themselves so I can be out of the way. So I just start, then? Just talk about myself? Talk about yourself all you want. I’m Marc. Where’s your presentation? Right. Okay. So as before, I’m Marc. I work at Development Seed. And I want to talk a bit about our thoughts on machine mapping and validation. So machine mapping: we’ve heard a lot about it during this conference, especially yesterday during the talks in the morning. And it can be very scary, right? Machine mapping. Whooo! But fundamentally I want to raise the point that maps are still human objects, and robots make mistakes. For example, this is an output of our own machine learning algorithms. This is Skynet, if you’ve heard of it. It matches features on the ground; it does road detection. But this is an example of bad output: it saw that grid and just decided to draw squiggly lines that don’t match anything on the ground. We still need a human in the process to correct for that and validate the machine output. So if we can treat machines like novice mappers, we can start building validation pipelines for them. We can even personify machines. If they take feedback from the ground, we can say, “Alice is a machine mapper that’s great at mapping Seattle, but Bob is a machine that maps out Thailand very well.” So let’s talk about feedback loops. In a validation pipeline, you have to measure a behavior, correct for that behavior, and then give feedback to the mapper, whether machine or human. For measuring error, we already have tools in the community. We have OSMCha, and the OSMCha developers, who are here, made a great changeset analyzer. It can look for vandalism, like Pokémon Go edits, and flag them. To-fix, also by Mapbox, looks at feature-level problems like roads across buildings.
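The measure → correct → feedback loop described above can be sketched in a few lines. This is a minimal illustration, not any real OSM or OSMCha API; the class, function names, and the flag keys are all hypothetical, chosen to mirror the examples from the talk (To-fix-style feature checks, Skynet's squiggly-line failure mode).

```python
# Hypothetical sketch of the validation feedback loop from the talk:
# measure a behavior, correct for it, and feed the result back to the
# mapper, whether machine or human. Not a real OSM API.
from dataclasses import dataclass


@dataclass
class Mapper:
    """A mapper, human or machine ("Alice maps Seattle, Bob maps Thailand")."""
    name: str
    edits: int = 0
    rejected: int = 0

    @property
    def rejection_rate(self) -> float:
        # The feedback signal the talk asks for: how many edits were rejected.
        return self.rejected / self.edits if self.edits else 0.0


def measure(edit: dict) -> list:
    """Flag feature-level problems, the way To-fix flags roads across buildings."""
    flags = []
    if edit.get("road_crosses_building"):
        flags.append("road-crosses-building")
    if edit.get("squiggly_road"):  # e.g. Skynet drawing lines that match nothing
        flags.append("does-not-match-ground")
    return flags


def feedback(mapper: Mapper, flags: list) -> None:
    """Close the loop: record the outcome so it can inform the mapper."""
    mapper.edits += 1
    if flags:
        mapper.rejected += 1
```

For example, running Skynet's bad grid output through this loop would mark Alice's edit as rejected and raise her rejection rate, which is exactly the kind of signal a human validator (or a downstream correction step like Scrub) could act on.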
And maybe in the future we can look at machine learning algorithms that detect how humans are mapping and create rule sets for them to detect these errors early on. And then we can pit machines against each other, and I guess that will be fun. Next is correcting for error. Facebook showed off a version of iD which was great; it had linting built in. We’re building a linter from scratch called Skynet Scrub, or Scrub for short. It is fundamentally a cleaning tool for machine output, but it can also be used by human mappers. The gold standard right now is mappers correcting other mappers using Tasking Manager validation. Finally, we have feedback, the last part of that loop. So this is an existential question: what makes a good mapper? It’s very subjective. But we try to tackle it, for example, with some stats for Missing Maps, with badges and leaderboards. We started tracking total number of edits, total number of buildings, kilometers of roads mapped. That’s really good for motivation; that’s really good for bringing in mappers. You get a shiny badge that says you’re a super building mapper, and that will bring you back to the map. But it’s bad for feedback. It doesn’t tell you you’re a good mapper. That feedback could be how many of your edits were rejected, but we still don’t have that moderation layer at a fundamental level in the infrastructure. Yesterday we had a validation birds of a feather where we started talking about reputation systems. Moderators and levels and some things are not going to work well with the mission of the OSM community, but we’d like to hear your thoughts. If you’re interested in these things, in how to formalize validation workflows for both machine and human mappers, reach out to me and we can talk about it. These two links on the board are the machine learning model called Skynet and that cleaning tool. We’d like your feedback on what sort of validation can go into that tool. And that’s pretty much it. It’s short.
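The distinction drawn above, badge-style totals that motivate versus a rejection rate that actually tells you whether you are a good mapper, can be made concrete with a small sketch. Everything here is hypothetical (the edit-history shape, the function names); it is not how Missing Maps stats or any OSM moderation layer actually work.

```python
# Hypothetical sketch contrasting the two kinds of stats from the talk.
# `history` is an assumed list of per-edit dicts, not a real OSM data model.

def badge_stats(history: list) -> dict:
    """Missing Maps-style leaderboard totals: great for motivation,
    but they don't tell you whether your edits were any good."""
    return {
        "edits": len(history),
        "buildings": sum(1 for e in history if e.get("type") == "building"),
        "road_km": sum(e.get("length_km", 0.0)
                       for e in history if e.get("type") == "road"),
    }


def rejection_rate(history: list) -> float:
    """The feedback signal the talk wishes existed: what fraction of
    your edits did a validator or moderation layer reject?"""
    if not history:
        return 0.0
    return sum(1 for e in history if e.get("rejected")) / len(history)
```

A mapper could top the leaderboard on `badge_stats` while having a high `rejection_rate`; the talk's point is that only the second number is real feedback, and OSM's infrastructure does not yet record it.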
It’s a lightning talk. Robots can dance. [ Applause ] At 4:25, if you like OSM stats, we’re going to be talking more stats. It’s a birds of a feather; meet in that room too. Okay. Is there question time? Later. Later. Okay. Cool. Thanks, Marc.