Introducing: an API to download from OpenStreetMap

Brendan Ashworth

March 14, 2023

Today we're making our API to download data from OpenStreetMap public for the first time. It's the easiest way to extract features as GeoJSON from OSM and is another step towards making it easier to build complex applications with geospatial data.

We're offering access for free to most developers and are happy to support large volume workloads (>10 million features/month) for an infrastructure fee (starting at $49/month).

Downloading features from OpenStreetMap is hard

OSM is a massive project: with over 10 billion tagged points and polygons spanning every city and country in the world, it is probably the largest hand-crafted vector geospatial dataset. Everything from transit lines to farmland is labelled and stored in one gigantic database: the planet file, Planet.osm.

The planet file, and 1.7TB of complexities

The planet file contains all OpenStreetMap features and takes up 1710 GB uncompressed, about 3x the size of the entire Bitcoin ledger. Even in a compressed, delta-encoded protobuf format, the planet file takes up 67.7 GB of space. Unlike most geospatial data formats (which adhere to the Open Geospatial Consortium's "simple features"), OpenStreetMap builds features out of many dependent elements, so a single semantic feature is split across separate locations in the file.

protobuf schema for OpenStreetMap

OpenStreetMap features depend on other features in a super-relation > relation > way > node hierarchy. For example, a hospital complex (amenity=hospital) could be tagged as a relation, where each building is a component way inside that relation, and the coordinates of each building corner are stored separately as nodes.

hospital relation in OSM
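
To make the hierarchy concrete, here is a minimal Python sketch of a relation that points to ways, which in turn point to nodes. The class names and the `relation_to_rings` helper are hypothetical and deliberately simplified: real OSM relation members also carry roles and can reference nodes or other relations directly.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    lat: float
    lon: float


@dataclass
class Way:
    node_ids: list                      # ordered references to nodes, not coordinates
    tags: dict = field(default_factory=dict)


@dataclass
class Relation:
    way_ids: list                       # simplified: members are ways only
    tags: dict = field(default_factory=dict)


def relation_to_rings(relation, ways, nodes):
    """Resolve a relation into lists of (lon, lat) pairs.

    Every coordinate costs one node lookup, which is why a complex with
    100 corners needs roughly 100 separate lookups.
    """
    return [
        [(nodes[node_id].lon, nodes[node_id].lat) for node_id in ways[way_id].node_ids]
        for way_id in relation.way_ids
    ]
```

In this toy model `ways` and `nodes` are plain dictionaries keyed by ID; in the planet file, each of those dictionary lookups becomes a search through compressed blocks on disk.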

A building complex with 100 coordinates likely requires 100 seek + read operations on the planet.osm file, making random access in this file format expensive. Additionally, nodes, ways, and relations are stored in separate, compressed PrimitiveBlocks each containing 16MB of data, requiring these steps to retrieve a single node's coordinates:

  1. Find the PrimitiveBlock via binary search over node ID
  2. Seek to and read the PrimitiveBlock from disk (planet file is too large to stay in memory)
  3. Decompress (with zlib, although lz4 is supported) the 16MB PrimitiveBlock into memory
  4. Delta-decode the node IDs, latitudes, and longitudes (DenseNodes are delta-encoded to save space)
  5. Look up the target node ID
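
The steps above can be sketched in Python with a toy stand-in for the real format. This is an illustration under stated assumptions, not the actual PBF layout: the blocks here are zlib-compressed JSON rather than protobuf, an in-memory list replaces the disk seek, coordinates are stored as integers in units of 1e-7 degrees, and `pack_block` and `find_node` are hypothetical names.

```python
import json
import zlib
from bisect import bisect_left, bisect_right
from itertools import accumulate


def delta_encode(values):
    """Store each value as its difference from the previous one."""
    return [cur - prev for prev, cur in zip([0] + values[:-1], values)]


def pack_block(nodes):
    """Toy PrimitiveBlock: nodes is a list of (id, lat, lon) ints sorted by id."""
    ids, lats, lons = (list(column) for column in zip(*nodes))
    payload = {
        "id": delta_encode(ids),
        "lat": delta_encode(lats),
        "lon": delta_encode(lons),
    }
    return zlib.compress(json.dumps(payload).encode())


def find_node(first_ids, blocks, node_id):
    """Return (lat, lon) in degrees for node_id, or None if it is absent."""
    # 1. Binary search over the first node ID of each block.
    block_index = bisect_right(first_ids, node_id) - 1
    if block_index < 0:
        return None
    # 2-3. "Read" the block (a list access here, a disk seek in practice)
    #      and decompress the whole thing, even though we want one node.
    payload = json.loads(zlib.decompress(blocks[block_index]))
    # 4. Delta-decode IDs, latitudes, and longitudes with a running sum.
    ids = list(accumulate(payload["id"]))
    lats = list(accumulate(payload["lat"]))
    lons = list(accumulate(payload["lon"]))
    # 5. Look up the target node ID inside the decoded block.
    position = bisect_left(ids, node_id)
    if position == len(ids) or ids[position] != node_id:
        return None
    return lats[position] / 1e7, lons[position] / 1e7
```

Even in this small model, fetching a single coordinate means decompressing and decoding an entire block, which is where the cost of random access comes from.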

These constraints are inherent to the file spec and lead to common frustrations. For example, the open source export CLI Osmium creates an index for node locations that requires 66 GB of RAM (before processing!), often crashing huge servers with an "Out of memory" error 2 days into an osmium export command.

Using Overpass API in production

The next best way to download from OpenStreetMap is Overpass API, a cluster of community-contributed servers that implement Overpass QL, a specialized query language. It's accompanied by the open source Overpass Turbo, an (extremely useful!) interface for exploring OSM data.

Overpass Turbo in Sydney, Australia

The Overpass API, however, is difficult to use for developers new to the OpenStreetMap ecosystem because of throughput throttling, query complexity, and usage limits. Queries time out beyond 180 seconds, and users are asked to stay under 10,000 queries and 5 GB of downloaded data per day.

This is a double-edged sword: Overpass API has extremely powerful features, like pipelines of search criteria that let you search for features near other features. But many geospatial developers just want to download OpenStreetMap features as GeoJSON and render them on a web map!

Introducing: an API to download from OpenStreetMap

Simply put, our new API lets you filter OpenStreetMap features and download them as GeoJSON. You can do this with a single GET request:

curl --get 'https://osm.buntinglabs.com/v1/osm/extract' \
     --data "tags=aeroway%3Drunway" \
     --data "api_key=YOUR_API_KEY_HERE" \
     --data "bbox=-112.162,40.459,-111.791,40.876"

This downloads all airport runways near Salt Lake City (the bbox parameter is optional and can be removed to download the planet's runways). Here's a sample of the output GeoJSON, which can be piped into jq, loaded into MapLibre, or imported to your favorite GIS software:

{
  "type": "Feature",
  "properties": {
    "aeroway": "runway",
    "length": "1412",
    "ref": "17/35",
    "surface": "asphalt",
    "width": "21"
  },
  "geometry": {
    "type": "MultiLineString",
    "coordinates": [[
      [ -111.9274648, 40.861828599999996 ],
      [ -111.9274648, 40.861968399999995 ],
      ...
    ]]
  }
}
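
For readers who prefer Python to curl, the same request can be sketched with only the standard library. The endpoint and the `tags`, `api_key`, and `bbox` parameters come from the curl example above; the helper names are hypothetical, the bbox order is assumed to be min longitude, min latitude, max longitude, max latitude (inferred from the Salt Lake City values), and the code accepts either a FeatureCollection or a single Feature because the exact response envelope is not shown here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
EXTRACT_URL = "https://osm.buntinglabs.com/v1/osm/extract"


def build_extract_url(tags, api_key, bbox=None):
    """Build the GET URL; bbox is an optional 4-tuple of floats."""
    params = {"tags": tags, "api_key": api_key}
    if bbox is not None:
        params["bbox"] = ",".join(str(value) for value in bbox)
    # safe="," keeps the bbox commas literal, matching the curl request.
    return EXTRACT_URL + "?" + urlencode(params, safe=",")


def iter_features(geojson):
    """Yield features from a FeatureCollection or a lone Feature (assumption)."""
    if geojson.get("type") == "FeatureCollection":
        yield from geojson.get("features", [])
    elif geojson.get("type") == "Feature":
        yield geojson


def fetch_features(tags, api_key, bbox=None, timeout=60):
    """Perform the request; needs network access and a valid API key."""
    url = build_extract_url(tags, api_key, bbox)
    with urlopen(url, timeout=timeout) as response:
        return list(iter_features(json.load(response)))
```

Calling `fetch_features("aeroway=runway", "YOUR_API_KEY_HERE", (-112.162, 40.459, -111.791, 40.876))` would issue the same query as the curl command.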

Ways to use this new API

Some popular use cases for extracting features from OSM:

  1. Downloading OSM data for geopandas analysis: especially useful to geospatial data scientists and urban planners
  2. Importing OSM features into a PostGIS table: useful for lookups like nearby trails and amenities in a geospatial app backend
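
As a rough illustration of the PostGIS case, the sketch below turns GeoJSON features into rows for a parameterized insert. The `osm_features` table, its `tags` and `geom` columns, and the helper name are hypothetical; `ST_GeomFromGeoJSON` and `ST_SetSRID` are standard PostGIS functions, and the `%s` placeholders assume a psycopg-style driver.

```python
import json

# Hypothetical table: osm_features(tags jsonb, geom geometry(Geometry, 4326)).
INSERT_SQL = (
    "INSERT INTO osm_features (tags, geom) "
    "VALUES (%s::jsonb, ST_SetSRID(ST_GeomFromGeoJSON(%s), 4326))"
)


def feature_to_row(feature):
    """Serialize one GeoJSON feature into (tags_json, geometry_json) parameters."""
    tags_json = json.dumps(feature.get("properties", {}))
    geometry_json = json.dumps(feature["geometry"])
    return tags_json, geometry_json


# With a psycopg-style cursor (not imported here), the rows could be loaded as:
#   cursor.executemany(INSERT_SQL, [feature_to_row(f) for f in features])
```

Keeping the tags as JSON preserves whatever OSM keys a feature happens to carry, at the cost of needing JSON operators in later queries.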

We also have an overview of using OpenStreetMap data in geospatial apps, which is a good starting point for beginners.

What's now easier with OpenStreetMap

We hope this API is so easy to use that it enables new and interesting use cases on top of OpenStreetMap. Here's a short list of new technology we think could be developed with easy extracts:

  • serverless geocoding (like Nominatim)
  • client-side (JavaScript) routing
  • dynamic vector tiles (without running tippecanoe on an entire extract)
  • contextual data like our neighborhood description API

If you thought this geospatial product announcement was interesting, you can also follow us on GitHub for our open source drops and Twitter to watch us build! We're always sharing new things about GIS and geospatial development.