Following our first Beta release, we are very happy to announce that, with support from CARTO, all Overture data is now natively accessible in Google Cloud BigQuery and Snowflake. In BigQuery, the data is available in the BigQuery Public Data Project; in Snowflake, it is available as a free listing on the Snowflake Marketplace. For detailed instructions on accessing the data, please see CARTO’s site here.

This includes all of the Overture data themes. The schema and structure of the data are kept as close as possible to the original schema of the official releases. The data will be updated as frequently as the Overture data (currently monthly), although with a slight delay after the initial release date. As with the data itself, this publication should be considered beta, and CARTO is interested in companies trying the data and giving feedback.

Since the data is published as data shares in BigQuery and Snowflake, you can reference it directly in your SQL without having to do any ETL. Users won’t have to pay for data storage, only for computation done on the data. CARTO will update the data with every official release, so it will be refreshed at the same rate as Overture.
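
As a rough illustration of querying the shared data in place, here is a minimal Python sketch that builds a BigQuery-style SQL statement selecting features inside a bounding box. The table path is a placeholder, not the real location (CARTO’s instructions list the actual one), and the `id` and `geometry` column names are assumptions based on the Overture schema. The sketch only produces the query text; running it requires your own BigQuery client and credentials.

```python
def bbox_to_wkt(xmin: float, ymin: float, xmax: float, ymax: float) -> str:
    """Return a counter-clockwise WKT polygon for a lon/lat bounding box."""
    if not (xmin < xmax and ymin < ymax):
        raise ValueError("expected xmin < xmax and ymin < ymax")
    ring = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax), (xmin, ymin)]
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"


def build_bbox_query(table: str, xmin: float, ymin: float,
                     xmax: float, ymax: float, limit: int = 100) -> str:
    """Build a BigQuery-style query for features intersecting a bounding box.

    `table` is a trusted, caller-supplied identifier (hypothetical here);
    the `id` and `geometry` columns are assumed, not confirmed.
    """
    wkt = bbox_to_wkt(xmin, ymin, xmax, ymax)
    return (
        "SELECT id, geometry\n"
        f"FROM `{table}`\n"
        f"WHERE ST_INTERSECTS(geometry, ST_GEOGFROMTEXT('{wkt}'))\n"
        f"LIMIT {int(limit)}"
    )


# Placeholder table path; replace it with the one from CARTO's instructions.
query = build_bbox_query("your-project.your_dataset.building",
                         -74.02, 40.70, -73.97, 40.75)
print(query)
```

Because the table lives in a data share, a query like this reads the data where it sits. There is no copy step, and you pay only for the computation the query uses.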

The archive is not being replicated; if you want to access it, you will still need to use the official distribution.

Why this matters

Overture Maps data, published as GeoParquet files, is currently hosted on AWS S3 and Azure Blob Storage. This makes it easy to access the data by querying it or pulling only the parts needed for analysis. Users can select specific feature types, such as buildings or roads, and pull the data within specific bounds or areas.

However, working with this dataset at planetary scale can be difficult. The data in Parquet form is about 200 gigabytes, and it gets larger when expanded. Organizations wanting to use the complete dataset would need to move that data from the Overture Maps source datasets into their cloud system of choice.

By using the CARTO data shares, organizations now only have to reference the dataset in a query to access it, and they can use it in conjunction with their own organizational data. With Overture’s worldwide data available from one location, the use cases are nearly endless.
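
To make the idea of combining Overture data with your own data concrete, here is a hedged Python sketch that builds a BigQuery-style join counting Overture buildings within a given distance of each row in an organization’s own table. Every table and column name (`store_id`, `geom`, `id`, `geometry`) is hypothetical and chosen purely for illustration; `ST_DWITHIN` is a standard BigQuery geography function.

```python
def build_nearby_buildings_query(own_table: str, buildings_table: str,
                                 radius_m: float) -> str:
    """Build a query counting buildings within `radius_m` meters of each site.

    Assumes (hypothetically) that `own_table` has `store_id` and a GEOGRAPHY
    column `geom`, and that `buildings_table` has `id` and `geometry`.
    Table names are trusted, caller-supplied identifiers.
    """
    if radius_m <= 0:
        raise ValueError("radius_m must be positive")
    return (
        "SELECT s.store_id, COUNT(b.id) AS building_count\n"
        f"FROM `{own_table}` AS s\n"
        f"JOIN `{buildings_table}` AS b\n"
        f"  ON ST_DWITHIN(s.geom, b.geometry, {float(radius_m)})\n"
        "GROUP BY s.store_id"
    )


# Both table paths are placeholders for illustration only.
join_query = build_nearby_buildings_query(
    "your-project.your_dataset.stores",
    "your-project.overture_dataset.building",
    500,
)
print(join_query)
```

The point of the sketch is that both tables are addressed in the same statement, so the join runs inside the warehouse without first exporting or loading the Overture data.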

Overture data for analytics

Overture started with the idea to create reliable, easy-to-use, and interoperable open map data as a shared asset that any map service provider or developer can use to power richer mapping services. Beyond map service providers, the data in Overture can drive geospatial analytics across many use cases. We are seeing growing interest in using the data in combination with other datasets. Now, with the Overture data available in BigQuery and Snowflake, building these data pipelines just became much easier.

For example, CARTO used Overture data to calculate flood risk profiles for buildings by intersecting Overture buildings data with flood forecasts, supporting insurance companies.

Thank you CARTO

Many thanks to CARTO for doing the work to create this publication and committing to maintain it. They offer analytics and mapping capabilities on top of BigQuery and Snowflake, including a tiler to process large amounts of spatial data. Please check out their work. 

In the meantime, we invite you to get started with our data and share with us your comments and feedback.