Overture Maps recently hosted an informative webinar titled “Working With Overture Data: A Step-by-Step Guide” featuring a core member of the Overture team – Jennings Anderson, Ph.D., Research Scientist at Meta. The webinar covered Overture’s initial data release, which came out in late July; the decision to adopt the cloud-native Parquet format for data storage; and demonstrations of how to use the data in various applications, particularly map visualizations.
For those interested in a deep dive, we highly recommend watching the full webinar recording for practical demonstrations and insights. Watch the Webinar Here.
Overture Data Cloud-Native Parquet Format
One of the webinar’s highlights was the in-depth discussion on how to access and utilize Overture data. Jennings demonstrated the use of Amazon Athena as a platform for query-based data access. He emphasized that the data is stored in Parquet files, allowing users to manipulate and download the data based on their specific needs. He further explained that Overture Maps uses the cloud-native Parquet format for several reasons:
- Efficiency and Scalability
The Parquet format is optimized for efficient storage and high-performance data processing. This makes it ideal for handling Overture’s large datasets, which currently total about 200 gigabytes if downloaded in full.
Users don’t have to download the full dataset before they can start working with it. They can query specific data subsets directly from the cloud using tools like Amazon Athena, Microsoft Azure Synapse, or DuckDB, then transform, download, and analyze only the data they need without managing large files locally.
- Versatility in Data Conversion
The Apache Parquet format can be converted into various other data formats based on the user’s needs. During the webinar, Jennings demonstrated how Overture data could go from Parquet in the cloud to local CSV, GeoJSON, shapefiles, and even directly into map tiles for rendering.
- Enabling Data Integration via GERS
The upcoming Global Entity Reference System (GERS) will assign unique, stable identifiers to features across different themes in Overture. This will make it easier for external data providers to match their data to the Overture corpus, potentially broadening its applicability to other areas, such as in-vehicle Advanced Driver-Assistance Systems (ADAS) applications.
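As a rough illustration of querying a subset in place, the sketch below builds a DuckDB SQL string that selects only the features inside a bounding box. It is not taken from the webinar: the function name `build_bbox_query`, the default column list, the `bbox.minx`/`bbox.maxx`/`bbox.miny`/`bbox.maxy` field names, and the example path are all assumptions, so check them against the schema of the release you are using.

```python
def build_bbox_query(parquet_path, min_lon, min_lat, max_lon, max_lat,
                     columns=("id", "names", "geometry")):
    """Build a DuckDB SQL string for features fully inside a lon/lat window.

    Assumptions (not confirmed by the webinar recap):
      - parquet_path is a placeholder for a Parquet file or glob
      - each row carries a `bbox` struct with minx/maxx/miny/maxy fields
    """
    select_list = ", ".join(columns)
    return (
        f"SELECT {select_list}\n"
        f"FROM read_parquet('{parquet_path}', hive_partitioning=1)\n"
        f"WHERE bbox.minx >= {min_lon} AND bbox.maxx <= {max_lon}\n"
        f"  AND bbox.miny >= {min_lat} AND bbox.maxy <= {max_lat}"
    )


# Hypothetical path and window, for illustration only.
query = build_bbox_query(
    "s3://example-bucket/theme=places/*/*", -122.5, 47.4, -122.2, 47.8
)
print(query)
```

With the `duckdb` Python package installed and its `httpfs` extension loaded, a string like this could be passed to `duckdb.sql(query)` so that only the matching rows are fetched rather than the whole dataset.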
Step-by-Step Guide to Accessing Overture Data
Jennings then transitioned to the hands-on segment of the webinar, where he walked through the process of accessing and manipulating Overture data.
If you’d like to revisit the key points covered in the webinar, the presentation deck is accessible here.
He went on to provide live demonstrations, showcasing how to query the database to obtain specific data subsets. Using tools like Kepler for data visualization and QGIS for spatial analytics, he showed how the data could be converted into various formats such as CSV, GeoJSON, shapefiles, and even map tiles. This hands-on approach gave attendees a practical understanding of the data’s utility.
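To make the conversion step concrete, here is a minimal sketch that turns a handful of already-queried rows into GeoJSON and CSV using only the Python standard library. The row layout (`id`, `name`, `lon`, `lat`) and the helper names are hypothetical; the webinar itself used other tooling, and real Overture rows store geometry differently.

```python
import csv
import io
import json


def rows_to_geojson(rows):
    """Wrap point rows in a GeoJSON FeatureCollection (lon/lat order)."""
    features = []
    for row in rows:
        # Everything except the coordinates becomes a feature property.
        props = {k: v for k, v in row.items() if k not in ("lon", "lat")}
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point",
                         "coordinates": [row["lon"], row["lat"]]},
            "properties": props,
        })
    return {"type": "FeatureCollection", "features": features}


def rows_to_csv(rows):
    """Serialize rows to CSV text, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample rows standing in for a query result.
sample_rows = [
    {"id": "a1", "name": "Example Cafe", "lon": -122.33, "lat": 47.61},
    {"id": "b2", "name": "Example Park", "lon": -122.30, "lat": 47.65},
]
print(json.dumps(rows_to_geojson(sample_rows), indent=2))
print(rows_to_csv(sample_rows))
```

The resulting GeoJSON can be dropped into a viewer such as Kepler or QGIS, while the CSV suits spreadsheet-style analysis.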
For a real-time walk-through of these steps, don’t miss out on watching the webinar recording. Watch the Webinar Here.
What’s Next for Overture?
Jennings wrapped up the presentation with a look at what’s on the horizon for Overture Maps. The focus will continue to be on improving data coverage and quality. Moreover, the introduction of the Global Entity Reference System (GERS) is imminent. This system aims to assign stable IDs to features across various themes, making it easier to match external datasets and enabling ID-based conflation.
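ID-based conflation can be pictured as a simple key join: once external records carry the same stable ID as an Overture feature, matching no longer needs geometry or name comparison. The sketch below assumes both sides expose the ID under a key called `id`; the function name, the record layout, and the ID values are illustrative, since GERS had not been released at the time of the webinar.

```python
def conflate_by_id(overture_features, external_records, id_key="id"):
    """Merge external attributes into features that share the same ID.

    Returns (matched, unmatched): merged dicts for features with an
    external counterpart, and the untouched features without one.
    """
    external_by_id = {rec[id_key]: rec for rec in external_records}
    matched, unmatched = [], []
    for feature in overture_features:
        extra = external_by_id.get(feature[id_key])
        if extra is None:
            unmatched.append(feature)
        else:
            # Overture values take precedence when both sides share a key.
            matched.append({**extra, **feature})
    return matched, unmatched


# Hypothetical IDs and attributes, for illustration only.
features = [{"id": "a", "name": "Cafe"}, {"id": "b", "name": "Park"}]
external = [{"id": "a", "rating": 4.5}]
matched, unmatched = conflate_by_id(features, external)
print(matched)
print(unmatched)
```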
The Q&A session from the Overture Maps webinar touched on various key topics that offer a deeper understanding of the project’s scope, capabilities, and future direction. Here are some of the key takeaways:
Confidence Scores in Places Data
Several questions concerned the confidence scores associated with Places data. Jennings explained that a score closer to zero indicates that a place likely no longer exists, while a score closer to one indicates high confidence in its existence. The scores are calculated from a combination of factors, including validation across multiple sources and social media check-ins.
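In practice, a confidence score like this is typically used as a filter. The following sketch keeps only places at or above a chosen cutoff; the `confidence` key name, the function name, and the 0.5 default are assumptions for illustration, and the right threshold depends on the application.

```python
def filter_by_confidence(places, threshold=0.5):
    """Keep places whose confidence score is at or above the threshold.

    Places with no score at all are dropped rather than guessed at.
    """
    return [
        place for place in places
        if place.get("confidence") is not None
        and place["confidence"] >= threshold
    ]


# Hypothetical records, for illustration only.
sample_places = [
    {"name": "Open Bakery", "confidence": 0.93},
    {"name": "Closed Diner", "confidence": 0.12},
    {"name": "Unscored Shop"},
]
print(filter_by_confidence(sample_places, threshold=0.5))
```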
The data is primarily released in Parquet files, and users wondered about the possibility of GeoParquet in future releases. While GeoParquet is on the roadmap, the initial focus has been on releasing the data in a format that allows for maximum customization by end-users.
Licensing in Overture Maps Data
Marc Prioleau, Executive Director of Overture Maps, addressed several questions about data licensing in Overture Maps. Each data theme has a specified license: either the Open Database License (ODbL) if the data in that theme comes from OpenStreetMap, or the Community Data License Agreement (CDLA) Permissive, Version 2.0, if it comes from other open sources. For ODbL-licensed data, it is important that users understand and apply the conditions of that license.
Will there be an API?
Overture’s focus is on building map data. We do not intend to build an API. However, we anticipate that various third parties will offer Overture-based map services through appropriate APIs in the future.
How can contributors add to the height data of buildings?
The building theme uses OpenStreetMap along with other sources. Individual mappers are encouraged to make edits to buildings (and other map features) through the various editing tools for OSM.
Whether you’re a seasoned professional or just starting out, the webinar provided a comprehensive overview of working with Overture data. The step-by-step guide, coupled with real-time demonstrations, offered a practical approach to mastering data query and visualization using Overture’s datasets. With Overture continually expanding its capabilities and dataset offerings, now is a great time to dive into it.
Here is the link for those who want to see these insights in action. Watch the Webinar Here.