2025-09-01, 13:30–15:00 (Europe/Amsterdam), HugoTECH
As vector datasets grow, cloud computing is drawing more attention as a way to handle them. This lecture addresses the big vector data problem and how to tackle it in a cloud environment. It introduces cloud-native vector formats such as FlatGeobuf, GeoParquet, and PMTiles. In the hands-on part, we will visualize and process big vector datasets such as ICESat-2 and GEDI. Both the introduction and the hands-on part are combined with an explanation of the theory behind vector tiles and lazy loading.
The first half is a lecture of about 30 minutes on the theory and algorithms behind cloud-native formats and spatial indexing. The second half is a hands-on session using DuckDB, Polars, and other Python packages to work with cloud-native formats. We will also go through parallel processing for big vector data, including using a semaphore to coordinate parallel writes to a single large file and partitioning big data into smaller chunks in parallel.
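As a rough illustration of the two parallel patterns mentioned above, the following minimal sketch partitions points into coarse grid cells and lets worker threads format each chunk in parallel, while a semaphore allows only one thread at a time to append to the shared output file. It uses only the Python standard library; the sample points, the 90-degree cell size, the CSV output, and the function names are all assumptions made for illustration, not the material used in the tutorial.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: (id, lon, lat) points standing in for a large vector dataset.
POINTS = [(i, -180 + (i * 37) % 360, -90 + (i * 17) % 180) for i in range(1000)]


def partition(points, cell_deg=90):
    """Group points into coarse lon/lat grid cells, one chunk per cell."""
    chunks = {}
    for pid, lon, lat in points:
        key = (int((lon + 180) // cell_deg), int((lat + 90) // cell_deg))
        chunks.setdefault(key, []).append((pid, lon, lat))
    return chunks


# A semaphore with a single slot: only one worker may write at any moment.
write_slot = threading.Semaphore(1)


def process_and_append(chunk, out_path):
    """Format one chunk (runs in parallel), then append it under the semaphore."""
    lines = [f"{pid},{lon},{lat}\n" for pid, lon, lat in chunk]
    with write_slot:
        with open(out_path, "a", encoding="utf-8") as f:
            f.writelines(lines)
    return len(lines)


def run(points, out_path, workers=4):
    """Partition the points, process chunks in a thread pool, return (chunks, rows)."""
    chunks = partition(points)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(lambda c: process_and_append(c, out_path), chunks.values()))
    return len(chunks), sum(counts)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out_file = os.path.join(tmp, "points.csv")
        n_chunks, n_rows = run(POINTS, out_file)
        print(f"wrote {n_rows} rows from {n_chunks} chunks")
```

Because each chunk is written in one guarded block, rows from different chunks never interleave within the file, although the order of the chunks depends on thread scheduling. Writing each chunk to its own file instead would remove the need for the semaphore altogether, which is the trade-off between the two approaches.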
Other links:
https://docs.google.com/presentation/d/1B-Z7PPErQfGBqlhqdBn1tJ0CRov9e2XJWHi_Bio1QwA/edit?usp=drive_link
The tutorial is co-hosted by Yu-Feng HO and Serkan Isik.
Research Assistant | Geoinformatician at OpenGeoHub Foundation, focusing on big data analysis and processing