“Regional Earth Observation Foundational Models: Improving Representation of Domain-Specific Patterns”
Gilberto Camara, Felipe Carlos, Rolf Simões, Alexandre Assunção, Felipe Souza
Oral talk
EO foundational models transform satellite images from a space-time grid of raw values into high-dimensional latent spaces called embeddings. These embeddings encode relationships between pixel values and the corresponding biophysical characteristics. Seasonal crop phenology (plant life cycle events), urban patterns, and forest canopy texture are each represented in different combinations of embedding dimensions. Researchers use these embeddings to train lightweight downstream models for specific tasks, such as LULC (land use and land cover) classification, biomass estimation, or deforestation detection. Because the foundational model has already learned the representation, these tasks require only a fraction of the computational power and labelled data needed to train a model from scratch.
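As an illustration of the downstream step described above, the following is a minimal sketch in Python, not the method used in this work. It assumes the embeddings have already been extracted by a frozen foundational model; here they are replaced by synthetic vectors, and the three classes, the 64 dimensions, and the nearest-centroid classifier are all hypothetical choices made for the example.

```python
import numpy as np


def fit_centroids(embeddings, labels):
    """Return the class ids and the mean embedding (centroid) of each class."""
    classes = np.unique(labels)
    centroids = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(embeddings, classes, centroids):
    """Assign each embedding to the class with the nearest centroid."""
    # Pairwise Euclidean distances, shape (n_samples, n_classes)
    dists = np.linalg.norm(embeddings[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]


# Synthetic stand-in for embeddings produced by a frozen foundational model
# (assumption: 3 hypothetical LULC classes, 64 embedding dimensions).
rng = np.random.default_rng(0)
n_per_class, dim = 50, 64
class_means = rng.normal(0.0, 1.0, size=(3, dim))
X = np.concatenate(
    [m + 0.3 * rng.normal(size=(n_per_class, dim)) for m in class_means]
)
y = np.repeat(np.arange(3), n_per_class)

# Hold out every fifth sample for evaluation; fit on the remainder.
test_mask = np.arange(len(y)) % 5 == 0
classes, centroids = fit_centroids(X[~test_mask], y[~test_mask])
pred = predict(X[test_mask], classes, centroids)
accuracy = float((pred == y[test_mask]).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

The downstream model here has no trainable parameters beyond one mean vector per class, which is the sense in which such models are lightweight. Real embeddings are unlikely to separate as cleanly as this synthetic data.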
The current trend is to build massive, global-scale foundational EO models (such as TESSERA or AlphaEarth). Nevertheless, there is a strong case for developing dedicated regional foundational models. Global foundational models inherently seek universal statistical patterns, pushing representations toward generalised, highly simplified categories. A regional foundational model avoids this homogenisation by optimising representations for local landscapes. When a foundational model is pre-trained on regional Earth observation data cubes, its latent space represents the landscapes of that specific region. This prevents the model from importing spatial biases learned from entirely different continents, resulting in much higher-quality embeddings for local downstream tasks.
This presentation will show how to build regional EO foundational models using an easy-to-use API associated with the R/Python package SITS. Users can merge various sources, such as optical, radar, topographic, and climate data. The resulting EO embeddings will be better suited to regional applications than global products.