Open Earth Monitor — Global Workshop 2023
The era of open Earth Observation (EO) data started in 2008 when the United States Geological Survey (USGS) made the Landsat archive available free of charge. Since then, the amount of open EO data has increased exponentially, also due to Copernicus, the European Union’s Earth Observation programme. The wealth of open EO data available today is unprecedented. Yet a considerable fraction of the open EO data produced and disseminated on a daily basis goes unused because users cannot access, process and analyse it. Questions on how EO data can be utilised to better support the Green New Deal and related communities, such as Renewable Energies, lead the discussion now and will do so in the coming years. A better understanding of the needs and requirements of different users (from EO data users to policy- and decision-makers) will be vital in shaping the future of open EO data.
“To create the future, we must understand the past” is a famous quote by astrophysicist Carl Sagan. Hence, in this keynote talk, I would like to take the audience on a journey through the era of open EO data. We will take the perspective of an EO data user and first identify key milestones and developments since 2008, before drawing a more detailed picture of what it is like at the moment to discover, access, process and retrieve knowledge from open EO data. After we shed light on the past and the present, the time travel continues to the year 2030: together with the audience, I would like to develop a wish list of how open EO data should provide value for different stakeholders in the future and extract the key requirements needed to achieve this.
Open Science is increasingly recognized as a catalyst for innovation. Back in 2016, the EC's DG-RTD laid out a vision for European R&D which acknowledged that “the way that science works is fundamentally changing, and an equally important transformation is taking place in how companies and societies innovate. The advent of digital technologies is making science and innovation more open, collaborative, and global”. The concept of Open Science and Innovation is embraced by the European Space Agency in its Agenda 2025, recognizing the value that such principles of innovation can bring to the space sector in terms of optimizing development cycles, accelerating time to market, and reducing cost.
In this talk we will present ways in which Open Science and Innovation are addressed in ESA's Earth Observation Programme, and how such principles are transferred into R&D and scientific activities. We will dive into the "WHYs" and "WHENs" of adopting openness in the Earth Observation value chain, share lessons learned from community consultations and, finally, look into the "HOWs" of implementing Open Science and Innovation in EO, building on the agility, sustainability and scalability offered by new digital technologies, design-driven product development and science for society.
The Group on Earth Observations (GEO) envisions a future where decisions and actions for the benefit of humankind are informed by coordinated, comprehensive and sustained Earth observations. A central part of GEO’s Mission is to build a Global Earth Observation System of Systems (GEOSS), a set of coordinated, independent Earth observation, information and processing systems that interact and provide access to diverse information for a broad range of users. The amount of Earth observation data has increased drastically over the last years, spanning data sources from satellites and model outputs to airborne sensors, ground stations and in-situ data. The main challenge is how users worldwide can find and make use of these resources.
The European Space Agency, together with the Italian National Research Council and the University of Geneva, is contributing to the implementation of GEOSS via the EU H2020 co-funded project GPP (GEOSS Platform Plus). This European contribution is reinforcing the use of Earth Observations globally, focusing on the provision of actionable information for climate change research, monitoring, and the development of mitigation and adaptation actions. A user-centric approach helps to focus on real user needs, listening and co-designing new implementations in close coordination with users. Examples of targeted applications include SDG 15.3.1 on Land Degradation, Climate Change impact on Norovirus Pandemic Risk, and SDG 11.7, which relates to climate change, urban sustainability and health. Access to data products, services and information, and the possibility to generate actionable information to derive results as input to decision makers, are main objectives. Relevant contributions to an evolved overarching GEOSS architecture, able to cope with changes in the GEO landscape as well as broader changes in the science/policy landscape and in technological innovation, are foreseen as well. In this context, GPP is playing an important role in connecting application developers (scientists, developers) to providers (of data, platforms, services, etc.) to enable the generation of actionable information usable by end users (decision makers, institutions, citizens). We will then present some real cases of how GEO communities and providers can contribute to GEOSS and how users can benefit from the tools and applications available in the GEOSS Ecosystem to support the different actors in the various GEO incubator areas (also known as societal benefit areas).
Landsat is the longest-running program to provide space-based data for Earth’s land surface. Based on nine satellites, the program has been monitoring the planet since 1972, consistently providing multispectral images for numerous applications. Due to technology differences among the satellites and image sensors, the reflectance values may show significant variations across the entire time series. Data gaps due to cloud cover, and stripe artifacts caused by the Scan Line Corrector failure (Landsat 7), add an additional level of complexity for users interested in performing long-term time-series analysis and machine learning on these data. Considering these challenges and the usability potential of Landsat imagery, here we present a workflow to produce analysis-ready and cloud-optimized (ARCO) global mosaics including: 1) data harmonization, 2) cloud and artifact screening, 3) temporal aggregation, 4) gap-filling, and 5) mosaicking. Relying on de-facto standards (Cloud-Optimized GeoTIFF - COG and SpatioTemporal Asset Catalog - STAC), the Landsat global ARCO mosaics have the potential to boost access to Landsat data and contribute to the monitoring of land use conversion, food production, biodiversity, climate change and land productivity.
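As a hedged illustration of how such STAC-indexed COG mosaics are typically consumed, the sketch below searches a catalogue and lazily opens one asset; the catalogue URL, collection id and asset key are placeholders, not the actual product identifiers.

```python
# Minimal sketch, assuming a STAC catalogue exposing the ARCO mosaics as COG assets;
# endpoint, collection id and asset key below are hypothetical.
from pystac_client import Client
import rioxarray  # COG-aware reader, registers the .rio accessor

catalog = Client.open("https://example.org/stac")           # hypothetical STAC endpoint
items = catalog.search(
    collections=["landsat-arco-mosaics"],                   # hypothetical collection id
    bbox=[10.0, 45.0, 11.0, 46.0],
    datetime="2020-01-01/2020-12-31",
).item_collection()

# Thanks to the COG layout, only the requested windows/overviews are actually read
red = rioxarray.open_rasterio(items[0].assets["red"].href, masked=True)
print(red.rio.crs, red.shape)
```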
Reproducibility and reusability of workflows are increasingly important topics in Remote Sensing research when moving towards FAIR and open data science. This workshop discusses the status quo and how we can improve it with future activities.
The Copernicus Data Space Ecosystem (CDSE) represents a key milestone in access to Copernicus satellite data. First and foremost, the novelty relates to the paradigm shift that all Copernicus data (except for some of the raw data) is immediately available online - global coverage and the entire time range including the archive - at no cost, for any user. The product list includes Copernicus satellite imagery (Sentinel-1, Sentinel-2, Sentinel-3, Sentinel-5P), Copernicus Services and other satellite data missions (e.g. Landsat, SMOS, Envisat). In the orderable mode, historical Sentinel-1 RAW data and processing of Sentinel-1/2/3 data using official ESA processors are available. So-called Sentinel Engineering Data (mostly Level-0 data) will be available in a rolling two-week archive. Moreover, CDSE will offer access to commercial satellite data.
The main advantage of immediately available data is that the user does not have to order and wait for the data. Direct access also allows bulk data processing and streaming, e.g. via OGC services (WMS/WFS). In this respect, only online data access provides sufficient capacity to visualise data online. Another advantage of immediately available data is the ability to partially read large data files if they are stored in an optimised chunked format such as Cloud Optimised GeoTIFF (COG) or Zarr for rasters, or GeoParquet for vectors. Partial reading is essential for parallel computing, which allows small chunks of data to be processed in parallel so that there is no need to wait for the data to be fully loaded before processing.
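The sketch below illustrates such a partial read from a COG with rasterio; the file URL is a placeholder for any COG reachable over HTTP.

```python
# Minimal sketch of a partial (windowed) read from a Cloud Optimised GeoTIFF;
# the URL is hypothetical.
import rasterio
from rasterio.windows import from_bounds

url = "https://example.org/data/scene.tif"                  # hypothetical COG location
with rasterio.open(url) as src:
    window = from_bounds(11.00, 46.00, 11.05, 46.05, transform=src.transform)
    chunk = src.read(1, window=window)                      # only the needed bytes are fetched
    print(chunk.shape)
```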
Another novelty of CDSE is the range of interfaces through which the data are available: from classical download to several catalogue search interfaces, all connected to the same database to guarantee consistency. The first interface is OData, an ESA-adopted standard based on RESTful HTTP Application Programming Interfaces. It enables resources, which are identified by URLs and defined in a data model, to be created and edited using simple HTTP messages. Another foreseen interface is the STAC catalogue and API, which has become a de-facto standard in the EO community and is also being onboarded to OGC. The CDSE also provides JupyterHub - a very suitable tool for prototyping, developing, and testing applications for Earth Observation data processing. This is an open-source, online, interactive web application which gives access to computational environments and resources without burdening the users with installation and maintenance tasks.
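As a hedged sketch of what an OData catalogue query can look like, the snippet below lists a few Sentinel-2 products; the endpoint and filter syntax should be verified against the current CDSE documentation.

```python
# Minimal sketch of an OData product search; endpoint and filter syntax are assumptions
# to be checked against the current CDSE documentation.
import requests

url = "https://catalogue.dataspace.copernicus.eu/odata/v1/Products"
params = {
    "$filter": "Collection/Name eq 'SENTINEL-2' "
               "and ContentDate/Start gt 2023-06-01T00:00:00.000Z",
    "$top": "5",
}
resp = requests.get(url, params=params, timeout=30)
for product in resp.json().get("value", []):
    print(product["Name"])
```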
The vast majority of the described capabilities are available free of charge for individual use - personal, research or commercial. For those interested in larger-scale processing, practically unlimited resources are available under commercial terms. The first of these is CREODIAS, which allows users to access and process the data directly from a federated cloud environment, order serverless processing of EO products and access EO-dedicated services. In addition, third-party providers are joining CDSE to offer a variety of additional services (free and commercial) as members of the Data Space.
The available information for monitoring the Earth has never been as abundant and accessible as today. Data volumes from Earth Observation, mathematical models and in-situ measurements will continue to grow in the future. While this data richness opens unprecedented possibilities for monitoring and developing a holistic understanding of our planet, it poses challenges with respect to efficient data exploitation and fosters novel technological approaches for the joint exploitation of the different data streams. Despite considerable standardisation efforts by different institutions in the past, data formats and models as well as interfaces for data access remain diverse, resulting in costly, bespoke solutions for finding, accessing, and processing heterogeneous input data. The xcube open-source Python package addresses such requirements and offers a suite of comprehensive tools for transforming data sets into analysis-ready data cubes. As it is designed as a framework, it is continuously extended to cater to changing needs. It integrates seamlessly into Python’s data science stack, as advocated by Pangeo, and extends it with specific data stores for access to relevant data sets for Earth System Science and with tools for data provisioning and exploitation.
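A minimal sketch of the data-store pattern described above is shown below; the store id, root and data id are illustrative and depend on which xcube store plugins are installed.

```python
# Minimal sketch of opening a dataset through an xcube data store; bucket and data id
# are hypothetical.
from xcube.core.store import new_data_store

store = new_data_store("s3", root="my-bucket/cubes")        # hypothetical object-storage root
print(list(store.get_data_ids()))                           # discover available cubes
cube = store.open_data("analysis-ready-cube.zarr")          # returns an xarray.Dataset
print(cube)
```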
Spatiotemporal data cubes are becoming ever more abundant and are a widely used tool in the Earth System Science community to handle geospatial raster data.
Sophisticated frameworks in high-level programming languages like R and Python allow scientists to draft and run their data analysis pipelines and to scale them in HPC or cloud environments.
While many data cube frameworks can handle harmonized analysis-ready data cubes very well, we repeatedly experienced problems when running complex analyses on multi-source data that was not homogenized. The problems arise when different datasets need to be resampled on the fly to a common resolution and have non-aligning chunk boundaries, which leads to very complex and often unresolvable task graphs in frameworks like xarray+dask.
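The sketch below is an illustrative (not the authors') reconstruction of that pattern in xarray + dask: interpolating a coarse, lazily-chunked field onto a much finer grid couples chunks of both arrays and can blow up the task graph.

```python
# Illustrative sketch of on-the-fly resampling between dask-backed datasets with
# non-aligned chunks; data are synthetic.
import dask.array as da
import numpy as np
import xarray as xr

coarse = xr.DataArray(
    da.random.random((365, 180, 360), chunks=(30, 180, 360)),
    dims=("time", "lat", "lon"),
    coords={"lat": np.linspace(-89.5, 89.5, 180),
            "lon": np.linspace(-179.5, 179.5, 360)},
)
fine = xr.DataArray(
    da.random.random((365, 1800, 3600), chunks=(1, 600, 600)),
    dims=("time", "lat", "lon"),
    coords={"lat": np.linspace(-89.95, 89.95, 1800),
            "lon": np.linspace(-179.95, 179.95, 3600)},
)

# Interpolating the coarse field onto the fine grid on the fly: every output chunk
# depends on several input chunks, so the combined task graph grows very quickly.
combined = fine * coarse.interp(lat=fine.lat, lon=fine.lon)
```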
In this workshop we present the emerging ecosystem of large-scale geodata processing in the Julia programming language under the JuliaDataCubes GitHub umbrella.
Julia is an interactive scientific programming language designed for HPC applications, with primitives for multi-threaded and distributed computation built into the language.
We will demonstrate an example analysis where data from different sources (global fields of daily MODIS, hourly ERA5, high-resolution land cover), summing to multiple TBs of data, can interoperate on-the-fly and scale well when run on different computing environments.
The European Union's Green Deal is a transformative initiative aimed at achieving climate neutrality and sustainable economic growth by 2050. To address the complex challenges posed by climate change and environmental degradation, a robust infrastructure of platform services has emerged to support the Green Deal. At EODC, we implement and develop key components of this infrastructure, including the openEO Platform, ESA GTIF Austria, STAC, Pangeo, and the GREAT project. These services and projects play a pivotal role in gathering, processing, and disseminating critical environmental data and insights that drive policy formulation and sustainable practices.
The openEO Platform serves as a cornerstone for the Green Deal by providing standardized and open interfaces for accessing and processing Earth Observation (EO) data. Leveraging cloud computing and distributed resources, openEO enables researchers, policymakers, and businesses to harness the power of petabytes of environmental data. By fostering collaboration and interoperability, openEO supports the development of innovative applications and services that contribute to the Green Deal's goals, such as carbon monitoring and land-use planning.
The SpatioTemporal Asset Catalog (STAC) standardises the organisation and discovery of geospatial assets, making it easier to find and use EO data. By embracing STAC, the Green Deal ecosystem and platforms ensure that valuable environmental information, such as satellite imagery and climate models, can be efficiently located and employed in various applications, including environmental monitoring and disaster risk management.
Pangeo, an open-source platform for scalable Earth science, empowers researchers with tools and resources for analysing massive climate and environmental datasets. By offering Jupyter notebooks, cloud computing, and data catalogues, Pangeo facilitates collaborative research and data-driven decision-making. Pangeo's contributions are vital to the scientific underpinnings of the Green Deal, assisting in climate modelling, biodiversity assessment, and ecological forecasting.
ESA's Green Transition Information Factory (GTIF) demonstrates, for Austria, how to enhance the visibility of and access to EO-derived datasets and knowledge for decision-makers, policymakers and citizens. The platform additionally showcases how to efficiently disseminate information for Austria's green transition initiatives, allowing users to explore the potential of transitioning to carbon neutrality by 2050 through EO data and cloud computing.
The GREAT project is focused on building a robust data infrastructure for the Green Deal. It aims to integrate data from various sources, including EO, environmental sensors, and socioeconomic indicators, into a comprehensive data space. This unified data repository enhances data sharing and accessibility, supporting the monitoring, evaluation, and adaptation of Green Deal policies and actions.
In conclusion, these projects form a cohesive and adaptable infrastructure that underpins the Green Deal's ambition to combat climate change and promote sustainability. By providing open and standardized interfaces, advanced geospatial capabilities, efficient data organization, scalable computing, and comprehensive data integration, these platforms facilitate data-driven decision-making and innovation, and drive progress toward the EU's environmental goals. In the face of the global climate crisis, these services are central tools in advancing the Green Deal's mission to build a greener and more sustainable future for all.
In tandem with the monumental increase in geo-data availability from remote sensors, field sensors and various publicly available environmental datasets, state-of-the-art geoinformatics algorithms have evolved to harness earth science data as never before. In the field of computational hydrology, these processes have yielded global information in fine detail, and of exceptional precision.
Hydrography90m is one such data product that pushes the boundaries of computational hydrology in several ways. It is a globally standardised and seamless hydrographic dataset that allows the mapping of headwaters in unprecedented density and detail. With the minimum upstream contributing area set at 0.05 km², it comprises the highest density of headwaters among leading global hydrographic assessments. The dataset contains 1.6 million drainage basins and 726 million stream segments and sub-catchments. It is also designed to overcome the spatial and accessibility constraints of gauged locations and address the limitations of spectral analyses.
As for applications in scientific research, Hydrography90m is well-suited for both global and comparative area-of-interest studies. The dataset contains many essential stream features, such as stream slope, stream distance, types of stream order and flow indices. Hydrography90m thus offers significant utility in the assessment of freshwater quantity and quality, inundation risk, biodiversity and conservation, and resource management objectives, all in a globally comprehensive and standardised manner.
In terms of the underlying computational approach, Hydrography90m is based on a drainage flow algorithm that distributes downhill water flow in a realistic manner, following the concavity and convexity of the terrain. Additionally, programming in a variety of open-source software provides unmatched computational power, and the implementation of different scripting procedures allows for benchmarking strategies to check for potential errors. Software employed includes GDAL, Pktools, and GRASS GIS.
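As a hedged illustration of the kind of GRASS GIS step used in such scripted workflows (not the Hydrography90m production code), the sketch below derives flow accumulation and a stream network from a DEM tile; it assumes an active GRASS session with a raster named "dem" already imported, and an illustrative cell threshold.

```python
# Minimal sketch, assuming an active GRASS GIS session and an imported raster "dem".
import grass.script as gs

# Flow accumulation and drainage direction from the DEM
gs.run_command("r.watershed", elevation="dem",
               accumulation="flow_accum", drainage="flow_dir", overwrite=True)

# Extract the stream network above a minimum contributing-area threshold
# (threshold is in cells; the value here is purely illustrative)
gs.run_command("r.stream.extract", elevation="dem", accumulation="flow_accum",
               threshold=6, stream_raster="streams", direction="stream_dir",
               overwrite=True)
```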
The novel computational approach of Hydrography90m broadens the scope for using various remote and field sensor technologies, and the scripting procedure lays the foundation for more complex Machine Learning-based discharge assessments. Its design is a pivotal development for addressing the challenges of overfitting and universal coverage in hydrological modelling. Machine Learning can now enable the massive data integration that is vital for global scale hydrological studies; and hydrographic data with fine detail of headwaters provides an excellent foundation for interbasin connectivity and high-resolution discharge predictions. Meanwhile, other data-driven and ensemble methods that have emerged recently to address these technical challenges still remain limited as tools for basin-specific studies.
Given the multitude of resource and conservation applications, Hydrography90m can be a vital toolkit for achieving several UN Sustainable Development Goals. Additional uses of the dataset are relevant to freshwater flow and sediment transport computations, pollutant and nutrient concentration assessments, public health, and geopolitical and resource challenges. To date, Hydrography90m has been used in species distribution modelling for aquaculture, vector-borne disease mapping, and various ecological studies. Institutions engaged in water resource management, transnational security and environmental crime monitoring are also starting to derive value from the dataset’s attributes.
The DestinE Core Service Platform (DESP) integrates and operates an open ecosystem of services (also referred to as the DESP Framework) to support DestinE data exploitation and information sharing for the benefit of DestinE users and third-party entities.
DESP includes key essential services such as user management service; infrastructure as a service with storage, network, and CPU/GPU capabilities; data access and retrieval service, in particular from the DestinE Data Lake operated by EUMETSAT, as it is the backbone for the data generated by the ECMWF’s Digital Twin Engine; data traceability and harmonization services; basic software suite service for local data exploitation; data and software catalogue services; and 2D/3D data visualization service.
DESP additionally provides onboarding support for integrating external services and resources, making the ecosystem flexible, scalable, and easily adapted to community needs. The DestinE ecosystem aims to support the needs of a large and diverse community of users, including citizens, scientists and academics, commercial entities, and policy makers. The DESP Framework defines the required conditions for a service to be part of the ecosystem and therefore to benefit from the available resources and from the potential to engage with all DestinE users.
The concept of “Analysis Ready Data” (ARD) was initially developed around 2015 within the Committee on Earth Observation Satellites (CEOS). CEOS defines ARD as “satellite data that have been processed to a minimum set of requirements and organized into a form that allows immediate analysis with a minimum of additional user effort and interoperability both through time and with other datasets”. Over the course of the past few years CEOS has issued a number of so-called ‘Product Family Specifications’ (PFS) which cover a variety of different sensing methods and observed parameters. Institutional and commercial satellite data providers have accepted these specifications, and by now a broad variety of satellite image products are available as “CEOS-ARD certified”.
The popularity of the concept and the expectation for the ARD ‘label’ in terms of simplified usability and interoperability of a wealth of EO data has spurred the desire to expand ARD beyond classical EO parameters by including higher level products and to cover geospatial data more generically. This prompted a discussion on which categories, levels, or classes of analysis ready data are needed and how they could be defined and distinguished. These considerations are now taken up by a formal ISO/OGC Standard Working Group (SWG) which was launched recently.
In the meantime, the amount of available geospatial data is increasing exponentially, and many of these datasets are available free and open and label themselves ‘ARD’. However, users relying on their interoperability are often overwhelmed by their diversity and remaining inconsistencies, which often require considerable effort before appropriate data can be selected and joined sensibly. Access to proper reference data and benchmarking methods is therefore an important factor for leveraging ‘ARD’ in the future.
The OpenClimate Network is an open-source nested accounting platform that allows users to navigate emissions inventories and climate pledges of different actors at every level, aggregating data from various public sources for countries, regions, cities and companies. Through this aggregation, it enables the comparison of how different data sources report emissions of certain actors, by harmonizing the way data is reported and identifying the different methodologies used.
Additionally, by nesting actors into their respective jurisdictions, it facilitates the comparison between the pledges these actors have committed to, showing whether they are aligned towards the same climate targets and how they compare to the goals of the Paris Agreement.
By aggregating data and exploring it in this nested manner, it also allows for the effective identification of data gaps for these actors, suggesting where efforts are needed to identify existing data sources or help produce new inventories. When data gaps are identified, the platform also prompts users to contribute data based on the open and standardized data model used to aggregate emissions and pledges data.
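Purely as an illustration of what a harmonised, nested record could look like, the sketch below uses hypothetical field names and a fictional actor; it does not reproduce the platform's actual data model.

```python
# Hypothetical example of a nested, harmonised emissions/pledge record.
record = {
    "actor": {
        "id": "XX-EXA",
        "name": "Exampleville",
        "type": "city",
        "is_part_of": "XX",          # nesting into the parent jurisdiction
    },
    "emissions": [
        {"year": 2020, "total_co2e_t": 1_200_000,
         "source": "hypothetical-inventory", "methodology": "GPC"},
    ],
    "pledges": [
        {"target_year": 2040, "reduction_pct": 80, "baseline_year": 2005},
    ],
}
```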
The use of synthetic aperture radar (SAR) data has become increasingly important in remote sensing for environmental monitoring. SAR data provides valuable information on surface characteristics and changes, such as land cover and land use change, soil condition, and vegetation growth, making it a powerful tool for various applications, including agriculture, forestry, and climate change studies. However, processing and integrating SAR data into analysis-ready formats can be complex and time-consuming, requiring specialized knowledge and tools.
In this contribution, we propose a software project called force-sar, which aims to integrate Sentinel-1 data into the Framework for Operational Radiometric Correction for Environmental monitoring (FORCE). FORCE is a widely used data cube framework for processing and analyzing optical remote sensing data, and the integration of SAR data into FORCE increases its capabilities for large-scale, multi-modal analysis.
Force-sar automatically queries available Sentinel-1 Ground Range Detected (GRD) imagery covering the spatial and temporal dimensions of the user's area of interest. The scenes are then directly accessed from satellite data repositories on cloud environments like CREODIAS or the Copernicus Data and Exploitation Platform Germany (CODE-DE). When no connection to such repositories is available, the data can also be downloaded from data centers like the Alaska Satellite Facility (ASF). The scenes are processed to radiometrically calibrated gamma-naught backscatter data using a pre-built but customizable ESA SNAP processing graph. After resampling, reprojecting, and tiling, the data are ready for ingestion into a FORCE data cube.
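As a hedged sketch of how such a SNAP processing graph can be applied from Python, the snippet below calls SNAP's gpt command-line tool; the graph file, parameter names and scene path are placeholders rather than force-sar's actual setup.

```python
# Minimal sketch of applying a SNAP graph to a Sentinel-1 GRD scene via gpt; the graph
# is assumed to define ${input} and ${output} variables, and all paths are hypothetical.
import subprocess

scene_zip = "S1_GRD_scene.zip"                  # placeholder input scene
subprocess.run(
    ["gpt", "graph_grd_gamma0.xml",             # hypothetical calibration graph
     f"-Pinput={scene_zip}",
     "-Poutput=gamma0_backscatter.tif"],
    check=True,
)
```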
The integration of Sentinel-1 data into FORCE allows for the creation of SAR data cubes with consistent radiometric and geometric properties, covering large regions and spanning multiple time periods. This enables users to perform multi-sensor and multi-temporal analyses like change detection, compositing, and classification and regression tasks for environmental monitoring at large scales, thereby supporting decision-making and policy evaluation in frameworks like the EU's Green Deal or the Common Agricultural Policy.
Force-sar is already used in the ongoing Mowing Detection Intercomparison Exercise (MODCiX), where more than ten teams compare their algorithms for grassland mowing detection on a consistent and harmonized remote sensing and reference data set. A consistent data cube holding optical and SAR data covering test regions in eight European countries was created and is used for the study.
Force-sar provides a streamlined and fully containerized workflow for preprocessing and integrating SAR data into an existing data cube framework without the need for the installation of any external tools or dependencies. This opens up new possibilities for utilizing SAR data in large-scale environmental monitoring applications, particularly when working in cloud environments.
Earth System Models and Earth Observations are crucial for studying the Earth, providing scientific insight into fundamental dynamics and valuable predictions about Earth’s future. However, they generate huge amounts of data at different temporal and spatial scales, so it becomes of paramount importance to access them in a seamless and efficient way for scientific analysis. Earth Science datasets are usually spread across hundreds or thousands of files, which places a considerable management burden on users: they must set up computational and storage resources for accessing and retrieving the data, and write code to load and prepare it into in-memory data structures for analysis.
In this talk, we describe in detail the architectural design, implementation and deployment of a data management and analytics system in order to facilitate cataloguing, accessing and processing Earth Science data. The system has been designed using a cloud-native architecture, based on containerized microservices, that facilitates the development, deployment and maintenance of the system itself. It has been implemented by integrating different open source frameworks, tools and libraries and has been deployed using the Kubernetes platform and related tools such as kubectl and kustomize.
The Data Platform consists of different components that will be introduced and described together with the related technologies adopted: (a) the Catalog, based on Intake and MongoDB, for cataloguing and indexing the datasets published and managed in the system; (b) the Analytics Engine, based on the geokube and dask Python libraries: geokube is used for specialised geospatial operations (such as extracting a bounding box or a multipolygon) on different types of geoscientific datasets, and dask for parallel and distributed processing; (c) the Broker, implemented using the RabbitMQ framework, for managing user workload requests; and finally (d) the REST APIs and OGC standard interfaces (e.g., WPS) to access data and to submit analytics workflows.
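As a hedged illustration of how an Intake-based catalogue is typically consumed on the user side, the sketch below opens a catalogue and loads one entry lazily; the catalogue URL and dataset name are placeholders, not the CMCC deployment's identifiers, and an intake-xarray driver is assumed.

```python
# Minimal sketch of querying an Intake catalogue; URL and entry name are hypothetical.
import intake

cat = intake.open_catalog("https://example.org/catalog.yaml")
print(list(cat))                                 # list the datasets exposed by the catalogue
ds = cat["era5_temperature"].to_dask()           # lazily load one entry as an xarray.Dataset
print(ds)
```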
An instance of the Data Platform has been deployed in production at the Euro-Mediterranean Centre on Climate Change (CMCC) for the delivery and analysis of data produced by the CMCC Research Divisions. In this talk, we will showcase different use cases, related to sectors such as climate change and wildfire management, that demonstrate how the system has been used at CMCC, within different projects and initiatives, for building downstream products and services that need to access, analyse and process Earth Science data.
The establishment of the European Open Science Cloud (EOSC) is one of the eight priorities of the European Open Science Agenda (2018), with the ambition of federating multidisciplinary research infrastructures. Among the ‘Enabling an operational, open and FAIR EOSC ecosystem (INFRAEOSC)’ projects contributing to this priority, Blue-Cloud 2026 and AquaINFRA have been funded to protect oceans, seas, coastal and inland waters, contributing to the goals of the EU Mission "Restore our Ocean and Waters" by 2030.
Furthermore, the European data strategy (2020) identifies data spaces as the instruments to achieve a single market for data across sectors and countries, through a common and interoperable framework. In particular, the Green Deal Data Space will be interlinked with the EOSC ecosystem, as demonstrated by the Blue-Cloud 2026 and AquaINFRA projects, which involve several research communities and data infrastructures that are contributing to enabling the European Digital Twin of the Ocean.
The main objective of the AquaINFRA project is to develop a research infrastructure equipped with FAIR multi-disciplinary data and services, allowing seamless data discovery and processing through an AquaINFRA Interaction Platform (AIP), in order to support marine and freshwater scientists and stakeholders and to interact seamlessly with EOSC. More specifically, the AIP will include a cross-domain and cross-country search and discovery mechanism as well as services for spatio-temporal analysis and modelling through Virtual Research Environments (VREs), where regional case studies are demonstrated; this presentation highlights the Mediterranean use cases.
The Knowledge Centre on Earth Observation activity is grounded in sound knowledge management practices and cutting-edge NLP technologies. It aims to create a common scaffolding for research projects on the one hand and policy needs on the other.
The User Requirement Database (URDB) stores and validates Copernicus Core Users' requirements for Earth Observation (EO) products and applications. The URDB facilitates automated gap analyses and screenings across diverse data spaces in pursuit of the best-matching pre-existing solution, initially examining Copernicus Services product catalogues, followed by a subsequent exploration of research findings from the EU Horizon programme. The Text Mining Application (TMA) leverages innovative advancements in machine learning, utilizing Transformers to facilitate precise semantic document retrieval within an EO-specific subset of research outcomes financially supported by the European Union's research and innovation framework programmes. These programmes and the respective EO project data span from FP1 in 1984 to the most recent initiative, Horizon Europe. The primary TMA objective is to empower users with rapid access to research findings for highly specific queries, while simultaneously offering a user-friendly database, an internal microservice as an API, and a GUI for more advanced metrics and visualizations. In the future, the URDB and TMA will be closely interlinked and integrated, enabling users of either platform to benefit from rapid access to the actual Copernicus datasets, as well as enhanced meta-information metrics and insight into research outcomes.
In addition to supporting gap and fit-for-purpose analyses, the main scope of the URDB is to enable requirement retracing across the components of the EO value chain, from policy needs to observations, thereby supporting and tracking the evolution of the Copernicus Programme. The URDB's records are technology-agnostic quantitative requirements, expressed as verifiable, unambiguous and actionable technical specifications (horizontal resolution, measurement uncertainty, tasking time, etc.). The URDB's data model builds on the experience of existing requirement databases from Copernicus Core Services and international partners (e.g., USGS and NASA). One of the URDB’s and TMA’s core design principles is semantic interoperability: entities, relationships and attributes are clearly defined in a terminology and, where applicable, follow international standards (ISO, OGC), recommendations and best practices (CEOS, GEO).
From a technical perspective, both the URDB and the TMA are self-hosted open-source databases with a GUI and an application layer for querying and analysis. While the URDB is based on PostgreSQL, the TMA utilizes a novel vector database known as Qdrant, which fulfils highly specific AI requirements and offers a user-friendly API.
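A hedged sketch of the kind of semantic retrieval described above is given below; the embedding model, collection name and payload fields are illustrative rather than the TMA's actual configuration.

```python
# Minimal sketch of semantic search over an EO project collection stored in Qdrant;
# model, collection name and payload fields are hypothetical.
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")        # any sentence-embedding model
client = QdrantClient(url="http://localhost:6333")     # self-hosted Qdrant instance

query_vector = model.encode("soil moisture retrieval from Sentinel-1").tolist()
hits = client.search(collection_name="eo_projects",    # hypothetical collection
                     query_vector=query_vector, limit=5)
for hit in hits:
    print(round(hit.score, 3), hit.payload.get("title"))
```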
The vision for both databases is to link them through web APIs to online data catalogues and tightly integrate them into the existing (meta-) data Copernicus landscape.
In-situ data collection is a fundamental part of the domain of ecosystem observation and monitoring. Continuous measurements of energy and matter exchanges at the ecosystem/atmosphere boundary by means of the eddy covariance (EC) technique are fundamental observations within the wide range of in-situ measurements, in particular concerning carbon and other greenhouse gases and water balances. Monitoring stations based on this technique, organised in networks at different scales from national to global and providing precious insights to different users, are a standard in the environmental sector. The Integrated Carbon Observation System (ICOS) is one such research infrastructure, working at the European scale. The ecosystem domain of ICOS, dealing with terrestrial observations in natural and anthropic ecosystems, provides not only EC datasets but also numerous meteorological and other auxiliary variables to support the activity of the network. In the present work we describe the portfolio of the main products included in a typical ICOS ecosystem station: from continuous measurements of CO2 and H2O exchanges, to above- and below-ground meteorological parameters, to discontinuous ancillary measurements of different vegetation characteristics – spanning from tree height to above-ground biomass, from soil characteristics to plant area index, from species distribution to litter mass. The continuous datasets are provided at different scales, from half-hourly to yearly. All the datasets, supplemented by a detailed set of metadata ensuring consistency with the FAIR principles (Findability, Accessibility, Interoperability, Reusability), are stored in a safe repository and openly and freely distributed (under a CC-BY 4.0 licence) to researchers, modellers, and any user who requests them.
A raster data cube is a four-dimensional array with dimensions x (longitude / easting), y (latitude / northing), time, and bands, sharing: (1) a single spatial reference system, (2) a constant spatial cell size, (3) a constant temporal duration, and (4) a temporal reference defined by a simple start and end date/time, resulting in a single attribute value for every dimension combination. Building a data cube basically consists of converting raw, irregular raster data into a regular and dense structure, which may involve information loss and needs to consider user definitions and application constraints.
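A minimal sketch of such a regular, dense structure, expressed as an xarray DataArray with synthetic values and coordinates, is shown below.

```python
# Minimal sketch of a four-dimensional raster data cube (time, band, y, x); all values
# and coordinates are synthetic.
import numpy as np
import pandas as pd
import xarray as xr

cube = xr.DataArray(
    np.zeros((12, 3, 100, 100), dtype="float32"),
    dims=("time", "band", "y", "x"),
    coords={
        "time": pd.date_range("2022-01-01", periods=12, freq="MS"),  # constant temporal step
        "band": ["B02", "B03", "B04"],
        "y": np.arange(5_000_000, 5_001_000, 10),                    # constant 10 m cell size
        "x": np.arange(600_000, 601_000, 10),
    },
    attrs={"crs": "EPSG:32632"},                                     # single spatial reference system
    name="reflectance",
)
print(cube)
```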
Heat waves are affecting the population more and more heavily, and this is further enhanced in cities compared to the countryside, where the urban heat island (UHI) effect worsens their duration and intensity. Current research on the estimation of the UHI effect adopts three main approaches: i) observational studies describing its driving processes, spatial patterns and/or magnitude; ii) Earth observation (EO) studies focusing on the surface urban heat island (SUHI) determined from land surface temperature (LST) retrieved from satellite thermal sensors (e.g. Landsat-TIRS, Sentinel-SLSTR); iii) modelling via mesoscale meteorological models like the Weather Research and Forecasting (WRF) model or microscale models (e.g. ENVI-met) that require large computational effort and/or fine tuning of parameters. Municipalities need actionable data to support their decisions. It is thus crucial to develop intermediate approaches for estimating the UHI intensity and spatial extent without the need for advanced expertise to tune parameters or run complex meteorological models but, at the same time, able to provide reliable insight into urban air temperature. The work presented in this contribution is performed in the framework of the USAGE project activities and focuses on providing a pipeline for the development of UHI maps in urban areas utilizing open data such as EO, IoT ground sensor data and surface properties, and a hybrid model based on machine learning and geostatistics. We present a pipeline that can be deployed with minor adaptation (e.g. STAC endpoints) within GIS software environments. The ground sensor data are accessed via the OGC SensorThings API and fed into the analysis. Pre-loaded 'semi-static' layers, such as DTM, DSM, LU/LC, vegetation fraction, urban building morphology and shade maps, are accessed via the OGC API Features and used to spatialize the air temperature at each time stamp received from the IoT sensors. Depending on the revisit time of EO thermal data and its cloud-coverage level, LST observations are integrated to help the spatialization of the ground sensors' temperature, performed using a hybrid model combining machine learning and geostatistics. This allows for faster computation than classical geostatistics while still explicitly handling the spatial correlation of data and errors. The aforementioned pipeline is suitable for deriving UHI maps from given IoT ground sensor data. In addition, to forecast the UHI effect up to 48 h ahead, the pipeline ingests 2-m air temperature, relative humidity, and wind speed and direction from meteorological models (WRF or AROME) and the open-meteo API.
The proposed pipeline is applied in the Alpine valley and city of Trento (Italy) and is then validated against high-resolution simulations with the WRF model, offline-coupled with an urban parameterization scheme to reach a resolution of 100 m. The proposed pipeline can be used not only as a forecasting tool, but also as a UHI mitigation and planning tool, by changing the ‘semi-static’ layers that describe the study area. This allows municipalities to predict the effects of their decisions.
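As a hedged sketch of the forecasting-mode ingestion mentioned above, the snippet below pulls hourly 2-m temperature, relative humidity, and wind from the open-meteo forecast API for Trento; parameter names should be checked against the current API documentation.

```python
# Minimal sketch of fetching 48 h of hourly forecast drivers from the open-meteo API;
# variable names are assumptions to verify against the current documentation.
import requests

resp = requests.get(
    "https://api.open-meteo.com/v1/forecast",
    params={
        "latitude": 46.07, "longitude": 11.12,            # Trento, Italy
        "hourly": "temperature_2m,relative_humidity_2m,"
                  "wind_speed_10m,wind_direction_10m",
        "forecast_days": 2,                                # 48 h horizon
    },
    timeout=30,
)
hourly = resp.json()["hourly"]
print(hourly["time"][:3], hourly["temperature_2m"][:3])
```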
Earth Observation (EO) applications enable decision-makers, researchers, and specialists to understand the phenomena of our planet, allowing global changes to be made from local actions taken by the public and private sectors. With the dissemination and use of Open Data practices, EO applications have been enhanced, allowing numerous works to be developed, ranging from the analysis of anthropic actions on inland waters to the temporal analysis of land use and land cover changes. These advances and improvements in EO applications have made their development complex, requiring several materials to be used together with the data to compose the results. Consequently, organizing, sharing, and preserving these applications and the knowledge within them to enable reproduction and replication has become a challenge. Often, these activities require specific expertise from researchers and specialists, as well as technical infrastructure.
The Group on Earth Observations (GEO) and its community promote Open Data practices, being responsible for defining guidelines and developing the Global Earth Observation System of Systems (GEOSS) that enhances access to EO data. Recently, GEO started the development of a new component of the GEOSS ecosystem, the GEO Knowledge Hub (GKH), to foster the reproduction and replication of EO applications. Created based on the GEO Data Sharing Principles and the GEO Data Management Principles, the GKH allows users to share their EO applications and the underlying resources (e.g., processing scripts, datasets, and description notes), enabling people to understand, reproduce and replicate the shared EO application. In the GKH, the resources of an application can have files (e.g., satellite imagery datasets, in-situ data files) and metadata (e.g., title, authors, spatial location). Furthermore, each resource can be associated with an individual persistent identifier (DOI) created by the GKH, enhancing dissemination and citation.
Applications shared on the GKH can be found and used, making their knowledge accessible. For this, the GKH provides high-level features for organizing the application materials and facilitating their sharing. In addition, the GKH has a powerful search engine that enables textual, thematic (e.g., Sustainable Development Goal-oriented), and spatio-temporal searches. Beyond share and search capabilities, the GKH provides features that facilitate community engagement, such as discussion sections (real-time Q&A) and a feedback system. All these features are accessible through high-level web interfaces and REST APIs, allowing various tools to integrate with the digital repository.
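As a hedged illustration of programmatic access through such a REST API, the sketch below issues a simple records query; the base URL, query parameters and response structure are assumptions to be verified against the GKH documentation.

```python
# Minimal sketch of querying a records-style REST API such as the one exposed by the GKH;
# endpoint, parameters and response layout are assumed, not confirmed.
import requests

resp = requests.get(
    "https://gkhub.earthobservations.org/api/records",   # assumed GKH API endpoint
    params={"q": "land degradation", "size": 5},
    timeout=30,
)
for record in resp.json().get("hits", {}).get("hits", []):
    print(record.get("metadata", {}).get("title"))
```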
The GKH is already being used to share and preserve many EO applications. For example, several GEO Work Programme Activities store their applications in the GKH (e.g., GEOGLAM, GEOVENER, Digital Earth Africa, and many others). These and other use cases have shown positive results, indicating that the GKH can assist in organizing, sharing, and preserving the knowledge generated in EO applications. Therefore, in this workshop, we will introduce the main concepts of the GKH, guidelines, and practices in how users can use it to share and preserve EO applications.
This is the story of two twin projects (namely AIR-BREAK and USAGE) undertaken by Deda Next on dynamic sensor-based data, from self-built air quality stations to the implementation of an OGC-standard-compliant client solution.
In the first half of 2022, within AIR-BREAK project (https://www.uia-initiative.eu/en/uia-cities/ferrara), we involved 10 local high schools to self-build 40 low-cost stations (ca. 200€ each, with off-the-shelf sensors and electronic equipment) for measuring air quality (PM10, PM2.5, CO2) and climate (temperature, humidity).
After assembly, the stations were provided to high schools, private households, private companies and local associations. Measurements are collected every 20 seconds and pushed to the RMAP server (Rete Monitoraggio Ambientale Partecipativo = Participatory Environmental Monitoring Network - https://rmap.cc/).
Hourly average values are then ingested with Apache NiFi into the OGC SensorThings API (aka STA) compliant server of the Municipality of Ferrara (https://iot.comune.fe.it/FROST-Server/v1.1/), based on the open-source FROST solution by the Fraunhofer Institute (https://github.com/FraunhoferIOSB/FROST-Server). STA provides an open, geospatial-enabled and unified way to interconnect Internet of Things (IoT) devices, data and applications over the Web (https://www.ogc.org/standard/sensorthings/).
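For illustration, the sketch below queries that SensorThings endpoint with standard STA query options; entity ids and property names are illustrative and depend on how the stations and datastreams are modelled.

```python
# Minimal sketch of exploring a SensorThings API (STA) endpoint; the datastream id
# used below is a placeholder.
import requests

root = "https://iot.comune.fe.it/FROST-Server/v1.1"

# List a few Things (stations) together with their Datastreams
things = requests.get(f"{root}/Things",
                      params={"$expand": "Datastreams", "$top": "5"},
                      timeout=30).json()["value"]
for thing in things:
    print(thing["name"], [ds["name"] for ds in thing.get("Datastreams", [])])

# Latest observations of one datastream (id 1 is purely illustrative)
obs = requests.get(f"{root}/Datastreams(1)/Observations",
                   params={"$orderby": "phenomenonTime desc", "$top": "10"},
                   timeout=30).json()["value"]
print([o["result"] for o in obs])
```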
In the second half of 2022, within the USAGE project (https://www.usage-project.eu/), we released v1 of a QGIS plugin for the STA protocol.
The plugin enables QGIS to access dynamic data from heterogeneous domains and different sensor/IoT platforms, using the same standard data model and API. Among others, dynamic data collected by the Municipality of Ferrara will be CC-BY licensed and made accessible from municipal open data portal (https://dati.comune.fe.it/).
The Green Deal Data Space (GDDS) will interconnect currently fragmented and dispersed data from various ecosystems, from both the private and public sectors, to facilitate evidence-based decisions and expand the capacity to understand and tackle environmental challenges, for example for monitoring and reaching environmental objectives in biodiversity, resilience to climate change, the circular economy and zero-pollution strategies.
This workshop brings together the projects of the EuroGEO Action Group for the Green Deal Data Space in their quest to push the boundaries of data provision and to ensure that FAIR and TRUSTworthy data are available for building a more sustainable future. Some outcomes of the workshop may contribute to the new ad-hoc ISO/TC 211 working group on data spaces. Within this workshop, the following projects will present their current approaches towards enabling the GDDS:
AD4GD: The aim is to integrate standard data sources (e.g. in-situ, remote sensing, citizen science, IoT, AI) in the GDDS, improve semantic interoperability, and demonstrate with concrete examples that general problems in climate change, zero pollution and biodiversity can be solved.
FAIRiCUBE: The core objective is to enable players from beyond classic Earth Observation domains to provide, access, process, and share gridded data and algorithms in a FAIR and TRUSTable manner. We are creating the FAIRiCUBE HUB, a crosscutting platform and framework for data ingestion, provision, analysis, processing, and dissemination, to unleash the potential of environmental, biodiversity and climate data through dedicated European data spaces.
USAGE (Urban Data Space for Green Deal) will provide solutions for making city-level data (Earth Observation, Internet of Things, authoritative and crowdsourced data) available, based on FAIR principles: innovative governance mechanisms, standard-based structures and services, AI-based tools, semantics-based solutions, and data analytics. It will provide decision makers with effective, interoperable tools to address environmental and climate changes-related challenges.
B³: Global biodiversity is changing under multiple pressures, including climate change, invasive species and land-use change. Yet biodiversity data are complex and heterogeneous, making it difficult to understand what is happening fast enough for decision makers to react with evidence-based policies. To solve this, B³ will create open workflows in a cloud computing environment to rapidly and repeatedly generate policy-relevant indicators and models of biodiversity change.
GREAT: Funded by the Digital Europe programme, GREAT aims to establish the Green Deal Data Space Foundation and its Community of Practice, which builds on both the European Green Deal and the EU’s Strategy for Data. The project will deliver a roadmap for implementing and deploying the Green Deal Data Space, an infrastructure that will allow data providers and initiatives to openly share their data to tackle climate change in a multidisciplinary manner.
The Open Earth Monitor Consortium is working to contribute infrastructure to the GDDS.
Some remote sensing signals provide valuable information only at spatial scales that are too coarse for various applications. Examples include sun-induced chlorophyll fluorescence (SIF), whose signal is related to the gross primary productivity (GPP) of vegetation and how it is impacted by stress, and vegetation optical depth (VOD), which is related to how water content is distributed within a canopy and is thus informative on forest biomass and structure. Within OEMC WP6, we are developing an EO spatial downscaling framework that will be specifically tailored towards improving carbon flux estimations. The spatial downscaling will employ finer-resolution EO variables to achieve this super-resolution, but this will not be done merely numerically. It will instead combine our process-based knowledge of how these variables relate to each other to develop a hybrid modelling approach with knowledge-guided AI. This will further be implemented using a moving-window adaptive approach over a spherical grid, made possible by novel developments from OEMC WP3. The use case within OEMC is to develop SIF-based 1 km spatial resolution GPP flux estimations based on measurements from the TROPOMI sensor on Sentinel-5P, which has a spatial resolution coarser than 5 km. The main stakeholder for this product will be the Global Carbon Project (GCP), and in particular RECCAP, which aims to establish the greenhouse gas (GHG) budgets of large regions covering the entire globe at the scale of continents. Such an endeavour would greatly benefit from the fine-level spatialized GPP fluxes we aim to provide. This use case will largely leverage in-situ data provided in OEMC WP4 for validation and calibration, particularly once specific developments to optimize the matching between in-situ points and remote sensing grids are achieved. This presentation will detail the blueprint for this task, which will combine synergistic efforts across various elements within the OEMC project.
Climate change is profoundly affecting the global water cycle, increasing the likelihood and severity of extreme water-related events. Better decision support systems are essential to accurately predict and monitor water-related environmental disasters and to manage water resources optimally. These will need to integrate advances in remote sensing, in-situ and citizen observations with high-resolution Earth system modelling, artificial intelligence, information and communication technologies and high-performance computing.
The Digital Twin of the Earth (DTE) for the water cycle is a breakthrough solution that provides digital replicas to monitor and simulate Earth processes with unprecedented spatial-temporal resolution, explicitly including the human component in the system. To reach this target, advances in observation technology (satellite and in-situ) and modelling are pivotal. The workshop will serve the community to assess the state of the art of these technologies and to identify challenges to be addressed in the near future.
As land is increasingly degrading, robust monitoring approaches are required to identify land degradation processes and ultimately tackle them. Land degradation is commonly assessed by comparison with the immediate past or surroundings. Comparison with the natural potential, however, i.e. a state of minimal human impact, could give further insight into the full degree of degradation, accounting for the “shifting baseline syndrome”, the gradual change in perception of what is considered the reference. Primary production is one of the key indicators for determining impacts of land degradation; it can be approximated by FAPAR, the fraction of absorbed photosynthetically active radiation, a metric directly related to primary productivity. Here, we present a novel methodology to assess land degradation in reference to its natural potential. Using a machine learning model approach, global time-series maps spanning 2000 - 2022+ will be generated by simulating potential natural FAPAR in the hypothetical space of minimal human impact. This will allow performing gap analyses of actual and potential natural FAPAR to monitor impacts of land degradation and restoration efforts through time. Use-case scenarios at country level and project investment level will be demonstrated in the context of supporting UNCCD targets for land degradation neutrality (LDN). This research is carried out within the Open Earth Monitor Cyberinfrastructure project (OEMC) and received funding via the European Union's Horizon Europe programme under grant agreement No. 101059548.
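As a hedged, purely illustrative sketch of the gap-analysis idea (not the project's actual model or data), the snippet below trains a regressor on low-impact samples, predicts potential natural FAPAR everywhere, and compares it with the observed values.

```python
# Illustrative sketch with synthetic data: learn the natural-potential response from
# minimally impacted samples, then compare predicted potential FAPAR with observed FAPAR.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.random((10_000, 6))            # environmental covariates per pixel (climate, terrain, soil)
fapar = rng.random(10_000)             # observed FAPAR
low_impact = rng.random(10_000) > 0.7  # mask of pixels with minimal human influence

model = RandomForestRegressor(n_estimators=200, n_jobs=-1)
model.fit(X[low_impact], fapar[low_impact])     # fit only where human impact is minimal

potential_fapar = model.predict(X)              # potential natural FAPAR for every pixel
degradation_gap = potential_fapar - fapar       # positive gap hints at reduced productivity
```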
openEO develops an open API to connect R, Python, JavaScript and other clients to big Earth observation cloud back-ends in a simple and unified way.
openEO Platform implements the openEO API in a federated cloud platform, allowing users to process a wide variety of Earth observation datasets in the cloud.
Users interact with the API through clients. This demonstration shows the usage and capabilities of the main clients: The Web Editor, the Python Client and the R-Client.
The Web Editor is a web tool to interactively build processing chains by connecting openEO processes visually. This is the most intuitive way to get started with openEO Platform.
The Python Client and the R Client are the openEO Platform entry points for programmers. They are available via the Python Package Index (PyPI) and the Comprehensive R Archive Network (CRAN), respectively. They facilitate the interaction with the openEO API within the respective programming languages and integrate the advantages of the available geospatial packages and typical IDEs.
The classroom training teaches users how to accomplish their first round trip through a typical openEO Platform workflow: login to openEO Platform, data and process discovery, process graph building adapted to common use cases, processing of data, and visualization of results.
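A hedged sketch of such a round trip with the Python Client is shown below; the collection id, extents and band names are illustrative and depend on the back-end.

```python
# Minimal sketch of an openEO Python client round trip; collection, extents and bands
# are illustrative.
import openeo

connection = openeo.connect("https://openeo.cloud")    # openEO Platform entry point
connection.authenticate_oidc()                         # interactive login

cube = connection.load_collection(
    "SENTINEL2_L2A",
    spatial_extent={"west": 11.0, "south": 46.0, "east": 11.2, "north": 46.1},
    temporal_extent=["2022-06-01", "2022-08-31"],
    bands=["B04", "B08"],
)
ndvi = cube.ndvi(nir="B08", red="B04")                 # band-math process on the server side
composite = ndvi.max_time()                            # temporal maximum composite
composite.download("ndvi_summer_max.tiff")             # synchronous processing and download
```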
By combining the approaches of the visually interactive Web Editor and the programming-based clients, users are introduced stepwise to the concepts of openEO Platform and will gradually understand the logic behind openEO.
Extreme weather events and pest outbreaks are occurring more and more frequently and are affecting forests throughout the Alpine region. Efficient and targeted forest management on a municipal, provincial or regional scale requires high-quality, information-rich remote sensing data. Airborne hyperspectral imagery enables the acquisition of high-resolution data over entire portions of land in a short space of time, whereby the high amount of spectral information provides efficient tools for forest managers, such as mapping forest species, identifying invasive species, mapping bark beetle damage, calculating narrowband vegetation indices and analyzing health status. AVT Airborne Sensing Italia (AVT-ASI) uses the Specim AisaFENIX sensor for hyperspectral image acquisition. The sensor works in the VNIR and SWIR spectral ranges and acquires 384 bands in pushbroom mode. At the beginning of October 2022, AVT-ASI acquired hyperspectral images over a forest area of about 350 km² near Bruneck, in the province of South Tyrol, Italy (Figure 1a and b), to support the local forestry inspectorate in the evaluation of forest health status. Indeed, the area of interest has been strongly affected by the spread of the bark beetle, probably due to the damage caused by the Vaia storm in 2018, followed by dry periods. The AisaFENIX images were preprocessed to correct atmospheric, radiometric and geometric effects, and then the most frequent tree species were mapped with machine learning algorithms. For the Picea abies (Norway spruce) class, further analyses were conducted using multiple narrowband vegetation indices (Figure 1c to h) in order to assess the health status of the trees (Figure 2) and detect the effects of the presence of the bark beetle. The results have been validated by the forestry inspectorate with ground surveys. The georeferenced thematic products obtained from the hyperspectral aerial images proved very useful for optimal forest management, in particular for the identification of possibly infected trees at an early stage (green attack) and the implementation of mitigation measures. The information was available at a degree of accuracy that is not achievable with VNIR + SWIR images acquired by satellite platforms, due to their lower spatial resolution. However, the availability of regular and frequent satellite images has the potential to allow for temporal analysis and change monitoring, starting from the detailed as-is situation obtained from the aerial hyperspectral images.
The presentation will show the scientific approach of the work done and critically discuss the achieved accuracy in the intermediate and final products.
Estimating crop yields in a timely manner is pivotal for official statistics on agricultural productivity and for informing policy-making on sustainable food production. Existing approaches of collecting yield data annually from a large number of farms are resource-intensive, though. Official crop yield statistics in Germany, for instance, rely heavily on extensive and time-consuming farm surveys and on-farm measurements.
The EU's Copernicus Earth observation (EO) program provides a plethora of satellite data, enabling the remotely sensed monitoring of agricultural land at high spatio-temporal resolution. EO imagery, open geospatial data on meteorological conditions and soil properties, as well as advances in machine learning (ML), provide huge opportunities for model-based crop yield estimation, covering large spatial scales with unprecedented granularity. Managing the vast amounts of multi-source data required for yield modelling remains a challenge, though, particularly for public authorities. We present a model-based approach to estimate yields of multiple major crops cultivated in Germany, employing ML ensembles on a cloud-integrated spatial data infrastructure (SDI). Our SDI is built on interconnected components linking EO cloud computation and data storage, using the CODE-DE platform, with internal data cubes through web services.
Our model-based yield estimation approach integrates a number of dynamic and static predictors. Analysis-ready data from multi-spectral Sentinel-2 imagery are used for space-borne retrieval of crop traits such as leaf area index and above-ground biomass. Geospatial data on meteorological time-series are queried from our data cube, providing daily variables such as temperature, precipitation, and global radiation. External geospatial data on soil moisture and physicochemical soil properties are obtained from the Copernicus Global Land Service and SoilGrids 2.0 data portals, respectively. Crop-specific ML models are trained on multi-annual data (2018-2022) collected at agricultural parcel level for three crops, i.e. winter wheat, winter barley, and winter rape. The ensemble of ML regressors employed includes gradient-boosted trees (CatBoost, LightGBM, XGBoost), Partial Least Squares, Random Forest, and Support Vector Machines. Parcel geometries obtained from the Integrated Administration and Control System (IACS) enable the spatially scaled application of trained yield models, covering larger administrative regions represented by two federal states.
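A minimal sketch of such a heterogeneous regression ensemble with scikit-learn is shown below; the placeholder features, the specific regressors and the simple averaging strategy are illustrative assumptions rather than the exact configuration used in this work.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, HistGradientBoostingRegressor, VotingRegressor
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

# Placeholder parcel-level data: predictors (crop traits, weather, soil) and yields
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 12))
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=500)

# Averaging ensemble of heterogeneous regressors (gradient boosting, random forest, SVM)
ensemble = VotingRegressor([
    ("gbt", HistGradientBoostingRegressor(random_state=0)),
    ("rf", RandomForestRegressor(n_estimators=300, random_state=0)),
    ("svr", make_pipeline(StandardScaler(), SVR(C=10.0))),
])

# Parcel-level cross-validation, reporting R² and RMSE as in the abstract
r2 = cross_val_score(ensemble, X, y, cv=5, scoring="r2")
rmse = -cross_val_score(ensemble, X, y, cv=5, scoring="neg_root_mean_squared_error")
print(f"R²: {r2.mean():.2f}, RMSE: {rmse.mean():.2f}")
```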
R² values of the best performing models, inferred from cross-validation at parcel level, range between 0.67 and 0.74; the corresponding normalized RMSE (nRMSE) values range between 12% and 19%. Aggregated yield estimates at district level, compared against mean district-level yields obtained from official yield statistics for 2020 and 2021, show R² values for the best performing models between 0.57 and 0.85, with nRMSE values between 5% and 10%.
Preliminary results are promising, suggesting several advantages compared to traditional yield estimation approaches regarding area coverage, cost effectiveness, and timeliness. Our cloud-integrated SDI serves as a backbone that enables full scalability of crop yield estimation at national scale. However, high quality training data inferred from a representative sampling across the country and open data access for IACS parcel geometries are required to lift current scalability barriers.
In recent years, several new satellite constellations have been put into service. This, together with the new policies for open data distribution, dramatically increased the availability of time-series with high temporal resolution.
The new widespread availability of high temporal resolution imagery has led to a paradigm shift from change detection techniques, where pairs of images are compared in search of abrupt changes (e.g. forest fires, forest cuts), to methods capable of tracking changes continuously in time. In particular, time-series allow for the monitoring of subtle and gradual changes for which the definition of a pre- and post-event date is not straightforward (e.g. vegetation stress caused by drought, bark beetle outbreaks) and of anthropogenic processes happening at a finer timescale (e.g. mowing events).
Such data availability, together with increasingly easy access to offline computing power, cloud-based computing platforms and new tools for data processing, is leading to the development of a wide variety of applications for near real-time monitoring using Earth Observation (EO) data, intended to support decision-making processes (e.g. forest management) by stakeholders such as government agencies. In this context, we present monitoring tools, implemented on the Google Earth Engine platform, that exploit spaceborne EO data to support decision making in Alpine environments affected by two threats connected to global change: pest outbreaks and land use intensification.
After the Vaia storm in 2018, bark beetle outbreaks have become more frequent in the Alps, with estimates, at the end of 2022, of 8,000 hectares infested in the Trento province alone. Such phenomena must be monitored by detecting both past and new outbreaks. This is critical for the definition of recovery strategies for the affected areas and of mitigation strategies to limit the spread of new outbreaks. The developed tool analyzes long Sentinel-2 time-series to map bark beetle outbreaks, generating a product that identifies the areas hit by an attack and the first year and month of detection. By processing new images as they are acquired, it performs near real-time monitoring, highlighting new attacks as soon as they become visible in the satellite data. This tool is currently being used by the Forest Service of the Province of Trento, which provides the generated products to the local stations.
The second tool we present uses vegetation index time-series derived from Sentinel-2 imagery to estimate grassland mowing frequency. Grasslands in Europe are facing management intensification in accessible areas and abandonment in marginal ones, with significant consequences not only for grassland productivity, but also for fodder quality, nitrogen leaching, animal and plant diversity and grassland recreational value. For these reasons, the availability of grassland mowing frequency data can contribute to the development of more targeted conservation and management measures. The model is now being used in several research and management contexts, including the monitoring of CAP subsidy conditionality and the identification of suitable habitat for endangered ground-nesting birds.
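As an illustration of the kind of parcel-level NDVI time series these tools build on, the following Earth Engine Python API sketch extracts a Sentinel-2 NDVI series for a single parcel; the geometry, dates and cloud threshold are illustrative assumptions, not the actual implementation of the tools.

```python
import ee

ee.Initialize()  # assumes prior authentication with `earthengine authenticate`

# Example parcel (illustrative coordinates in the Alps)
parcel = ee.Geometry.Rectangle([11.10, 46.05, 11.12, 46.07])

def add_ndvi(img):
    # NDVI from Sentinel-2 bands B8 (NIR) and B4 (red)
    return img.addBands(img.normalizedDifference(["B8", "B4"]).rename("NDVI"))

s2 = (
    ee.ImageCollection("COPERNICUS/S2_SR_HARMONIZED")
    .filterBounds(parcel)
    .filterDate("2022-03-01", "2022-10-31")
    .filter(ee.Filter.lt("CLOUDY_PIXEL_PERCENTAGE", 30))
    .map(add_ndvi)
)

def parcel_mean(img):
    # Mean NDVI over the parcel; drops in this series can indicate mowing events
    # or canopy decline (e.g. a bark beetle attack)
    mean = img.select("NDVI").reduceRegion(ee.Reducer.mean(), parcel, 10)
    return ee.Feature(None, {"date": img.date().format("YYYY-MM-dd"), "ndvi": mean.get("NDVI")})

series = ee.FeatureCollection(s2.map(parcel_mean))
print(series.aggregate_array("ndvi").getInfo())
```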
Land cover changes affect the climate system at local, regional, and global scales. Previous studies have indicated that changes in vegetation distribution have an impact on land surface temperature and the energy balance at local and global scales. Assessing the effect of land cover change on climate variables is a fundamental step in understanding how deforestation and reforestation processes will impact climate dynamics. At the same time, this knowledge can be of utmost importance for the design of reforestation or afforestation plans, such as those envisaged within the European Union's Green New Deal, but also more generally, in any part of the world. In this talk, we will present some preliminary results on the effect of land cover change on climate variables for Europe and Africa. To carry out these studies, we develop a first technical implementation of the space-for-time technique in the programming language Julia. The space-for-time technique estimates the average change in local climate if the land cover changes from one class to another. For example, savannas are usually hotter than neighboring forests, so by contrasting the local climate conditions we can evaluate the potential effect of a transition from forest to savanna. In the current state of Earth observation using satellite imagery, downloading large amounts of information is no longer feasible, because data volumes are many times larger than the storage and processing capacity of research institutions and organizations. In this context, the development of software compatible with cloud computing infrastructure is more important than ever. Julia is a dynamic programming language focused on high-performance computation and easy scalability, following the philosophy of "write like Python, run like C". Despite being a relatively new programming language (11 years old), the use of Julia in science and cloud computing has grown rapidly in recent years, as more and more institutes and companies adopt it. For these reasons, our implementation of the space-for-time technique in the Julia programming language will allow scientists and organizations to efficiently perform the analysis from laptops to remote servers and platforms.
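As a conceptual illustration (in plain Python/NumPy rather than the Julia implementation described above), the space-for-time idea can be sketched as contrasting a climate variable between two land-cover classes within local moving windows:

```python
import numpy as np

def space_for_time(lst, landcover, cls_from, cls_to, window=5):
    """Mean local difference in a climate variable (e.g. land surface temperature)
    between pixels of class `cls_to` and `cls_from` inside a moving window.
    A positive value means the transition cls_from -> cls_to locally warms the surface."""
    half = window // 2
    rows, cols = lst.shape
    diffs = []
    for i in range(half, rows - half):
        for j in range(half, cols - half):
            win_lst = lst[i - half:i + half + 1, j - half:j + half + 1]
            win_lc = landcover[i - half:i + half + 1, j - half:j + half + 1]
            if (win_lc == cls_from).any() and (win_lc == cls_to).any():
                diffs.append(win_lst[win_lc == cls_to].mean() - win_lst[win_lc == cls_from].mean())
    return float(np.mean(diffs)) if diffs else np.nan

# Toy example: 1 = forest, 2 = savanna, LST in °C
rng = np.random.default_rng(0)
landcover = rng.choice([1, 2], size=(50, 50))
lst = 25 + 3 * (landcover == 2) + rng.normal(scale=0.5, size=(50, 50))
print(space_for_time(lst, landcover, cls_from=1, cls_to=2))  # roughly +3 °C
```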
In this talk we introduce a community-led initiative, part of the wider Open Innovation framework at the European Space Agency, that works to develop an open, interactive and intuitive platform providing a constantly updated, comprehensive and detailed overview of the dynamic ecosystem of open source digital infrastructure for geospatial data storage, processing and visualisation. OSS4gEO is designed as a repository that functions as an extended metadata catalogue, curated by the community, and as a tool for metrics computation, visualisation, ecosystem statistical analysis and reporting.
The initial development of the Open Source for Geospatial Software Resources platform builds on previous extensive work started in 2016 that has materialised into a pioneering overview of open source geospatial solutions, voluntarily updated by the team. Starting in 2023, OSS4gEO has become part of a wider ESA Earth Observation (EO) Open Innovation initiative to actively support and contribute to the EO and geospatial open source community, and it is intended as a seed action to better understand, represent and harvest the geospatial open source ecosystem.
OSS4gEO aims to achieve three main objectives:
(1) to offer an informed and as complete as possible overview of the open source ecosystem for geospatial and EO, together with various filtering and visualisation capabilities within the platform, as well as technical solutions to programmatically access and extract data from the database (APIs) for any purpose, including commercial use;
(2) to provide guidance through the complexity of the geospatial ecosystem so that users can choose the best solutions while understanding their sustainability, their technical and legal interoperability and all levels of dependencies;
(3) to serve as a platform for community building and for promoting and maintaining new and innovative open source solutions for EO and geospatial applications, developed within various projects, research centres, small or large companies, universities or through individual initiatives.
In the framework of Earth Observation (EO), in-situ measurements are a fundamental pillar for the characterisation of ecosystem behaviour. Vegetation responses to stressors, trends and changes in ecological functioning, and species abundance and characterisation are only a few examples of currently available in-situ datasets worldwide. The combination of in-situ time series with other EO products, such as remote sensing and other geospatial datasets, gives rise to the possibility of characterising, modeling and predicting ecosystem functionalities and dynamics from local to global scales. In this framework, the Horizon Europe funded project Open-Earth-Monitor Cyberinfrastructure (OEMC) aims at collecting a wide range of such datasets, elaborating them together with other EO products, and creating specific technological tools to ease their sharing and usability. A consistent part of the project is dedicated to gathering and analysing a huge portfolio of in-situ datasets, characterized by a large variety of data types, scales, accuracy levels and documentation. In-situ observations potentially available to the project span from continuous monitoring (e.g. greenhouse gas fluxes) to sampling campaigns (e.g. species distribution), from half-hourly to yearly scales, from highly standardised datasets to citizen science observations, from remote sensing datacubes to single-tree measurements, from vegetation to fauna checklists, from terrestrial to freshwater habitats, and so forth. The need for harmonisation is huge, especially concerning the relevant metadata. In the present poster we report on the main characteristics of these in-situ datasets, including their spatial and temporal scales, accessibility, format and standardization.
Discrete Global Grid Systems (DGGS) tessellate the surface of the earth with hierarchical cells of equal area, minimizing distortion and loading time of large geospatial datasets, which is crucial in spatial statistics and in building Machine Learning models. Successful applications of DGGS include the prediction of flood events by integrating remote sensing data sets of different resolutions, as well as vector data. Here we present DGGS.jl, an analysis framework for scalable geospatial analysis written in the Julia programming language. Bindings to the C++ library DGGRID were created to convert between geographical coordinates and DGGS cell ids, as well as to provide several projections and grids. An efficient data structure and chunking scheme based on data cubes and Zarr arrays was created to store remote sensing data of different resolutions, structured in accordance with the selected grid. This provides the basis for fast and accurate ML modeling, especially distortion-free and spatially aware Graph Convolutional Neural Networks. Furthermore, the hierarchical cell structure of a DGGS enables multiscale modeling, in which regions of interest can be represented at a higher resolution than others.
The European Commission Knowledge Centre on Earth Observation (KCEO) is developing two use cases, aiming to provide sustained last-mile products and services emerging from the analysis of EU policy needs, EU Framework Programmes for Research and Innovation and GEO activities.
The first use case to be transferred into a sustained product and service has emerged from the evaluation of research projects connected to EuroGEO. In this respect, the EuroGEO Showcases: Applications Powered by Europe (e-shape) project, funded under the H2020 programme and aiming to ensure the optimal implementation of EuroGEO, is the primary project to evaluate. The pilots of the e-shape project were the candidate use cases considered for implementation and further development by KCEO. They have been evaluated against criteria covering policy relevance and technical aspects, such as data sources, infrastructures and the European principles related to these. As a result of the evaluation, KCEO selected the photovoltaic energy assessment at urban scale pilot, led by ARMINES, as the first adopted project. Through the implementation of use cases in a prototypical EuroGEOSS virtual ecosystem, it will also be possible to define more thoroughly the good practices and technologies to be used in the future, operational EuroGEOSS. The use case will be shaped according to the needs of the policy Directorates-General (DGs) through the KCEO Deep Dive on Climate Change Adaptation in Urban Areas, in collaboration with ARMINES.
The second use case focuses on monitoring wetlands’ change and degradation processes across EU Natura 2000 (N2K) sites. The use case has been developed from the identification of DG Environment (ENV) policy needs at the policy implementation and evaluation stages of the Habitats Directive and implements the results of the chapter on wetlands of the KCEO Deep Dive on Biodiversity assessment (upcoming science for policy report in 2023). The goal of the use case is to cover the last mile of the EO value chain to enable the full exploitation of EO products and derived products and to foster their uptake in policy making. This should result from 1) requirements translation and co-development with DG ENV stakeholders, 2) fitness for purpose analysis of existing products and applications, 3) co-design of the web application information content, graphics and features, 4) identification of gaps and recommendations for the improvement of products and 5) provision of a working prototype as a proof of concept for an operational service. The project will be framed around three spatial scales of interest: pan-European, river basin and N2K site and four application services: habitat mapping, pressure and condition trend analysis, pressure and condition monitoring and hotspot analysis. The use case acknowledges the importance of integrating EO data from multiple sensors with other data, including hydrological modelling outputs.
The adoption and development of these use cases will answer specific needs of EU policies, increase the use of Copernicus data and services, and provide visibility to the projects identified, generating more value from the investments already made.
In the context of global change, the biosphere has been experiencing systematic changes. Biospheric changes are not only linked to the atmosphere via the variability and change of climate but also to socio-economic drivers. Socio-economic and climate drivers may both act either slowly, causing trends, or abruptly, causing extreme events, shocks, or tipping points. However, the relative importance of climate and socio-economic factors for biosphere dynamics, and their moderating mechanisms, is not well understood and may vary spatially. To gain insights into the links between climate, biosphere, and society, we study the relationships between the trajectories in the three domains by analyzing multi-stream global data from 2001-2020, including a biospheric and climate data cube and subnational socio-economic data. We hypothesize that climatic change is a globally widespread driver, relevant almost everywhere, but can additionally be mediated by socio-economic drivers (e.g. land-use and freshwater management). Another hypothesis is that socio-economic shocks can lead to unsystematic shifts in biospheric resilience after climate extremes. This study aims to quantify biospheric responses to climate and society from data-driven signals and has essential implications for understanding human footprints on biospheric changes globally.
This workshop focuses on modern architectures built around cloud computing intended for the processing of EO data and the facilitation of relevant applications. Among the tools discussed are machine learning enabled modules that support data classification, annotation and compression. Such tools, combined with data fusion and semantic information processing, transform primitive EO data into meaning-rich data sets that directly match application (vertical) requirements. This suite of tools is analytically presented and discussed in detail throughout the workshop.
OpenLandMap is a not-for-profit open data system providing data and services to help produce and share the most up-to-date, fully documented (potentially to the level of full reproducibility) data sets on the actual and potential status of multiple environmental variables. The layers include soil properties/classes, relief, geology, land cover/use/degradation, climate, and current and potential vegetation, served through a simple web-mapping interface allowing for interactive queries and overlays. This is a genuine Open Land Data and Services system where anyone can contribute and share global maps and make them accessible to hundreds of thousands of researchers and businesses. We currently host about 15 TB of data, including 1 km daily and monthly climatic products (minimum and maximum temperature and precipitation), maps of potential natural vegetation, 250 m MODIS Terra products, 100 m land cover, land use and soil property maps, 30 m land cover maps and digital terrain parameters. We are inspired by de-centralized open source projects such as Mastodon, OpenStreetMap, and OSGeo projects including the R project for statistical computing.
Urban land use and surface properties play a major role in determining the quality of life of citizens and are key inputs for urban planning. They also strongly shape the impact of extreme events and phenomena, such as flash flooding caused by heavy rain over a highly impermeabilized city, Urban Heat Island (UHI) intensification during heat waves, or biodiversity reduction in green areas.
To allow the study of such phenomena, high-resolution aerial and terrestrial data acquired with different sensors can be merged and processed with Machine Learning (ML) approaches in order to describe and predict the state of urban landscapes and create data-driven actionable insights.
Within the USAGE - Urban Data Space for Green Deal - project [https://www.usage-project.eu/], this work investigates the integration of multi-source data in urban areas for environmental analyses. Two pilot cities are considered, Graz (Austria) and Ferrara (Italy), where multispectral, thermal, hyperspectral and LiDAR data were acquired from aerial flights.
Firstly, all multi-modal data were processed and co-registered in order to align them. The proposed workflow then uses aerial hyperspectral images to classify surface materials with ML algorithms (typically 16-18 classes), thanks to the availability of spectral information in the VNIR and SWIR ranges. The material properties are used to support the calculation of land surface temperatures (LST) from the aerial thermal images acquired in the LWIR range. This step is critical in the case of special materials, such as metals, that produce false temperature values in the thermal images given their low emissivity. As ground truth for the LST estimation, a series of ground measurements were performed during the thermal flights. The comparison of the temperatures measured on the ground and from the thermal camera underlines the influence of the atmosphere, and therefore the need for rigorous modeling to correct atmospheric absorption and scattering. The LST values derived from aerial thermal images are compared to those retrieved from Landsat TIRS images, in order to characterize the representativeness of the Landsat pixel over the urban landscape. Finally, the LiDAR point clouds can be enriched with the outcomes of the thermal and hyperspectral analyses for a more realistic and exploitable visualization of the territory.
The proposed workflow was first tested on the data acquired over the city of Graz in 2021, then replicated and validated on the data acquired over the city of Ferrara in 2022. The methodology could be replicated in other, similar cities to gain further insight from LST retrieved from Earth Observation data that are coarser in space (~70 m) but offer a higher revisit frequency, thus allowing the evolution of land cover within the urban environment to be monitored.
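For illustration, a commonly used single-channel approximation for emissivity correction of brightness temperatures is sketched below; this is an assumption for the sake of the example, as the project may apply a more rigorous atmospheric and emissivity correction.

```python
import numpy as np

def brightness_to_lst(t_b, emissivity, wavelength=10.9e-6):
    """Emissivity-corrected land surface temperature (K) from brightness temperature (K).

    t_b        : at-sensor brightness temperature in Kelvin
    emissivity : surface emissivity (e.g. ~0.97 for vegetation, much lower for bare metal)
    wavelength : effective wavelength of the thermal band in metres (illustrative value)
    """
    rho = 1.438e-2  # h * c / k_B in m*K
    return t_b / (1.0 + (wavelength * t_b / rho) * np.log(emissivity))

# Low-emissivity materials such as metal roofs require a much larger correction,
# which is why the material classification supports the LST retrieval
print(brightness_to_lst(np.array([300.0, 300.0]), np.array([0.97, 0.60])))
```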
The System of Environmental Economic Accounting (SEEA) has been developed by the global statistical community under the auspices of the United Nations Statistics Division. It comprises two complementary parts, the Central Framework and Ecosystem Accounting, that jointly allow recording and analysing environmental data and connecting environmental information to economic activities, including indicators such as GDP. The SEEA uses a suite of connected physical and monetary indicators that are included in spatial datasets as well as accounting tables. This presentation will introduce the audience to the SEEA, show how spatial data (mainly from Earth Observation) are used in the SEEA, and demonstrate how the SEEA can be used to bring spatial data to a wide range of users. It will also link the SEEA to various EU policy initiatives, including the Green Deal.
This talk will discuss how Brazil's National Institute for Space Research (INPE) is transitioning from visual interpretation to automatic classification in its Amazon deforestation monitoring system. The presentation will discuss machine learning methods using time series of Sentinel-2/2A images that have reached the same accuracy as remote sensing experts.
Global land use and land cover monitoring is crucial for understanding and addressing the impacts of land use change on the environment and society. The World Resources Institute’s Land and Carbon Lab is dedicated to advancing this field through the development of cutting-edge monitoring tools, technologies, and partnerships. In this presentation, we will showcase and review available products for global land use and land cover monitoring and highlight the ways in which these products are being leveraged to drive positive change. We will explore the challenges of aligning Earth Observation data with policy, including spatial and temporal challenges as well as challenges aligning land use definitions to land cover monitoring products. The presentation will conclude with a discussion of the future directions for this exciting field, and the opportunities to enhance its impact and value to stakeholders.
In response to the global climate and sustainability crisis, many countries have expressed ambitious goals in terms of carbon neutrality and a green economy. In this context, the European Green Deal comprises several policy elements aimed at achieving carbon neutrality by 2050.
ESA is initiating various efforts to leverage space technologies and data in support of the Green Deal ambitions. The ESA Space for Green Future (S4GF) Accelerator will explore new mechanisms to promote the use of space technologies and advanced modelling approaches for scenario investigations on the Green Transition of economy and society.
A central element of the S4GF Accelerator is the Green Transition Information Factories (GTIF). GTIF takes advantage of Earth Observation (EO) capabilities, geospatial and digital platform technologies, as well as cutting-edge analytics to generate actionable knowledge and decision support in the context of the Green Transition.
A first national-scale GTIF demonstrator has now been developed for Austria. It addresses the information needs and national priorities for the Green Deal in Austria, identified through a bottom-up consultation and co-creation process with various national stakeholders and expert entities. These requirements are then matched with various EO industry teams.
The current GTIF demonstrator for Austria (GTIF-AT) builds on top of federated European cloud services, providing efficient access to key EO data repositories and rich interdisciplinary datasets. GTIF-AT initially addresses five Green Transition domains: (1) Energy Transition, (2) Mobility Transition, (3) Sustainable Cities, (4) Carbon Accounting and (5) EO Adaptation Services.
For each of these domains, scientific narratives are provided and elaborated using scrollytelling technologies. The GTIF interactive explore tools allow various users to explore the domains and subdomains in more detail and to better understand the challenges, complexities, and underlying socio-economic and environmental conflicts. They combine domain-specific scientific results with intuitive graphical user interfaces and modern frontend technologies. In the Energy Transition domain, users can interactively investigate the suitability of locations at 10 m resolution for the expansion of renewable (wind or solar) energy production. The tools also allow investigation of the underlying conflicts, e.g. with existing land uses or biodiversity constraints. Satellite-based altimetry is used to dynamically monitor the water levels in hydropower reservoirs to infer the related energy storage potential. In the Sustainable Cities domain, users can investigate photovoltaic installations on rooftops and assess their suitability in terms of roof geometry and expected energy yields.
GTIF enables users to inform themselves and interactively investigate the challenges and opportunities related to the Green Transition ambitions. This enables, for example, citizens to engage in the discussion process around renewable energy expansion, or supports energy start-ups in developing new services. The GTIF development follows an open science and open source approach, and several new GTIF instances are planned for the coming years, addressing the Green Deal information needs and accelerating the Green Transition. This presentation will showcase some of the GTIF interactive explore tools and provide an outlook on future efforts.
The monitoring of soil organic carbon (SOC) in cropland is crucial, as soil health plays an important role in ensuring sustainable agricultural productivity and in reducing carbon emissions. Remote sensing technologies offer a promising opportunity to monitor soil properties on a large scale. Bare soil maps can be derived from satellite images and used to estimate soil properties, like the SOC content, based on spectral properties. However, analyzing such data requires processing large amounts of information from different sensors and time frames, which can be challenging. The Framework for Operational Radiometric Correction for Environmental Monitoring (FORCE) offers an interface to handle and analyze satellite data cubes and has been successfully used to create analysis-ready data in numerous studies. This work focuses on the generation of soil reflectance composites based on the FORCE data cube and the construction of a bare soil data cube to estimate trends of dynamic soil properties with spatio-temporal machine learning.
Using FORCE, the generation of bare soil maps is optimized for large areas and different time intervals. This can be challenging, as soil properties across large scales are diverse, especially in regions with extreme conditions. New methods such as spatial filters and dynamic thresholds are implemented to improve the representation of different soil regions and to reduce the noise of the soil reflectance composites. The results are stored in a bare soil data cube which is then used as a basis for soil monitoring. By combining this information with other environmental variables, such as climate and topography, the data cube approach can provide a basis to map soil dynamics over time. The bare soil cube is tested by predicting the SOC trends of cropland soils in Germany. Samples from the LUCAS campaigns, as well as from the Agricultural Soil Inventory in Germany (BZE-LW), are used to train and validate the models. The proposed approach has the potential to greatly improve the prediction of dynamic soil properties and to inform better management practices for sustainable land use, climate protection, and policymaking. Furthermore, the use of bare soil data cubes allows for efficient storage, retrieval, and analysis of large amounts of analysis-ready data, making it a practical and scalable approach for soil monitoring at regional or global scales.
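A minimal sketch of the kind of per-pixel bare-soil screening and compositing underlying such products is given below; the index choices and threshold values are illustrative assumptions, not the dynamic thresholds used in this work.

```python
import numpy as np

def bare_soil_mask(red, nir, swir1, swir2, ndvi_max=0.25, nbr2_max=0.08):
    """Flag pixels as bare soil using NDVI and NBR2 thresholds (illustrative values).

    Low NDVI screens out green vegetation; low NBR2 helps separate bare soil
    from dry crop residues. Inputs are surface reflectance arrays of equal shape.
    """
    ndvi = (nir - red) / (nir + red + 1e-9)
    nbr2 = (swir1 - swir2) / (swir1 + swir2 + 1e-9)
    return (ndvi < ndvi_max) & (nbr2 < nbr2_max)

def soil_composite(reflectance_stack, mask_stack):
    """Per-pixel median reflectance over all acquisition dates flagged as bare soil."""
    masked = np.where(mask_stack, reflectance_stack, np.nan)
    return np.nanmedian(masked, axis=0)

# Toy usage: stack of 5 dates, 10x10 pixels, one reflectance band
rng = np.random.default_rng(0)
stack = rng.uniform(0.05, 0.4, size=(5, 10, 10))
mask = rng.random(size=(5, 10, 10)) > 0.5
print(soil_composite(stack, mask).shape)
```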
Accurate mapping and monitoring of land dynamics is critical for climate change mitigation, biodiversity conservation, and epidemic prevention. With increasing data availability and processing capacities, there is a growing desire to move from periodic mapping of forest disturbances to continuous monitoring systems capable of providing timely information on forest disturbances to a variety of stakeholders. Many algorithms and approaches have been proposed in the research community to address this near real-time monitoring challenge. Their performance is typically demonstrated in case studies over test areas or simulated datasets. However, when it is available, the software provided with the research papers often offers only limited operational capacity. Individual software is often primarily developed to support the research experiment and consequently not optimized for speed or deployment at scale. In addition, implementations in different programming languages and the absence of a common interface to operate the algorithms make interoperability and comparison exercises challenging. Inspired by the great success of scikit-learn, which provides a standard interface to a large pool of optimized models and is widely acknowledged as the de-facto standard for machine learning in Python, we developed the nrt Python package. The package is designed for near real-time monitoring of disturbances in satellite image time-series. Five monitoring algorithms from the scientific literature on change detection (EWMA, CuSum, MoSum, CCDC, IQR) are implemented and exposed via a common API (Application Programming Interface). All the provided algorithms are optimized for fast computation thanks to the use of vectorized expressions or numba's JIT (Just In Time) compiler. Additionally, ongoing monitoring instances can be saved to disk and reloaded anytime, allowing for effortless operational deployment. The presentation will detail the functionalities of the package, the characteristics of the implemented algorithms, and illustrate its potential via a regional deployment covering several Sentinel-2 tiles. The supporting ecosystem of tools, composed of dashboards for interactive use, parameter selection, and exploration of generated alerts, will also be presented.
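To illustrate the kind of monitoring logic such packages expose, the following sketch implements a simple EWMA control chart on a single pixel's time series in plain NumPy; it is not the nrt API itself, and the smoothing factor and control-limit multiplier are illustrative values.

```python
import numpy as np

def ewma_monitor(history, new_values, lam=0.3, threshold=3.0):
    """Flag a disturbance when the EWMA of standardised residuals drifts past a control limit.

    history    : stable-period observations (e.g. cloud-free NDVI of one pixel)
    new_values : incoming observations to monitor
    """
    mu, sigma = np.nanmean(history), np.nanstd(history)
    ewma, flags = 0.0, []
    for x in new_values:
        residual = (x - mu) / sigma                    # standardised anomaly
        ewma = lam * residual + (1 - lam) * ewma
        limit = threshold * np.sqrt(lam / (2 - lam))   # asymptotic EWMA control limit
        flags.append(abs(ewma) > limit)
    return flags

history = np.random.default_rng(1).normal(0.8, 0.05, size=60)  # stable forest NDVI
incoming = [0.78, 0.75, 0.55, 0.45, 0.40]                       # sudden decline
print(ewma_monitor(history, incoming))
```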
The United Nations (UN) 2030 Agenda aims at promoting sustainable development at environmental, social and economic level. The definition of the Sustainable Development Goals (SDGs) and of the associated Global Indicator Framework represent a data-driven effort, helping countries in evidence-based decision-making and policies. SDG indicators’ monitoring and reporting across countries can benefit from substantial use of Earth Observation (EO), including satellite and in-situ networks, and of their processing through data analytics and numerical modeling approaches, making the 2030 Agenda implementation robust, viable and faster, both technically and financially.
This talk introduces SDGs-EYES, a major new European initiative aiming at boosting the European capacity for monitoring the UN SDGs. SDGs-EYES addresses current gaps in UN SDGs monitoring by exploiting data and information coming from the European Copernicus Programme and by providing a scientific and technological platform for building indicators through the integration of EO data, advanced numerical modeling, data analytics and Machine Learning approaches. Furthermore, SDGs-EYES aims to build a portfolio of decision-making products and services for the assessment and monitoring of SDG indicators whose trends could impact the environment and society from an inter-sectoral perspective, aligning with the EU Green Deal priorities and challenges.
The SDGs-EYES scientific approach and framework are introduced and described with particular reference to three interconnected SDGs, specifically on climate (SDG13), ocean (SDG14) and land (SDG15). These SDGs are mostly focused on the biosphere as foundation of prosperity, development and co-benefits in the society and economy, but also relevant due to their nexus with additional SDGs, targets and indicators related to socio-economic and (geo)political factors (e.g., human health, environmental crimes, water and food insecurity, poverty, conflicts, displacements, migrations).
Five Pilots (encompassing EU and extra-EU regions) will be used to demonstrate and validate the SDGs-EYES approach and results, that is application-oriented scientific products, technological solutions and user-tailored services.
The Open-Earth-Monitor project aims to maximize the impact and uptake of FAIR environmental data. In the framework of the stakeholder engagement strategy, an online survey on FAIR environmental data was implemented to get a comprehensive picture of whether the geospatial community is aware of FAIR data principles and what importance is attached to each principle. During the workshop, first results will be presented and discussed with stakeholders. To collect their expectations and requirements for FAIR environmental data, different perspectives from the geospatial community should be represented in order to consider divergent opinions from data users and data providers.
Furthermore, the participants will be informed about the FAIR principles and further principles within the open data movement, such as CARE (CARE Principles for Indigenous Data Governance) and TRUST (TRUST Principles for digital repositories). The CARE Principles go beyond FAIR to consider and protect the rights and interests of indigenous people and are therefore also of great importance in the geospatial data context. The TRUST principles introduce a framework to develop best practices for digital repositories to provide access to resources and enable users to rely on and manage the respective data. Many of these principles are thus relevant for the ten GEO Data Management Principles (GEO DMP), which were specifically designed for geospatial and environmental data. They define the data management requirements to facilitate and share Open Data promptly and at minimum cost. Good data management implies a number of activities to ensure that data are discoverable, accessible, understandable, usable and maintained.
This workshop will show the tools currently available at IIASA to generate reference data for training and validation of ML models and to assess the accuracy and performance of the output products that the different OEMC monitors are going to generate. We show how these tools can be used by experts as well as by the crowd or citizens. The tools presented will be Geo-Wiki, Picture Pile and a newly developed Google Street View in-situ tool. Examples will be shown of how these tools can be used for monitoring drivers of deforestation, improving forest management information and mapping crop types, drawing on previous projects and applications. We furthermore demo two deep dives on how citizen contributions, in particular through Picture Pile, can help to collect reference data for the crop monitor; some examples from the existing ESA project Crowd2Train will be featured. We also show how the near real-time forest disturbance monitor (RADD Alerts) can potentially be used in combination with Picture Pile to, on the one hand, increase the confidence in those alerts and, on the other hand, raise awareness about deforestation issues among the wider public.
Monitoring biodiversity in agricultural supply chains is a key metric for assessing progress on the EU's Biodiversity Strategy and the Farm to Fork Strategy. At least 10% of agricultural area should be composed of 'high-diversity landscape features': habitats such as treelines, hedgerows, semi-natural grassland, forests, and wetlands. However, most land cover mapping benchmark datasets, such as the recent OpenEarthMap, fail to distinguish between agricultural land (pasture, arable, forestry) and the semi-natural vegetation that counts towards the 10% target. Thus, there is a lack of high-quality labelled data to develop models to measure progress towards important policy goals.
I tested commercial high resolution (SPOT 6/7) and very high resolution (Pleiades) satellite images for their ability to pick up linear features in Irish farmland and to detect 10 land cover classes: Pasture, Semi-Natural Woodland, Conifer, Scrub, Hedgerow, Semi-Natural Grassland, Artificial, Bare Ground, Shadow and Other. A minimum resolution of 0.5 m was required to accurately detect the linear features common to Irish farmland. Due to a lack of high-quality masks that distinguish farmed from semi-natural vegetation, deep learning methods were unsuitable, so an object-based image analysis workflow was developed. The choice of segmentation parameters and number of segments was important to ensure that objects captured the shape of landscape features while minimising the speckle effect and the loss of resolution due to segment size. Cloudless Pleiades images for 40 farms distributed throughout the Republic of Ireland's biogeographic regions were obtained for the summer of 2022. Each image underwent segmentation and segments were labelled according to the 10 classes above, with ~1000 points per image to ensure data were obtained from each biogeographic region in Ireland, resulting in 80,000 data points for model development.
Various indices (NDVI, EVI 1-3, NDWI, GRVI, CVI, CCI, CIGreen) and textures (grey-level co-occurrence matrices and local binary patterns) applied to the indices and the panchromatic band were added as additional features. A model comparison procedure was carried out, optimising for balanced accuracy and comparing random forests, support vector machines, multi-layer perceptrons, kNN, and multi-class logistic regression. A minimum balanced accuracy of 80% was considered acceptable for monitoring purposes. Random forests performed best, with an out-of-sample balanced accuracy of 82%.
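A minimal sketch of such a model comparison with scikit-learn, optimising for balanced accuracy, could look as follows; the synthetic features and hyperparameters are placeholders rather than the study's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Placeholder stand-in for segment features (indices, textures) and the 10 classes
X, y = make_classification(n_samples=2000, n_features=30, n_informative=15,
                           n_classes=10, n_clusters_per_class=1, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=500, random_state=0),
    "svm": SVC(),
    "knn": KNeighborsClassifier(),
    "logistic": LogisticRegression(max_iter=2000),
}

# Compare with balanced accuracy, since class frequencies are highly uneven in practice
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="balanced_accuracy")
    print(f"{name}: {scores.mean():.2f}")
```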
The modelling framework and dataset can be used to monitor progress towards the Green Deal targets, and pilots monitoring the biodiversity in large dairy-processors with significant supply chains will be presented. Further improvements such as annual change detection and extending to other European countries will be discussed.
Climate change, environmental degradation and invasive species represent imminent threats to biodiversity. Decision makers urgently need accurate and reliable information about status, trends and impacts. To support this, data need to be presented in an actionable and understandable format, including measures of uncertainty.
The emerging challenge is to produce synthesised data products that can be used further by ecologists for purposes such as distribution modelling and risk mapping, and that can be combined with other environmental data, such as climate and land use data. Within the Group on Earth Observations Biodiversity Observation Network (GeoBON), it has been proposed to create aggregated biodiversity “data cubes” with taxonomic (what), spatial (where) and temporal (when) dimensions (Kissling et al. 2018). The Biodiversity Building Blocks for Policy project (B-Cubed) will generate biodiversity data cubes at the desired scale, automatically, as often as needed and with minimal manual intervention. These cubes will be made available and citable using the Global Biodiversity Information Facility (GBIF) infrastructure. Aside from the technological challenges, there is a conceptual challenge to solve: how to deal with the taxonomic, temporal and spatial uncertainty of biodiversity occurrence data?
Taxonomic uncertainty manifests itself in the form of synonymy. By trusting a taxonomy backbone such as the GBIF Taxonomy Backbone, this source of uncertainty is reduced. The temporal uncertainty is typically lower than the granularity used for aggregation (e.g. year) and can typically be neglected. On the contrary, the spatial uncertainty cannot be neglected.
Most commonly, occurrences are either collected in square grids of various dimensions or as points with an uncertainty radius (Bloom et al. 2018). Occurrences are therefore not defined as points, but as two-dimensional shapes, typically squares or circles. These rarely align with the geographic grid systems for which environmental and landscape data are available. A common solution is to upscale the data to a coarse grid. However, this inevitably reduces the spatial resolution of the data, which may result in a loss of accuracy when using the data to build indicators and models.
To account for spatial uncertainty, we developed within the Tracking Invasive Alien Species (TrIAS) project an algorithm (Oldoni et al. 2020) that randomly chooses a point within the square or the circle and assigns the occurrence to the spatial grid cell this point belongs to. This can, however, produce slightly different results with every round of generating occurrence cubes. By creating an ensemble of cubes, we can correctly propagate the uncertainty from the raw occurrence data to the calculation of summary statistics, such as the number of grid cells occupied by a species (observed occupancy). Using Monte Carlo simulations with synthetic data, we aim to determine the ensemble size, i.e. the minimum number of cubes needed to robustly infer the average observed occupancy and its uncertainty.
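A minimal sketch of the random assignment step and the ensemble idea (the grid resolution, uncertainty radii and ensemble size below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_point_in_circle(x, y, radius):
    """Draw a uniform random point inside an occurrence's uncertainty circle."""
    r = radius * np.sqrt(rng.random())
    theta = 2 * np.pi * rng.random()
    return x + r * np.cos(theta), y + r * np.sin(theta)

def occupancy(occurrences, cell_size=1000.0):
    """Number of grid cells occupied after one random assignment of all occurrences."""
    cells = {(int(px // cell_size), int(py // cell_size))
             for x, y, radius in occurrences
             for px, py in [random_point_in_circle(x, y, radius)]}
    return len(cells)

# Toy occurrences: (x, y, uncertainty radius), all in metres
occurrences = [(1500.0, 2500.0, 800.0), (1600.0, 2550.0, 1200.0), (9000.0, 1000.0, 300.0)]

# Ensemble of cubes: repeat the random assignment and summarise observed occupancy
ensemble = [occupancy(occurrences) for _ in range(100)]
print(np.mean(ensemble), np.std(ensemble))
```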
Strap yourself in and join Martijn and Luka on an epic journey through the EcoDataCube, an analysis-ready, totally open multidimensional spatiotemporal data cube covering most of Europe! After explaining how large amounts of 30 m and 10 m Earth observation data were aggregated, gap-filled, and used to create 20 annual land cover maps with 43 classes at 30 m resolution, they will show you how to access the 200+ cloud-optimized data sets yourself with your browser, GIS, and Python code.
Any leftover time will be used to discuss why no dataset is truly analysis-ready, how no map is perfect, and a collaborative attempt to create the ultimate land cover legend.
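For example, one of the cloud-optimized GeoTIFF layers can be read directly over HTTP with Python; the URL below is a placeholder rather than an actual EcoDataCube layer address, and the window bounds must be given in the dataset's CRS.

```python
import rasterio
from rasterio.windows import from_bounds

# Placeholder URL: substitute the address of an actual EcoDataCube COG layer
url = "https://example.org/ecodatacube/landcover_2020_30m.tif"

with rasterio.open(url) as src:
    # Read only a small window instead of downloading the whole continental mosaic;
    # the bounds (left, bottom, right, top) must be expressed in src.crs
    window = from_bounds(4_500_000, 2_900_000, 4_510_000, 2_910_000, transform=src.transform)
    tile = src.read(1, window=window)
    print(src.crs, tile.shape)
```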
Climate change heavily impacts the management of natural parks and land reserves: the increase in temperature, the change in the rhythm of the seasons and other factors affect the balance of fauna and vegetation in the parks and the actions that park management entities must undertake to mitigate the negative effects. By monitoring the vegetation status over time it is possible to create a model of the interaction between the changed landscape and its users and to craft tools to support their management.
The talk will present the tools for the environmental management of natural parks developed by Fondazione Edmund Mach and Deda Next in the Highlander project (https://highlanderproject.eu/). The main focus of the talk will be on the practical nature of the use cases covered by the tools and on the attention to usability that has been put into their design.
The front-end of the tools is a simple HTML interface, spatially enabled with OpenLayers. The back-end is more diverse, depending on the use case, but it includes several data elaboration scripts, GeoServer as the map server (https://geoserver.org/) and the FROST implementation (https://github.com/FraunhoferIOSB/FROST-Server) of OGC’s SensorThings API standard on IoT time series data (http://docs.opengeospatial.org/is/15-078r6/15-078r6.html).
The use cases covered by the tools are the following:
- Mountain pasture monitoring: Remote sensing data are used to calculate changes in Spectral Vegetation Indices across different years or during the same mountain pasture season, providing useful information for more sustainable pasture management.
- Tree species classification and above-ground biomass prediction: Airborne remote sensing data and field data are combined to produce tree species and above-ground biomass maps, estimated for each individual tree crown.
- Physiological monitoring of trees: Real-time high-frequency measurements are provided at single-tree level by TreeTalker sensors. The data gathered (including leaf reflectance, trunk growth, water usage, soil and stem humidity, air temperature and plant stability) can be used to understand the real-time response of trees to climate.
- Forest windthrow detection and damage estimation: Windthrow maps are produced from high-resolution satellite images, using as a test event the Vaia storm that hit northeastern Italy at the end of October 2018 with wind gusts of up to 200 km/h.
- Grassland mowing detection: The detection of mowing frequency is based on time series analysis of vegetation indices derived from satellite imagery and provides an assessment at parcel level that can be compared with ground surveys.
- Bark beetle detection and forest stress monitoring: Many bark beetle species feed on weakened, dying or dead spruce, fir and hemlock; the massive amount of fallen trees caused by storm events therefore represents a high-risk condition for proliferation. This tool estimates the locations most affected by bark beetle proliferation, also providing a confidence level.
African forests are increasingly in decline as a result of land-use conversion due to human activities. However, a consistent and detailed characterization and mapping of the land-use changes that result in forest loss is not available at the spatio-temporal resolution and thematic level suitable for decision-making at local and regional scales; so far such maps have only been provided at coarser scales and restricted to humid forests. Here we present the first high-resolution (5 m) and continental-scale mapping of land use following deforestation in Africa, which covers an estimated 13.85% of the global forest area, including humid and dry forests. We use reference data for 15 different land-use types from 30 countries and implement an active learning framework to train a deep learning model for predicting land use following deforestation, with an F1-score of 84% for the whole of Africa. Our results show that the causes of forest loss vary by region. In general, small-scale cropland is the dominant driver of forest loss in Africa, with hotspots in Madagascar and the DRC. In addition, commodity crops such as cacao, oil palm, and rubber are the dominant drivers of forest loss in the humid forests of western and central Africa, forming an "arc of commodity crops" in that region. At the same time, hotspots for cashew increasingly dominate in the dry forests of both western and south-eastern Africa, while larger hotspots for large-scale croplands were found in Nigeria and Zambia. The increased expansion of cacao, cashew, oil palm, rubber, and large-scale croplands observed in the humid and dry forests of western and south-eastern Africa suggests these areas are vulnerable to future land-use changes driven by commodity crops, thus creating challenges for achieving zero-deforestation supply chains, supporting REDD+ initiatives, and meeting the Sustainable Development Goals.
The presenters will demonstrate how to use available future projections of climatic data across different climate change scenarios to forecast how Earth's surface will look in the future. The presentation will be equally balanced between theory and practice: the theoretical part will provide an overview of the Coupled Model Intercomparison Project, focusing on the models produced for IPCC AR5, and of spatiotemporal modeling of vegetation. The practical part will explain how to combine Earth Observation data and machine learning to produce maps of the major biomes for the future and how their distribution is expected to change according to the different climate change scenarios.
Central Europe experienced a series of droughts and heat waves between 2018 and 2020 which severely affected forest ecosystems. The canopy cover loss has been mapped for Germany by [1] using high spatial resolution optical images from the Sentinel-2 and Landsat-8 satellites. In this contribution we present the results of assessing deforestation with a complementary approach using Sentinel-1 C-band SAR data. We use Recurrence Quantification Analysis (RQA) to derive a change metric which takes the order of the time series into account [2]. This approach provides high resolution yearly forest loss maps based on a continuous data stream.
In addition to the scientific results, we showcase the processing pipeline on the European Open Science Cloud. The amount of high resolution Earth observation data processed in this study was too large to perform all analyses on local computers or even local cluster systems. To achieve high performance computation on out-of-memory datasets we developed the YAXArrays.jl package in the Julia programming language. YAXArrays.jl provides both an abstraction over chunked n-dimensional arrays with labelled axes and efficient multi-threaded and multi-process computation on these arrays.
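As a generic illustration of recurrence-based change metrics (not the exact metric of [2]), the recurrence rate of a single pixel's backscatter series can be computed as the fraction of time-step pairs whose values are closer than a threshold:

```python
import numpy as np

def recurrence_rate(series, eps):
    """Fraction of time-step pairs whose values lie within eps of each other.

    A disturbance (e.g. a clearing) introduces a step in the series, lowering
    the recurrence rate; this is the general idea behind recurrence-based change detection.
    """
    x = np.asarray(series, dtype=float)
    dist = np.abs(x[:, None] - x[None, :])   # pairwise distance matrix
    return float((dist < eps).mean())

stable = np.random.default_rng(0).normal(-8.0, 0.5, size=50)   # backscatter in dB
disturbed = np.concatenate([stable[:25], stable[25:] - 4.0])    # step change after clearing
print(recurrence_rate(stable, 0.5), recurrence_rate(disturbed, 0.5))
```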
References:
[1] Thonfeld, F.; Gessner, U.; Holzwarth, S.; Kriese, J.; da Ponte, E.; Huth, J.; Kuenzer, C. A First Assessment of Canopy Cover Loss in Germany's Forests after the 2018-2020 Drought Years. Remote Sensing, 2022, 14, 562. https://doi.org/10.3390/rs14030562
[2] Cremer, F.; Urbazaev, M.; Cortés, J.; Truckenbrodt, J.; Schmullius, C.; Thiel, C. Potential of Recurrence Metrics from Sentinel-1 Time Series for Deforestation Mapping. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2020, 13, 5233-5240. https://doi.org/10.1109/JSTARS.2020.3019333
The OpenClimate Network is an open source nested accounting platform that allows users to navigate the emissions inventories and climate pledges of different actors at every level, aggregating data from various public sources for countries, regions, cities and companies. Through this aggregation, it enables comparison of how different data sources report the emissions of a given actor, by harmonizing the way data are reported and identifying the different methodologies used.
Additionally, by nesting actors into their respective jurisdictions it facilitates the comparison between the pledges these actors have committed to, showing whether they are aligned towards the same climate targets and how these compare to the goals of the Paris Agreement.
By aggregating data and exploring it in this nested manner, it also allows for the effective identification of data gaps for these actors, suggesting where efforts are needed to identify existing data sources or help produce new inventories. When data gaps are identified, the platform also prompts users to contribute data based on the open and standardized data model used to aggregate emissions and pledges data.
Spatial data can be a key component in tackling double counting, building subnational emissions inventories and accounting for corporate emissions.
Land use monitoring using machine learning and Earth observation data is usually challenging due to the lack of training samples, especially for large areas and long periods where gathering in-situ information is costly or sometimes impossible. This work proposes a machine learning approach called Time-Weighted Dynamic Time Warping (TWDTW) for data-scarce applications. TWDTW is a satellite image time series classification algorithm that uses a Dynamic Time Warping (DTW) distance. DTW is a widely used algorithm in various fields, including speech recognition, medicine, industry, and finance, and has shown promising results in land use mapping due to its ability to deal with gaps in time series, its robustness to noise, its capacity to match time series of different lengths and intervals, and its ability to maintain classification performance on small training sets.
However, DTW has limitations in matching events regardless of when they occur, which can result in out-of-season alignments and misclassifications—for example, aligning a summer crop to a winter one. TWDTW overcomes this limitation by introducing a time weight to matches deviating from an expected date in the training set. This temporal constraint improves classification performance by controlling for out-of-season alignments while keeping DTW's flexibility to smaller phenological fluctuations of vegetation.
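A compact sketch of this core idea, adding a logistic time weight to the local DTW cost, is given below; it is a simplified Python illustration rather than the dtwSat implementation, and the alpha and beta parameter values are illustrative.

```python
import numpy as np

def twdtw_distance(pattern, series, pattern_doy, series_doy, alpha=0.1, beta=50.0):
    """Time-Weighted DTW distance between a temporal pattern and an observed time series.

    The local cost combines the spectral difference with a logistic weight that
    penalises matches whose acquisition dates (day of year) are far apart.
    """
    n, m = len(pattern), len(series)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # circular difference in days between the two acquisition dates
            g = abs(pattern_doy[i - 1] - series_doy[j - 1])
            g = min(g, 365 - g)
            time_weight = 1.0 / (1.0 + np.exp(-alpha * (g - beta)))
            d = abs(pattern[i - 1] - series[j - 1]) + time_weight
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

# Toy NDVI pattern for a summer crop vs. an observed series
pattern = np.array([0.2, 0.5, 0.8, 0.6, 0.3])
pattern_doy = np.array([120, 150, 180, 210, 240])
series = np.array([0.25, 0.55, 0.75, 0.55, 0.25])
series_doy = np.array([125, 155, 185, 215, 245])
print(twdtw_distance(pattern, series, pattern_doy, series_doy))
```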
This presentation will demonstrate the effectiveness of the TWDTW method for land use classification using the open-source R package dtwSat. Overall, this machine learning method is suitable for data-scarce regions and can contribute to land use monitoring, supporting the environmental targets proposed by the European Green Deal and the United Nations' Sustainable Development Goals.
As we enter a period of unprecedented global environmental crises, the importance of environmental research has never been more evident. The new keywords of our time are increasingly worrying: rapid change, adaptation, resilience, tipping points, collapse.
Managing the transition of our societies to new global climates is a major challenge for decision-makers at national and international levels. How to design, implement and monitor effective policies that are economically and socially sustainable in the short term, and environmentally effective in the long term? Over the past centuries, science has provided the knowledge that has led to the current environmental crises (e.g. from the industrial revolution and the use of fossil fuels to the widespread use of chemicals). Today, science is the only way to support a knowledge-based transformation of our energy systems, economy and society in an environmentally and climate-sustainable way.
Promoting knowledge-based transformations is a road full of obstacles that more open science, FAIR data, shared models and interdisciplinary collaboration will help to overcome.
To name a few: the intrinsic complexity of the Earth system, the uncertainty of predictions due to the interconnectedness of the system and the non-linearity of its responses, the mismatch between the time scales of actions and effects, and, on the social side, the conflicts and polarisation fostered by the media.
For each of these complexities that limit our ability to deal with environmental crises, to mitigate them, and to understand their impacts, ideas and examples will be presented on how Open Data and Open Science are contributing, and can contribute further.