Open Earth Monitor — Global Workshop 2024

Steffen Ehrmann

I am a landscape ecologist working at the German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig and at the International Institute for Applied Systems Analysis (IIASA), contributing to the Global Pasture Watch (GPW) project and designing the LUCKINet computational workflow.

Do you accept that a video-recording of your talk is published under CC-BY license via https://av.tib.eu? – yes

Sessions

10-02
15:30
30min
Exploring bitfields for spatially explicit metadata processing and reuse
Steffen Ehrmann

Computational workflows in the earth system sciences are becoming increasingly sophisticated, integrating data of different types and sources into large-scale, modelled data products. This is partly a consequence of a competition-driven diversification of tools and approaches, with the desirable side effect that we learn more about the earth's spheres from more distinct perspectives. Ideally, sophisticated and complex workflows are better at mapping the sophisticated interaction networks on our planet with less ambiguity. However, the reality is that practical considerations or a lack of resources or time in our projects demand non-ideal decisions, and how these affect results often remains unclear. We try to quantify the errors of our output, and software engineering uses so-called unit tests, where the output of the "smallest units" of code is compared against expected results.

While error reporting (of the output) is part of best practice in the earth sciences, analysis of intermediate data typically happens project-internally and is rarely reported. Provenance data are sometimes collected to document the workflow, but these data are reused even less frequently because they mainly serve the purpose of archiving. Each project has a computational footprint, and much wisdom can be found in computational workflows. Intermediate data of one project are often the starting point of another project.

The bitfield R-package tries to fill this gap in the toolchain. With its tools, one can produce (simple) tests that document binary responses, cases or numeric values in sequences of bits encoded as integers. The resulting data could be called meta-analytic or meta-algorithmic data because they allow documenting (and re-using) an analysis or algorithm spatially explicitly. Depending on the documentation detail, this could approximate provenance graphs or at least make "snapshots" of a workflow available as intermediate data. The bitfield is a promising data structure, already employed in the MODIS quality flag, that allows a large amount of information to be stored in a single integer. In this workshop, you will learn how to use the tools in bitfield, get an introduction to the software logic, and we may discuss possible use cases and the future of this technology.
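To illustrate the general idea of storing a binary response, a case and a numeric value in one integer, here is a minimal sketch in Python. It does not use or reproduce the API of the bitfield R-package; the bit layout and the names `encode`, `decode`, `missing`, `case` and `level` are hypothetical and chosen only for this example.

```python
# Hypothetical bit layout for one grid cell (an assumption for illustration,
# not the encoding used by the bitfield R-package or by MODIS):
#   bit 0     : binary response, e.g. "input value was missing"
#   bits 1-2  : case, one of four categories (0..3)
#   bits 3-6  : numeric value quantised to 16 levels (0..15)

def encode(missing: bool, case: int, level: int) -> int:
    """Pack three pieces of per-cell metadata into a single integer."""
    if not 0 <= case <= 3:
        raise ValueError("case must fit into 2 bits (0..3)")
    if not 0 <= level <= 15:
        raise ValueError("level must fit into 4 bits (0..15)")
    return int(missing) | (case << 1) | (level << 3)

def decode(value: int) -> dict:
    """Unpack the integer into its three components again."""
    return {
        "missing": bool(value & 0b1),
        "case": (value >> 1) & 0b11,
        "level": (value >> 3) & 0b1111,
    }

cell = encode(missing=False, case=2, level=9)
print(cell, bin(cell))  # 76 0b1001100
print(decode(cell))     # {'missing': False, 'case': 2, 'level': 9}
```

Applied to every cell of a raster, such integers would form one additional layer that travels with the data product, and anyone who knows the bit layout can recover the individual flags.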

Raiffa Room (IIASA)
10-03
13:50
20min
Determining the socio-environmental niche for agricultural land suitability assessment
Steffen Ehrmann

Given that food and shelter are basic human needs, it is not surprising that much scientific focus goes into understanding and modelling plant production systems for nutrition and industrial needs. It is crucial to understand where and when specific crops are most suitable to grow now and in the (near-term) future under risk scenarios [1, 2].

Agricultural land suitability assessment models [3] are based on complex biophysical and human-nature interactions, and traditional approaches based on mechanistic plant growth models may be insufficient to map actual patterns because they miss socio-economic processes [4]. Farmers need to follow decision processes that result in their continued survival in the system [5], which can often lead to "the most suitable" location not being selected. While it is evident that access to technology and to the resources consumed by crops shapes land-use patterns, farmers must also follow market and regulatory considerations, which are notoriously hard to model spatially explicitly.

We have devised a modelling framework to build a spatiotemporal series of crop suitabilities annually between 2000 and 2020 at a resolution of 1 km², addressing the circa 170 FAOSTAT crop types. We inform our modelling framework with a wide range of datasets indicating the occurrence of crops (in-situ occurrences, areal sub-national census data, parcel-level polygons, etc.) matched with an advanced, globally harmonised, hierarchical crop-type ontology. We then complement this knowledge graph with mechanistic crop growth information to apply an environmental filtering approach and impute a range of additional indications of crop presence and absence. Finally, we use bio-physical and socio-economic predictors of crop production in a deliberate multi-label random forest model, resulting in the realised socio-environmental niche of crops.

Maria Theresia Seminar room (Conference Center Laxenburg)