Open Earth Monitor — Global Workshop 2024

Exploring bitfields for spatially explicit metadata processing and reuse
2024-10-02, 15:30–16:00, Raiffa Room (IIASA)

Computational workflows in the earth system sciences are becoming increasingly sophisticated, integrating data of different types and sources into large-scale, modelled data products. This is partly a consequence of a competition-driven diversification of tools and approaches, with the desirable side effect that we learn more about the earth's spheres from more distinct perspectives. Ideally, sophisticated and complex workflows map the intricate interaction networks on our planet with less ambiguity. In reality, however, practical considerations or a lack of resources or time in our projects force non-ideal decisions, and how these decisions affect the results often remains to be clarified. We try to quantify the errors of our output, and software engineering uses so-called unit tests, in which the output of the "smallest units" of code is compared against expected results.

While error reporting (of the output) is part of best practice in the earth sciences, analysis of intermediate data typically happens within a project and is rarely reported. Provenance data are sometimes collected to document the workflow, but these data are even less frequently reusable because they serve the purpose of archiving. Each project has a computational footprint, and much wisdom can be found in computational workflows. Intermediate data of one project are often the starting point of another project.

The bitfield R package tries to fill this gap in the toolchain. With its tools, one can produce (simple) tests that document binary responses, cases or numeric values in sequences of bits encoded as integers. The resulting data could be called meta-analytic or meta-algorithmic data because they allow documenting (and re-using) an analysis or algorithm spatially explicitly. Depending on the documentation detail, this could approximate provenance graphs or at least make "snapshots" of a workflow available as intermediate data. The bitfield is a promising data structure, already employed in the MODIS quality flags, that allows a large amount of information to be stored in a single integer. In this workshop, you will learn how to use the tools in bitfield and get an introduction to the software logic, and we may discuss possible use cases and the future of this technology.
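
The idea of storing several test results in one integer can be sketched in a few lines of Python. This is only a minimal illustration under assumed conventions, not the interface of the bitfield R package: the bit layout, the function names pack_flags and unpack_flags, and the example values are all hypothetical.

```python
# Hypothetical bit layout, chosen only for illustration:
#   bit 0     : is the observation missing? (binary response)
#   bits 1-2  : land-cover case, one of 4 classes (cases)
#   bits 3-6  : quality score from 0 to 15 (small numeric value)


def pack_flags(is_missing: bool, cover_class: int, quality: int) -> int:
    """Pack three pieces of metadata into a single integer."""
    if not 0 <= cover_class < 4:
        raise ValueError("cover_class must fit in 2 bits (0-3)")
    if not 0 <= quality < 16:
        raise ValueError("quality must fit in 4 bits (0-15)")
    # Shift each value to its position and combine with bitwise OR.
    return int(is_missing) | (cover_class << 1) | (quality << 3)


def unpack_flags(value: int) -> dict:
    """Recover the three pieces of metadata from the packed integer."""
    return {
        "is_missing": bool(value & 0b1),
        "cover_class": (value >> 1) & 0b11,
        "quality": (value >> 3) & 0b1111,
    }


# One integer per grid cell keeps the metadata spatially explicit.
cells = [
    [pack_flags(False, 2, 9), pack_flags(True, 0, 0)],
    [pack_flags(False, 1, 15), pack_flags(False, 3, 4)],
]
decoded = [[unpack_flags(v) for v in row] for row in cells]
```

Storing one such integer per grid cell gives a raster of flags that can be shipped alongside the data product and decoded later, provided the bit layout is documented together with it.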


What are your current associations with EU Horizon projects (if any)?

Other

Please provide the URL that you plan to use to distribute your materials (if available).

https://github.com/EhrmannS/bitfield

I am a landscape ecologist working at the German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig and at the International Institute for Applied Systems Analysis (IIASA), contributing to the Global Pasture Watch (GPW) project and designing the LUCKINet computational workflow.
