OpenGeoHub Summer School 2023

Tidy geographic data with sf, dplyr, ggplot2, geos and friends (part 1)
2023-08-28, 11:00–12:30, Room 21 (Sala 21)

This lecture will provide an introduction to working with geographic data using R in a ‘tidy’ way. It will focus on using the sf package to read, write, manipulate, and plot geographic data in combination with the tidyverse metapackage. Why use the sf package with the tidyverse? The lecture will outline some of the ideas underlying the tidyverse and how they can speed up data analysis pipelines while making data analysis code easier to read and write. We will see how the following lines

library(sf)
library(tidyverse)

can provide a foundation on which many geographic data analysis problems can be solved. The lecture will also cover more recently developed packages that integrate with the tidyverse to a greater or lesser extent. We will look at how the geos package, which provides a simple and high-performance interface to the GEOS library for performing geometric operations on geographic data, integrates with the tidyverse. The tidyverse is not the right tool for every data analysis task, so we will touch on alternatives for working with raster data, with reference to the terra package, and on alternative frameworks such as data.table. Finally, we will look at how the ‘tidy’ philosophy could be implemented in other programming languages, such as Python.
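
As a minimal sketch of the kind of pipeline the two library calls above make possible (not taken from the lecture materials), the example below reads the North Carolina counties shapefile that ships with sf and passes it through a few dplyr verbs. The threshold of 10,000 births and the column name sids_rate are illustrative assumptions; dplyr is loaded on its own here, although library(tidyverse) would attach it too.

```r
library(sf)
library(dplyr)

# Read the example North Carolina counties dataset bundled with sf
nc = read_sf(system.file("shape/nc.shp", package = "sf"))

# dplyr verbs work on sf objects; the geometry column is kept automatically
nc_sids = nc |>
  filter(BIR74 > 10000) |>                     # illustrative threshold (assumption)
  mutate(sids_rate = 1000 * SID74 / BIR74) |>  # SIDS cases per 1000 births, 1974
  select(NAME, sids_rate) |>
  arrange(desc(sids_rate))

# Quick map of the derived variable using sf's plot method
plot(nc_sids["sids_rate"])
```

Because the result is still an sf object, it can be passed directly to further spatial functions or to ggplot2's geom_sf().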

The focus throughout will be on practical skills and using packages effectively within the wider context of project management tools, integrated development environments (we recommend VS Code with appropriate extensions or RStudio), and version control systems.


What are your current associations with EU Horizon projects (if any)? Please provide the URL that you plan to use to distribute your materials (if available).

https://github.com/robinlovelace/opengeohub2023

Robin Lovelace is Associate Professor of Transport Data Science at the Leeds Institute for Transport Studies (ITS) and Head of Data at the government agency Active Travel England. Robin specializes in geocomputation with a focus on developing geographic methods applied to modeling transport systems, active travel, and decarbonisation. Robin has experience not only researching but also deploying transport models to inform sustainable policies and more effective use of transport investment, including as Lead Developer of the Propensity to Cycle Tool (see www.pct.bike), the basis of strategic cycle network plans nationwide. Robin has led numerous data science projects for organizations ranging from the Department for Transport to the World Bank.

Robin is the author of popular open source software, including the R packages stplanr, stats19, and abstr. He has authored three reproducible and open source textbooks: Microsimulation with R, Efficient R Programming, and Geocomputation with R.
