RStudio project
Open the RStudio project that we created in the previous session. I recommend to use this RStudio project for the entire module and within the RStudio project create separate R scripts for each session.
# Session 1: Spatial data in R
and save the file in your
folder “scripts” within your project folder, e.g. as
“1_SpatialData.R”Nowadays, R offers a lot of GIS functionalities in different packages.
We will use a number of such GIS packages in R. Remember that we can
install new packages with the function install.packages()
.
Make sure also to install the dependencies.
install.packages('terra, dep=T)
# load required package
library(terra)
Spatial data are typically stored as either vector data or raster data (With 2019). Discrete objects with clear boundaries are usually represented by vector data (e.g. individual trees, roads, countries) while continuous phenomena without clear boundaries are usually represented by raster data (e.g. elevation, temperature).
Vector data can be mapped a points, lines, and polygons:
By contrast, Raster data represent the landscape as a regularly spaced grid and are used to represent data that vary continuously across space such as elevation, temperature or NDVI. Raster cells can contain numeric values (e.g. elevation) or categorical values (e.g. land cover type). The coordinate information is stored differently from vector data because storing the coordinates for each grid cell in the raster would use too much storage space. To define a raster, we only need to know the coordinates of one corner, the spatial extent and the spatial resolution to infer the coordinates of each cell.
The package terra
provides a set of classes and methods
to carry out spatial data analysis with raster and vector data. For
illustration, we will start by creating a few SpatVector
objects.
For more detailed tutorials see https://rspatial.org/terra/.
For illustration, we will generate some random points data and
convert them into Spat*
objects.
# We set a seed for the random number generator, so that we all obtain the same results
set.seed(12345)
coords <- cbind(
x <- rnorm(50, sd=2),
y <- rnorm(50, sd=2)
)
# sort coords data along x coordinates
coords <- coords[order(coords[,1]),]
# look at data structure
str(coords)
## num [1:50, 1:2] -4.76 -3.64 -3.32 -3.2 -3.11 ...
# plot the coordinates
plot(coords)
We now convert the random points into a SpatVector
object and inspect it.
# Convert into SpatVector
sp <- terra::vect(coords)
# Check out the object
class(sp)
## [1] "SpatVector"
## attr(,"package")
## [1] "terra"
# And the entries:
sp
## class : SpatVector
## geometry : points
## dimensions : 50, 0 (geometries, attributes)
## extent : -4.760716, 4.393667, -4.693888, 4.954222 (xmin, xmax, ymin, ymax)
## coord. ref. :
geom(sp)
## geom part x y hole
## [1,] 1 1 -4.7607161 -1.7101650 0
## [2,] 2 1 -3.6359119 0.5559074 0
## [3,] 3 1 -3.3241005 -0.6159068 0
## [4,] 4 1 -3.1954190 2.8054108 0
## [5,] 5 1 -3.1062748 4.6610239 0
## [6,] 6 1 -2.8261976 -1.0100870 0
## [7,] 7 1 -2.6135977 -1.3890934 0
## [8,] 8 1 -2.1205311 3.7738939 0
## [9,] 9 1 -1.8386440 -4.6938880 0
## [10,] 10 1 -1.7727150 1.7762789 0
## [11,] 11 1 -1.5010640 -1.1737592 0
## [12,] 12 1 -1.2886569 -2.0987056 0
## [13,] 13 1 -0.9632947 1.6525166 0
## [14,] 14 1 -0.9069943 0.7033257 0
## [15,] 15 1 -0.6631552 3.1869769 0
## [16,] 16 1 -0.6481732 1.3440849 0
## [17,] 17 1 -0.5683195 4.2901300 0
## [18,] 18 1 -0.5523682 1.6475907 0
## [19,] 19 1 -0.3246220 2.0425168 0
## [20,] 20 1 -0.2324956 0.2991840 0
## [21,] 21 1 -0.2186066 0.1071805 0
## [22,] 22 1 0.0516021 1.6497401 0
## [23,] 23 1 0.5085424 1.9424413 0
## [24,] 24 1 0.5974474 -2.5913434 0
## [25,] 25 1 0.7412557 1.1066062 0
## [26,] 26 1 0.9823766 3.7341984 0
## [27,] 27 1 1.0404329 3.1799257 0
## [28,] 28 1 1.1348065 4.3154396 0
## [29,] 29 1 1.1663753 -1.1995951 0
## [30,] 30 1 1.1710576 -1.0807721 0
## [31,] 31 1 1.2117749 -1.3419531 0
## [32,] 32 1 1.2242470 0.9524966 0
## [33,] 33 1 1.2407596 -1.6230810 0
## [34,] 34 1 1.2601971 1.3823425 0
## [35,] 35 1 1.4189320 3.8953853 0
## [36,] 36 1 1.5592438 0.1092312 0
## [37,] 37 1 1.6237464 1.2907661 0
## [38,] 38 1 1.6337997 -3.6647546 0
## [39,] 39 1 1.7089034 -1.9612659 0
## [40,] 40 1 1.8742811 -0.7836387 0
## [41,] 41 1 2.2414253 1.0337093 0
## [42,] 42 1 2.2570217 -1.9278030 0
## [43,] 43 1 2.9115702 -1.5692987 0
## [44,] 44 1 2.9214588 1.3746642 0
## [45,] 45 1 3.2648913 4.9542218 0
## [46,] 46 1 3.5354677 1.0730474 0
## [47,] 47 1 3.6101950 1.8852017 0
## [48,] 48 1 3.6346241 -2.6850630 0
## [49,] 49 1 4.0983807 -0.6087382 0
## [50,] 50 1 4.3936671 2.0862871 0
We see that it is a vector data object with point geometries. An
extent (the bounding box) was created automatically from the provided
coordinates. The coord. ref.
value is empty as we haven’t
provided a coordinate reference system (“CRS”) yet. We can explicitly
add a CRS when creating the SpatVector
object:
sp <- terra::vect(coords, crs = '+proj=longlat +datum=WGS84')
sp
## class : SpatVector
## geometry : points
## dimensions : 50, 0 (geometries, attributes)
## extent : -4.760716, 4.393667, -4.693888, 4.954222 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs
At the moment the SpatVector
object only contains the
point coordinates without any additional thematic information (or
attributes) for each point. Let’s assume the spatial points are trees,
either of three species. The 15 western-most points are all birch trees,
the eastern trees are a mix of beach and oak.
# Create attribute table
data <- data.frame(ID=1:50,species= NA)
# Define the 15 western trees as birch
data[1:15, 'species'] <- 'birch'
# Randomly assign the species beach or oak to the remaining 35 trees
data[-c(1:15), 'species'] <- sample(c('beech','oak'),35,replace=T)
Now, we combine the SpatVector
object and the data frame
and store it in a new SpatVector
object.
# in terra, you can simply add the additional information to the existing Spatial Vector
sp_trees <- sp
terra:: values(sp_trees) <- data
# Inspect the new object
sp_trees
## class : SpatVector
## geometry : points
## dimensions : 50, 2 (geometries, attributes)
## extent : -4.760716, 4.393667, -4.693888, 4.954222 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs
## names : ID species
## type : <int> <chr>
## values : 1 birch
## 2 birch
## 3 birch
Similarly, we can also create SpatVector
of lines or
polygons. First, we want to create a line vector object from the oaks -
we could assume that it is an oak hedgerow.
# prepare a subset of point geometries for oaks
oak <- terra::subset(sp_trees, sp_trees$species == 'oak')
# extract oak coordinates
oak_coords <- terra::crds(oak)
# Create SpatVector lines through all oak trees in the data
oak_hedge <- terra::vect(oak_coords, type="lines", crs='+proj=longlat +datum=WGS84')
# Inspect the object
oak_hedge
## class : SpatVector
## geometry : lines
## dimensions : 1, 0 (geometries, attributes)
## extent : -0.6481732, 4.393667, -3.664755, 2.086287 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs
Next, we create a spatial polygon vector from the birch data - we could assume that it is a birch woodland patch.
# prepare a subset of point geometries for birch
birch <- terra::subset( sp_trees, sp_trees$species == 'birch')
# extract birch coordinates
birch_coords <- terra::crds(birch)
# Create a spatial polygon around all birch trees in the data by using a convex hull:
birch_hull <- chull(birch_coords)
# Create SpatVector polygons
birch_patch <- terra::vect(birch_coords[birch_hull,], type = "polygons", crs='+proj=longlat +datum=WGS84')
# inspect the object
birch_patch
## class : SpatVector
## geometry : polygons
## dimensions : 1, 0 (geometries, attributes)
## extent : -4.760716, -0.6631552, -4.693888, 4.661024 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs
Finally, let’s plot our spatial point, line and polygon data.
# We first plot the bird woodland patch but with the spatial extent of sp_trees (plus some buffer):
plot(birch_patch, border='blue',col='YellowGreen', ext=ext(sp_trees)*1.1, lwd=2)
# Add the oak hedge
plot(oak_hedge, col='ForestGreen', lwd=2, add=T)
# Add the trees
plot(sp_trees, cex=2, add=T)
Most often, vector data are stored as shapefiles. Download the zip folder containg IUCN range data of the Alpine Shrew to your data folder and unzip it. You will see that several files are contained in this folder. All of these are necessary parts of the shapefile and contain the different information on geometry and attributes.
We use the terra
package to read in the data:
(shrew <- terra::vect('data/IUCN_Sorex_alpinus.shp'))
## class : SpatVector
## geometry : polygons
## dimensions : 1, 27 (geometries, attributes)
## extent : 5.733728, 26.67935, 42.20601, 51.89984 (xmin, xmax, ymin, ymax)
## source : IUCN_Sorex_alpinus.shp
## coord. ref. : lon/lat WGS 84 (EPSG:4326)
## names : id_no binomial presence origin seasonal compiler yrcompiled
## type : <chr> <chr> <int> <int> <int> <chr> <int>
## values : 29660 Sorex alpinus 1 1 1 IUCN 2008
## citation source dist_comm (and 17 more)
## <chr> <chr> <chr>
## IUCN (Internat~ NA NA
# Plot Central Europe
library(maps)
map('world',xlim=c(5,30), ylim=c(40,55))
# Overlay the range of the Alpine Shrew
plot(shrew, col='red', add=T)
We will use the terra
package to represent and analyse
raster data in R. The package contains raster data class
SpatRaster
, which can contain only a single layer of raster
information or multiple layers (from separate files or from a single
multi-layer file, respectively).
The function rast()
can be used to create or read in
SpatRaster
objects.
# Create a SpatRaster object
(r1 <- terra::rast(ncol = 10, nrow = 10, xmax = -80, xmin = -150, ymax = 60, ymin = 20 ))
## class : SpatRaster
## dimensions : 10, 10, 1 (nrow, ncol, nlyr)
## resolution : 7, 4 (x, y)
## extent : -150, -80, 20, 60 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84
We can access the attributes of each grid cell by using the function
values()
. Obviously, there are no values yet in the
RasterLayer
object and thus we assign some randomly.
# Inspect value in SpatRaster:
summary(terra::values(r1))
## lyr.1
## Min. : NA
## 1st Qu.: NA
## Median : NA
## Mean :NaN
## 3rd Qu.: NA
## Max. : NA
## NA's :100
# Assign values (randomly drawn from normal distribution) to the SpatRaster object:
terra::values(r1) <- rnorm(ncell(r1))
# plot the raster
plot(r1)
We can easily create a multi-layer SpatRaster
object
using the c()
(concatenate) function.
# Create another RasterLayer and assign values
r2 <- r1
terra::values(r2) <- 1:ncell(r2)
# Create multi-layer SpatRaster object
(r <- c(r1, r2))
## class : SpatRaster
## dimensions : 10, 10, 2 (nrow, ncol, nlyr)
## resolution : 7, 4 (x, y)
## extent : -150, -80, 20, 60 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84
## source(s) : memory
## names : lyr.1, lyr.1
## min values : -2.156362, 1
## max values : 2.593773, 100
names(r) <- c('Layer 1', 'Layer 2')
plot(r)
In most cases, we will read in raster data from file. For this we can
use the same commands as above, rast()
for reading in both
single-layer and multi-layer raster files.
First, download the temperature map for UK here to your data folder and read it in. The raster layer represents the mean annual in temperature [°C * 10].
# Read in SpatRaster object from file:
(temp <- terra::rast('data/UK_temp.tif'))
## class : SpatRaster
## dimensions : 122, 66, 1 (nrow, ncol, nlyr)
## resolution : 10000, 10000 (x, y)
## extent : 0, 660000, 0, 1220000 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +units=m +no_defs
## source : UK_temp.tif
## name : UK_temp
## min value : 35.71785
## max value : 112.00000
plot(temp)
Second, download the GeoTiff file containing multi-layer raster data for 19 bioclimatic variables of UK, and place it in your data folder. The bioclimatic variables are explained herehttps://www.worldclim.org/data/bioclim.html.
# Reading multi-layer SpatRaster data from file uses the same command as for single-layer data:
(bioclim <- terra::rast('data/UK_bioclim.tif'))
## class : SpatRaster
## dimensions : 122, 66, 19 (nrow, ncol, nlyr)
## resolution : 10000, 10000 (x, y)
## extent : 0, 660000, 0, 1220000 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +units=m +no_defs
## source : UK_bioclim.tif
## names : bio1, bio2, bio3, bio4, bio5, bio6, ...
## min values : 35.71785, 41.00000, 30, 3060.245, 134.0000, -55.48918, ...
## max values : 112.00000, 82.64219, 40, 5321.534, 231.5805, 43.00000, ...
plot(bioclim)
The geodata
package is offering direct access to some
standard repositories. Check them out in the help page
?geodata
. The package provides access, for example, to
elevation data, data on the global administrative boundaries (gadm) as
well as current and future climates from WorldClim (Hijmans et al. 2005).
## class : SpatRaster
## dimensions : 1080, 2160, 19 (nrow, ncol, nlyr)
## resolution : 0.1666667, 0.1666667 (x, y)
## extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (EPSG:4326)
## sources : wc2.1_10m_bio_1.tif
## wc2.1_10m_bio_2.tif
## wc2.1_10m_bio_3.tif
## ... and 16 more source(s)
## names : wc2.1~bio_1, wc2.1~bio_2, wc2.1~bio_3, wc2.1~bio_4, wc2.1~bio_5, wc2.1~bio_6, ...
## min values : -54.72435, 1.00000, 9.131122, 0.000, -29.68600, -72.50025, ...
## max values : 30.98764, 21.14754, 100.000000, 2363.846, 48.08275, 26.30000, ...
library(geodata)
# Download global bioclimatic data from worldclim (you may have to set argument 'download=T' for first download, if 'download=F' it will attempt to read from file):
(clim <- geodata::worldclim_global(var = 'bio', res = 10, download = F, path = 'data'))
plot(clim)
The terra
package offers functionality to manipulate the
spatial data, for example aggregating the data to coarser resolutions
(aggregate()
), or cropping to a specific extent
(crop()
).
# Rough coordinates of European extent
ext_eur <- c(-15,45,35,72)
# Crop the temperature layer (bio1) to roughly European extent
temp_eur <- terra::crop(clim[[1]], ext_eur)
# Aggregate to one-degree and two-degree resolution (from currently 10-min resolution)
temp_eur_onedeg <- terra::aggregate(temp_eur, fact=6)
temp_eur_twodeg <- terra::aggregate(temp_eur, fact=12)
# Plot the resultin maps
par(mfrow=c(1,3))
plot(temp_eur)
plot(temp_eur_onedeg)
plot(temp_eur_twodeg)
# Test other options like the na.rm argument:
# Aggregate to one-degree and two-degree resolution (from currently 10-min resolution)
temp_eur_onedeg <- terra::aggregate(temp_eur, fact=6, na.rm=T)
temp_eur_twodeg <- terra::aggregate(temp_eur, fact=12, na.rm=T)
# Plot the resultin maps
par(mfrow=c(1,3))
plot(temp_eur)
plot(temp_eur_onedeg)
plot(temp_eur_twodeg)
There are different ways for extracting information from raster
layers. We have already worked with values()
. If we have
coordinate data, we can use these coordinates to “pierce” through raster
layers. That’s one of the easiest ways to extract relevant environmental
data for specific locations. Let’s first extract the climate data for
Potsdam.
# Potsdam coordinates - we need these as two-column matrix (or data.frame or SpatVector)
lonlat_Potsdam <- matrix( c(13.063561, 52.391842), ncol=2)
# Extract temperature values for Potsdam:
terra::extract(temp_eur, lonlat_Potsdam)
## wc2.1_10m_bio_1
## 1 9.229239
# Extract climate values for Potsdam:
terra::extract(clim, lonlat_Potsdam)
## wc2.1_10m_bio_1 wc2.1_10m_bio_2 wc2.1_10m_bio_3 wc2.1_10m_bio_4
## 1 9.229239 8.012188 30.85883 687.4697
## wc2.1_10m_bio_5 wc2.1_10m_bio_6 wc2.1_10m_bio_7 wc2.1_10m_bio_8
## 1 23.6355 -2.3285 25.964 17.79533
## wc2.1_10m_bio_9 wc2.1_10m_bio_10 wc2.1_10m_bio_11 wc2.1_10m_bio_12
## 1 4.618125 17.79533 0.94125 546
## wc2.1_10m_bio_13 wc2.1_10m_bio_14 wc2.1_10m_bio_15 wc2.1_10m_bio_16
## 1 65 34 20.14255 171
## wc2.1_10m_bio_17 wc2.1_10m_bio_18 wc2.1_10m_bio_19
## 1 111 171 127
For further illustration, we want to extract raster data for several locations, specifically for different European capitals.
capitals <- data.frame(lon=c(2.333333, -3.683333, -0.083333, 12.583333, 10.75, 21, 23.316667, 24.1, 0),
lat=c(48.86666667, 40.4, 51.5, 55.66666667, 59.91666667, 52.25, 42.68333333, 56.95, 0),
capital=c('Paris', 'Madrid', 'London', 'Copenhagen', 'Oslo', 'Warsaw', 'Sofia', 'Riga','Oops'))
# Map temperature and locations
plot(temp_eur)
points(capitals[,1:2], cex=2, pch=19)
# Extract temperature values at these locations
terra::extract(temp_eur, capitals[,1:2])
## ID wc2.1_10m_bio_1
## 1 1 11.604875
## 2 2 14.344760
## 3 3 10.485146
## 4 4 8.415707
## 5 5 5.774378
## 6 6 8.199729
## 7 7 9.835594
## 8 8 6.548781
## 9 9 NA
# Combine data frames:
(capitals_clim <- cbind(capitals, terra::extract(temp_eur, capitals[,1:2])))
## lon lat capital ID wc2.1_10m_bio_1
## 1 2.333333 48.86667 Paris 1 11.604875
## 2 -3.683333 40.40000 Madrid 2 14.344760
## 3 -0.083333 51.50000 London 3 10.485146
## 4 12.583333 55.66667 Copenhagen 4 8.415707
## 5 10.750000 59.91667 Oslo 5 5.774378
## 6 21.000000 52.25000 Warsaw 6 8.199729
## 7 23.316667 42.68333 Sofia 7 9.835594
## 8 24.100000 56.95000 Riga 8 6.548781
## 9 0.000000 0.00000 Oops 9 NA
Note: if no environmental data are available for a specific location,
e.g. no temperature values are available in the ocean, then these
locations will receive an NA
value. Hence, you should
always check your resulting data for NA
s.
# Remove NAs in resulting data frame
na.exclude(capitals_clim)
## lon lat capital ID wc2.1_10m_bio_1
## 1 2.333333 48.86667 Paris 1 11.604875
## 2 -3.683333 40.40000 Madrid 2 14.344760
## 3 -0.083333 51.50000 London 3 10.485146
## 4 12.583333 55.66667 Copenhagen 4 8.415707
## 5 10.750000 59.91667 Oslo 5 5.774378
## 6 21.000000 52.25000 Warsaw 6 8.199729
## 7 23.316667 42.68333 Sofia 7 9.835594
## 8 24.100000 56.95000 Riga 8 6.548781
For the homework, you will need several objects that you should not forget to save.
# Write terra objects to file:
terra::writeRaster(temp_eur, 'data/temp_eur.tif')
terra::writeRaster(temp_eur_onedeg, 'data/temp_eur_onedeg.tif')
terra::writeRaster(temp_eur_twodeg, 'data/temp_eur_twodeg.tif')
# You will also need the SpatRaster "clim" but this was directly downloaded from geodata package, so is already on file
# Save other (non-terra) objects from the workspace:
save(capitals, file='data/1_SpatialData.RData')
As homework, solve all the exercises in the blue boxes.
Exercise:
Take the capital locations and extract also the temperature
values at the aggregated scales (temp_eur_onedeg
,
temp_eur_twodeg
). Compare the temperature values.
Take the capital locations and extract all 19 bioclimatic
variables for these (from the SpatRaster
object
clim
)
Find the coordinates of 10 other capital cities and extract the climate values
Take a look at the description of Coordinate Reference Systems, CRS, by Robert Hijmans
What are CRS? How can I extract the CRS information from a
SpatRaster
object?
Which CRS is used in the temp_eur
raster?
Reproject temp_eur
to a different CRS following
instructions on above-mentioned website
When is reprojection useful?