This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunks
We like R Markdown. In this file we step through some very simple mapping functions in R/Rmd.
First try using the sf
package’s built-in world
data to make a simple map plot.
It’s worth remembering that a map is just a plot with spatial arrangements…
head(world)
## Simple feature collection with 6 features and 10 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -180 ymin: -18.28799 xmax: 180 ymax: 83.23324
## Geodetic CRS: WGS 84
## # A tibble: 6 x 11
## iso_a2 name_long continent region_un subregion type area_km2 pop lifeExp
## <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 FJ Fiji Oceania Oceania Melanesia Sove… 1.93e4 8.86e5 70.0
## 2 TZ Tanzania Africa Africa Eastern … Sove… 9.33e5 5.22e7 64.2
## 3 EH Western S… Africa Africa Northern… Inde… 9.63e4 NA NA
## 4 CA Canada North Am… Americas Northern… Sove… 1.00e7 3.55e7 82.0
## 5 US United St… North Am… Americas Northern… Coun… 9.51e6 3.19e8 78.8
## 6 KZ Kazakhstan Asia Asia Central … Sove… 2.73e6 1.73e7 71.6
## # … with 2 more variables: gdpPercap <dbl>, geom <MULTIPOLYGON [°]>
# plot variables 3 to 6
plot(world[3:6])
# plot world population
plot(world["pop"])
Now re-draw the population map using ggplot2
and the geom_sf
geometry… This will make a much prettier map because ggplot2
is so cool.
# using ggplot
library(ggplot2)
ggplot2::ggplot(world) +
geom_sf(aes(fill = pop/1000000)) +
scale_fill_continuous(name="Population (millions)",
low = "green",
high = "red")
That looks better…well, if we sorted out the colours and the visual dominance of 2 countries…
These are census areas - smaller than LSOAs so the files are bigger and the maps are denser.
As before we’re going to load some polygon boundary data (you need internet access for this) and some energy demand data so we can link them and map them.
# electricity consumption data at MSOA level (pre downloaded)
inFile <- here::here("data", "energy", "MSOA_Dom_Elec", "MSOA_DOM_ELEC_2019.csv")
msoa_elecData <- readr::read_csv(inFile)
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## `Local Authority Name` = col_character(),
## `Local Authority Code` = col_character(),
## `MSOA Name` = col_character(),
## `Middle Layer Super Output Area (MSOA) Code` = col_character(),
## `Number of meters` = col_double(),
## `Consumption (kWh)` = col_double(),
## `Mean consumption (kWh per meter)` = col_double(),
## `Median consumption (kWh per meter)` = col_double()
## )
This is electricity consumption data for 2019 for MSOAs. What variables have we got?
head(msoa_elecData)
## # A tibble: 6 x 8
## `Local Authority N… `Local Authority … `MSOA Name` `Middle Layer Super Outpu…
## <chr> <chr> <chr> <chr>
## 1 Hartlepool E06000001 Hartlepool … E02002483
## 2 Hartlepool E06000001 Hartlepool … E02002484
## 3 Hartlepool E06000001 Hartlepool … E02002485
## 4 Hartlepool E06000001 Hartlepool … E02002487
## 5 Hartlepool E06000001 Hartlepool … E02002488
## 6 Hartlepool E06000001 Hartlepool … E02002489
## # … with 4 more variables: Number of meters <dbl>, Consumption (kWh) <dbl>,
## # Mean consumption (kWh per meter) <dbl>,
## # Median consumption (kWh per meter) <dbl>
Clearly we are going to need to filter out the ones we don;t want…
Now load the MSOA boundary data for the Solent region only to keep things small & easy to play with.
We pre-downloaded these.
# las_to_load <- c("Southampton","Portsmouth","Winchester",
# "Eastleigh","Isle of Wight","Fareham",
# "Gosport","Test Valley","East Hampshire",
# "Havant","New Forest","Hart","Basingstoke and Deane")
# we pre-saved this data in the data folder of the repo for speed
# see getBoundaryData.R for how it works
inf <- here::here("data", "boundaries", "msoa_solent.shp") # use here to specify the data location
message("Loading MSOA geometry from ONS Open Geography API")
## Loading MSOA geometry from ONS Open Geography API
msoa_sf_data <- sf::read_sf(inf)
head(msoa_sf_data)
## Simple feature collection with 6 features and 13 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 458985.4 ymin: 102894 xmax: 467725.4 ymax: 107035
## Projected CRS: OSGB 1936 / British National Grid
## # A tibble: 6 x 14
## MSOA11CD MSOA11NM OBJECTID BNG_E BNG_N LONG_ LAT Shape_Leng Shape__Are
## <chr> <chr> <int> <int> <int> <dbl> <dbl> <dbl> <dbl>
## 1 E02003524 Portsmouth… 3510 463726 106331 -1.10 50.9 8175. 1880016.
## 2 E02003525 Portsmouth… 3511 465172 105904 -1.08 50.8 13220. 2012966.
## 3 E02003526 Portsmouth… 3512 466581 106095 -1.06 50.9 10425. 2260228.
## 4 E02003527 Portsmouth… 3513 464176 104420 -1.09 50.8 21957. 3838768.
## 5 E02003529 Portsmouth… 3514 466420 104925 -1.06 50.8 10636. 1499618.
## 6 E02003530 Portsmouth… 3515 465411 103721 -1.07 50.8 13264. 1695461.
## # … with 5 more variables: Shape__Len <dbl>, LAD11CD <chr>, LAD11NM <chr>,
## # nOAs <int>, geometry <MULTIPOLYGON [m]>
Which areas have we got and how many MSOAs are there in each?
table(msoa_sf_data$LAD11NM)
##
## Basingstoke and Deane East Hampshire Eastleigh
## 22 15 15
## Fareham Gosport Hart
## 14 10 11
## Havant Isle of Wight New Forest
## 17 18 23
## Portsmouth Southampton Test Valley
## 25 32 15
## Winchester
## 14
Now we’ll map/plot some of the data using the ggplot2
approach. To do that we need to merge the boundaries and the energy data so that we can fill
the boundaries with a colour according to one of the variables.
# create a variable with the LA code and the same name as in the sf_data
msoa_elecData$MSOA11CD <- msoa_elecData$`Middle Layer Super Output Area (MSOA) Code`
# merge them
msoa_merged_sf <- merge(msoa_sf_data, msoa_elecData)
# plot
ggplot2::ggplot(msoa_merged_sf) +
geom_sf(aes(fill = `Mean consumption (kWh per meter)`)) +
scale_fill_continuous(name = "Mean kWh per meter", low = "green", high = "red")