<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Posts | ChrisScarpone</title><link>https://cscarpone.ca/post/</link><atom:link href="https://cscarpone.ca/post/index.xml" rel="self" type="application/rss+xml"/><description>Posts</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><image><url>https://cscarpone.ca/images/icon_hu6db2b7ecc6361704268603ce83d6353a_893485_512x512_fill_lanczos_center_2.png</url><title>Posts</title><link>https://cscarpone.ca/post/</link></image><item><title>Spatial Randomforest</title><link>https://cscarpone.ca/post/spatial-randomforest/</link><pubDate>Fri, 26 Mar 2021 00:00:00 +0000</pubDate><guid>https://cscarpone.ca/post/spatial-randomforest/</guid><description>
&lt;div id="how-to-run-a-rf-model" class="section level2">
&lt;h2>How to Run a RF Model&lt;/h2>
&lt;p>Although random forest is a powerful tool, it is fairly simple to run in R.&lt;/p>
&lt;pre class="r">&lt;code>#Let's load our libraries first, then read in our shapefile
library(rgdal)
library(raster)
library(sp)
library(randomForest)
library(tidyverse)
#We are going to extract raster values at the points stored in a shapefile
WorkWD &amp;lt;- &amp;quot;C:/Users/User/Documents/R/Projects/R_Intro&amp;quot;
Training &amp;lt;- readOGR(dsn = WorkWD, layer = &amp;quot;TrainingData&amp;quot;)&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## OGR data source with driver: ESRI Shapefile
## Source: &amp;quot;C:\Users\User\Documents\R\Projects\R_Intro&amp;quot;, layer: &amp;quot;TrainingData&amp;quot;
## with 100 features
## It has 5 fields
## Integer64 fields read as strings: OBJECTID&lt;/code>&lt;/pre>
&lt;pre class="r">&lt;code>plot(Training)&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://cscarpone.ca/post/spatial-randomforest/index.en_files/figure-html/unnamed-chunk-1-1.png" width="672" />&lt;/p>
&lt;/div>
&lt;div id="coordinate-reference-system-and-spatial-projection" class="section level2">
&lt;h2>Coordinate Reference System and Spatial Projection&lt;/h2>
&lt;p>R is strict about how our data is projected, so every layer should carry the same coordinate reference system (CRS).&lt;/p>
&lt;pre class="r">&lt;code>#Apply a consistent CRS (Coordinate Reference System); assignment only relabels the object and does not reproject it
proj4string(Training) &amp;lt;- CRS(&amp;quot;+init=epsg:28992&amp;quot;)&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## Warning in showSRID(uprojargs, format = &amp;quot;PROJ&amp;quot;, multiline = &amp;quot;NO&amp;quot;, prefer_proj =
## prefer_proj): Discarded datum Amersfoort in Proj4 definition&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## Warning in proj4string(obj): CRS object has comment, which is lost in output&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## Warning in `proj4string&amp;lt;-`(`*tmp*`, value = new(&amp;quot;CRS&amp;quot;, projargs = &amp;quot;+proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs&amp;quot;)): A new CRS was assigned to an object with an existing CRS:
## +proj=stere +lat_0=90 +lat_ts=52.1561605555556 +lon_0=5.38763888888889 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs
## without reprojecting.
## For reprojection, use function spTransform&lt;/code>&lt;/pre>
&lt;pre class="r">&lt;code>#spTransform reprojects the coordinates into the target CRS
Training &amp;lt;- spTransform(Training , CRS(&amp;quot;+init=epsg:28992&amp;quot;))&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## Warning in showSRID(uprojargs, format = &amp;quot;PROJ&amp;quot;, multiline = &amp;quot;NO&amp;quot;, prefer_proj =
## prefer_proj): Discarded datum Amersfoort in Proj4 definition&lt;/code>&lt;/pre>
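&lt;p>To see the difference between assigning a CRS and reprojecting, here is a minimal sketch. It assumes a made-up point in longitude/latitude (it is not part of the training data) and that &lt;code>sp&lt;/code> has a working projection backend installed.&lt;/p>
&lt;pre class="r">&lt;code>library(sp)
#Hypothetical point in longitude/latitude (WGS84, EPSG:4326)
pt &amp;lt;- SpatialPoints(cbind(5.75, 50.97), proj4string = CRS(&amp;quot;+init=epsg:4326&amp;quot;))
#spTransform recomputes the coordinates in the target CRS
pt_rd &amp;lt;- spTransform(pt, CRS(&amp;quot;+init=epsg:28992&amp;quot;))
#The original stays in degrees, the transformed copy is in metres
coordinates(pt)
coordinates(pt_rd)&lt;/code>&lt;/pre>
&lt;p>Assigning a CRS to &lt;code>pt&lt;/code> instead would have left the numbers 5.75 and 50.97 untouched and only changed their label.&lt;/p>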
&lt;/div>
&lt;div id="raster-data" class="section level2">
&lt;h2>Raster Data&lt;/h2>
&lt;p>Let's load in our raster data and see what it looks like.&lt;/p>
&lt;pre class="r">&lt;code>#Load the rasters: the DEM (elevation) and the River distance layer
DEM &amp;lt;- raster(file.path(WorkWD, &amp;quot;Elevationlow.tif&amp;quot;))
River &amp;lt;- raster(file.path(WorkWD, &amp;quot;Distance.tif&amp;quot;))
#Let's plot the DEM to see what we have
plot(DEM)&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://cscarpone.ca/post/spatial-randomforest/index.en_files/figure-html/unnamed-chunk-3-1.png" width="672" />&lt;/p>
&lt;pre class="r">&lt;code>#Apply a consistent CRS
proj4string(DEM) &amp;lt;- CRS(&amp;quot;+init=epsg:28992&amp;quot;)&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## Warning in showSRID(uprojargs, format = &amp;quot;PROJ&amp;quot;, multiline = &amp;quot;NO&amp;quot;, prefer_proj =
## prefer_proj): Discarded datum Amersfoort in Proj4 definition&lt;/code>&lt;/pre>
&lt;pre class="r">&lt;code>proj4string(River)&amp;lt;- CRS(&amp;quot;+init=epsg:28992&amp;quot;)&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## Warning in showSRID(uprojargs, format = &amp;quot;PROJ&amp;quot;, multiline = &amp;quot;NO&amp;quot;, prefer_proj =
## prefer_proj): Discarded datum Amersfoort in Proj4 definition&lt;/code>&lt;/pre>
&lt;pre class="r">&lt;code>#Crop the River raster to the extent of the DEM
River &amp;lt;- crop(River,DEM)&lt;/code>&lt;/pre>
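&lt;p>Before stacking layers later on, it can help to confirm that the cropped raster really lines up with the reference raster. The sketch below uses two small made-up rasters (the names &lt;code>big&lt;/code> and &lt;code>small&lt;/code> are placeholders, not our data) to show the idea.&lt;/p>
&lt;pre class="r">&lt;code>library(raster)
#Hypothetical larger raster, 20 x 20 cells
big &amp;lt;- raster(nrows = 20, ncols = 20, xmn = 0, xmx = 20, ymn = 0, ymx = 20)
values(big) &amp;lt;- 1:400
#Hypothetical smaller reference raster covering the middle of the larger one
small &amp;lt;- raster(nrows = 10, ncols = 10, xmn = 5, xmx = 15, ymn = 5, ymx = 15)
#Crop the larger raster to the extent of the smaller one
cropped &amp;lt;- crop(big, small)
#compareRaster stops with an error if extent, dimensions or CRS differ
compareRaster(cropped, small)&lt;/code>&lt;/pre>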
&lt;/div>
&lt;div id="raster-conversion-and-extract" class="section level2">
&lt;h2>Raster Conversion and Extract&lt;/h2>
&lt;p>We can also compute slope and aspect from the DEM. The final step uses the &lt;code>extract&lt;/code> function, which is similar to the “Extract Multi Values to Points” tool in ArcGIS.&lt;/p>
&lt;pre class="r">&lt;code>#We are also creating a slope and aspect layer
Slope &amp;lt;- terrain(DEM, opt=&amp;quot;slope&amp;quot;, unit=&amp;quot;degrees&amp;quot;, neighbors=8)
Aspect &amp;lt;- terrain(DEM, opt=&amp;quot;aspect&amp;quot;, unit=&amp;quot;degrees&amp;quot;, neighbors=8)
#Extract the raster values at each training point
#raster::extract calls the extract function from the raster package explicitly, since tidyverse (tidyr) also provides a function named extract
Training$DEM &amp;lt;- raster::extract(DEM, Training, method = &amp;quot;simple&amp;quot;)
Training$Slope &amp;lt;- raster::extract(Slope, Training, method = &amp;quot;simple&amp;quot;)
Training$Aspect &amp;lt;- raster::extract(Aspect, Training, method = &amp;quot;simple&amp;quot;)
Training$River &amp;lt;- raster::extract(River, Training, method = &amp;quot;simple&amp;quot;)
#Let's check the summary statistics of our newly extracted data
summary(Training)&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## Object of class SpatialPointsDataFrame
## Coordinates:
## min max
## coords.x1 178810 181390
## coords.x2 329714 333611
## Is projected: TRUE
## proj4string :
## [+proj=sterea +lat_0=52.1561605555556 +lon_0=5.38763888888889
## +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel +units=m +no_defs]
## Number of points: 100
## Data attributes:
## OBJECTID cadmium copper lead
## 1 : 1 Min. : 0.200 Min. : 14.00 Min. : 39.00
## 10 : 1 1st Qu.: 0.800 1st Qu.: 23.00 1st Qu.: 72.75
## 100 : 1 Median : 2.050 Median : 31.00 Median :118.00
## 11 : 1 Mean : 3.119 Mean : 39.57 Mean :152.87
## 12 : 1 3rd Qu.: 3.825 3rd Qu.: 48.25 3rd Qu.:204.75
## 13 : 1 Max. :18.100 Max. :117.00 Max. :654.00
## (Other):94
## zinc DEM Slope Aspect
## Min. : 113.0 Min. :3246 Min. : 2.872 Min. : 0.3393
## 1st Qu.: 191.2 1st Qu.:3540 1st Qu.:14.761 1st Qu.: 78.6693
## Median : 323.5 Median :3618 Median :24.352 Median :241.4956
## Mean : 461.1 Mean :3610 Mean :32.258 Mean :198.1060
## 3rd Qu.: 677.0 3rd Qu.:3702 3rd Qu.:46.017 3rd Qu.:304.2363
## Max. :1839.0 Max. :3892 Max. :76.916 Max. :358.6468
##
## River
## Min. :0.00000
## 1st Qu.:0.09215
## Median :0.24986
## Mean :0.25531
## 3rd Qu.:0.36813
## Max. :0.88039
## &lt;/code>&lt;/pre>
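&lt;p>If it is not obvious what &lt;code>extract&lt;/code> returns, a toy example may help. This is only a sketch on a made-up 10 x 10 raster, and it passes the coordinates as a plain matrix rather than a spatial object to keep it short.&lt;/p>
&lt;pre class="r">&lt;code>library(raster)
#Hypothetical raster whose cells are numbered 1 to 100, row by row from the top left
r &amp;lt;- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(r) &amp;lt;- 1:100
#Two hypothetical sample locations: the top-left and bottom-right cell centres
xy &amp;lt;- cbind(x = c(0.5, 9.5), y = c(9.5, 0.5))
#method = &amp;quot;simple&amp;quot; returns the value of the cell each point falls in
vals &amp;lt;- raster::extract(r, xy, method = &amp;quot;simple&amp;quot;)
vals&lt;/code>&lt;/pre>
&lt;p>Each point gets exactly one value per raster, which is why the results can be stored as new columns on the training points.&lt;/p>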
&lt;/div>
&lt;div id="spatial-objects-to-data-frames" class="section level2">
&lt;h2>Spatial objects to Data frames&lt;/h2>
&lt;p>We need to convert our spatial object into a &lt;code>data.frame&lt;/code> if we want to conduct more analysis on our data.&lt;/p>
&lt;pre class="r">&lt;code>#Spatial objects hold a data frame but are still treated as spatial objects, so we convert to a plain data frame for further analysis
Training.DF &amp;lt;- as.data.frame(Training)
#Let's see the structure of the new data frame
str(Training.DF)&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## &amp;#39;data.frame&amp;#39;: 100 obs. of 11 variables:
## $ OBJECTID : Factor w/ 100 levels &amp;quot;1&amp;quot;,&amp;quot;10&amp;quot;,&amp;quot;100&amp;quot;,..: 1 13 24 35 46 57 68 79 90 2 ...
## $ cadmium : num 11.7 8.6 6.5 2.8 11.2 2.8 3 2.5 2.1 2 ...
## $ copper : num 85 81 68 29 93 48 61 31 32 27 ...
## $ lead : num 299 277 199 150 285 117 137 183 116 130 ...
## $ zinc : num 1022 1141 640 406 1096 ...
## $ DEM : num 3246 3328 3348 3482 3257 ...
## $ Slope : num 71.1 69.8 29.7 24.8 73.3 ...
## $ Aspect : num 316.9 314.76 59.79 9.43 300.36 ...
## $ River : num 0.00136 0.01222 0.10303 0.09215 0 ...
## $ coords.x1: num 181072 181025 181165 181027 180874 ...
## $ coords.x2: num 333611 333558 333537 333363 333339 ...&lt;/code>&lt;/pre>
&lt;pre class="r">&lt;code>#Drop rows with NA values so that we can safely run our RF
Training.DF &amp;lt;- Training.DF %&amp;gt;%
  drop_na()&lt;/code>&lt;/pre>
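&lt;p>It is worth knowing how many rows &lt;code>drop_na&lt;/code> will remove before running it. Here is a small sketch on a made-up three-point data set (the values are invented for illustration), loading only &lt;code>tidyr&lt;/code> instead of the whole tidyverse.&lt;/p>
&lt;pre class="r">&lt;code>library(sp)
library(tidyr)
#Hypothetical data: three points, one with a missing lead value
pts &amp;lt;- data.frame(x = c(1, 2, 3), y = c(1, 2, 3), lead = c(10, NA, 30))
#Promote the data frame to a SpatialPointsDataFrame
coordinates(pts) &amp;lt;- ~ x + y
#Convert back to a plain data frame; the coordinates return as columns
flat &amp;lt;- as.data.frame(pts)
#Count the NA values per column before dropping anything
colSums(is.na(flat))
#Keep only the complete rows
clean &amp;lt;- drop_na(flat)
nrow(clean)&lt;/code>&lt;/pre>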
&lt;/div>
&lt;div id="running-random-forest" class="section level2">
&lt;h2>Running Random Forest&lt;/h2>
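&lt;p>In order to run random forest, we need to use our &lt;code>data.frame&lt;/code> and decide which variables are independent (the predictors) and which one is dependent (the response).&lt;/p>
&lt;pre class="r">&lt;code>#Set a seed so that the forest is reproducible
set.seed(95)
#Note: varImpPlot, varUsed and TYPE are not arguments of randomForest(), so they have no effect here; regression is chosen automatically because lead is numeric
MeuseRF&amp;lt;- randomForest(lead ~ DEM+Aspect+Slope+River, data=Training.DF, importance=TRUE, proximity=FALSE, varImpPlot = TRUE, varUsed = TRUE, TYPE=regression, ntree=1000)
#Print the model to see the % of variance explained (an out-of-bag pseudo R-squared)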
MeuseRF&lt;/code>&lt;/pre>
&lt;pre>&lt;code>##
## Call:
## randomForest(formula = lead ~ DEM + Aspect + Slope + River, data = Training.DF, importance = TRUE, proximity = FALSE, varImpPlot = TRUE, varUsed = TRUE, TYPE = regression, ntree = 1000)
## Type of random forest: regression
## Number of trees: 1000
## No. of variables tried at each split: 1
##
## Mean of squared residuals: 8693.815
## % Var explained: 31.22&lt;/code>&lt;/pre>
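&lt;p>The “% Var explained” figure comes from the out-of-bag mean squared error. As a rough sketch of how the two relate, the code below fits a forest on R's built-in &lt;code>mtcars&lt;/code> data (a stand-in, not our Meuse data) and recomputes the value by hand.&lt;/p>
&lt;pre class="r">&lt;code>library(randomForest)
set.seed(95)
#Stand-in model on a built-in data set, purely for illustration
rf &amp;lt;- randomForest(mpg ~ wt + hp + disp, data = mtcars, ntree = 500)
n &amp;lt;- nrow(mtcars)
#Out-of-bag mean squared error after the final tree
oob_mse &amp;lt;- rf$mse[rf$ntree]
#Pseudo R-squared: 1 minus the OOB MSE over the population variance of the response
pseudo_r2 &amp;lt;- 1 - oob_mse / (var(mtcars$mpg) * (n - 1) / n)
#Compare with the value stored in the fitted object
all.equal(pseudo_r2, rf$rsq[rf$ntree])&lt;/code>&lt;/pre>
&lt;p>Because the errors are out-of-bag, this number behaves more like a cross-validated R-squared than the R-squared of a linear model fitted and scored on the same data.&lt;/p>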
&lt;/div>
&lt;div id="plotting-the-rf-outputs" class="section level2">
&lt;h2>Plotting the RF outputs&lt;/h2>
&lt;p>The plots that we need are the variable importance plot and the partial dependence plots.&lt;/p>
&lt;pre class="r">&lt;code>#to see the variable importance plots
varImpPlot(MeuseRF, main=&amp;quot;All Variables&amp;quot;)&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://cscarpone.ca/post/spatial-randomforest/index.en_files/figure-html/unnamed-chunk-7-1.png" width="672" />&lt;/p>
&lt;pre class="r">&lt;code>partialPlot(MeuseRF, Training.DF, River, main = &amp;quot;PPD River&amp;quot;)&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://cscarpone.ca/post/spatial-randomforest/index.en_files/figure-html/unnamed-chunk-7-2.png" width="672" />&lt;/p>
&lt;pre class="r">&lt;code>partialPlot(MeuseRF, Training.DF, Slope, main = &amp;quot;PPD Slope&amp;quot;)&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://cscarpone.ca/post/spatial-randomforest/index.en_files/figure-html/unnamed-chunk-7-3.png" width="672" />&lt;/p>
&lt;pre class="r">&lt;code>partialPlot(MeuseRF, Training.DF, DEM, main = &amp;quot;PPD Elevation&amp;quot;)&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://cscarpone.ca/post/spatial-randomforest/index.en_files/figure-html/unnamed-chunk-7-4.png" width="672" />&lt;/p>
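&lt;p>If you want the numbers behind the importance plot rather than the picture, &lt;code>importance&lt;/code> returns them as a matrix. This sketch again uses the built-in &lt;code>mtcars&lt;/code> data as a stand-in for our model.&lt;/p>
&lt;pre class="r">&lt;code>library(randomForest)
set.seed(95)
#Stand-in model; importance = TRUE is needed for the permutation measure
rf &amp;lt;- randomForest(mpg ~ wt + hp + disp, data = mtcars, importance = TRUE)
#One row per predictor, with columns %IncMSE and IncNodePurity
imp &amp;lt;- importance(rf)
#Sort the predictors from most to least important by %IncMSE
imp[order(imp[, &amp;quot;%IncMSE&amp;quot;], decreasing = TRUE), ]&lt;/code>&lt;/pre>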
&lt;/div>
&lt;div id="predicting-the-output" class="section level2">
&lt;h2>Predicting the output&lt;/h2>
&lt;p>To predict the RF model across the study area, we have to create a raster stack of our predictor layers so that the model can estimate a value for every cell.&lt;/p>
&lt;pre class="r">&lt;code>#Creating the Raster Stack
Stack.Meuse &amp;lt;- stack(DEM, Slope, Aspect, River)
#The layer names of the stack must match the predictor names used in the model formula
names(Stack.Meuse) &amp;lt;- c(&amp;quot;DEM&amp;quot;,&amp;quot;Slope&amp;quot;,&amp;quot;Aspect&amp;quot;,&amp;quot;River&amp;quot;)
#Apply the model to every cell of the stack and write the result to lead.tif
Lead &amp;lt;- predict(Stack.Meuse, MeuseRF, filename = &amp;quot;lead.tif&amp;quot;, overwrite = TRUE)&lt;/code>&lt;/pre>
&lt;pre>&lt;code>## Warning in showSRID(uprojargs, format = &amp;quot;PROJ&amp;quot;, multiline = &amp;quot;NO&amp;quot;, prefer_proj
## = prefer_proj): Discarded datum Unknown based on Bessel 1841 ellipsoid in Proj4
## definition&lt;/code>&lt;/pre>
&lt;pre class="r">&lt;code>plot(Lead, main= &amp;quot;Lead for the Meuse Data&amp;quot;)&lt;/code>&lt;/pre>
&lt;p>&lt;img src="https://cscarpone.ca/post/spatial-randomforest/index.en_files/figure-html/unnamed-chunk-8-1.png" width="672" />&lt;/p>
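&lt;p>The whole predict step can be tried end to end on made-up data. The sketch below is only an illustration: the layers &lt;code>A&lt;/code> and &lt;code>B&lt;/code>, the training table and the response are all invented, and nothing is written to disk.&lt;/p>
&lt;pre class="r">&lt;code>library(raster)
library(randomForest)
set.seed(95)
#Two hypothetical predictor layers filled with random values
a &amp;lt;- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(a) &amp;lt;- runif(100)
b &amp;lt;- raster(a)
values(b) &amp;lt;- runif(100)
s &amp;lt;- stack(a, b)
#Layer names must match the predictor names in the model formula
names(s) &amp;lt;- c(&amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;)
#Hypothetical training table with an invented response
train &amp;lt;- data.frame(A = runif(50), B = runif(50))
train$y &amp;lt;- 2 * train$A + train$B + rnorm(50, sd = 0.1)
rf &amp;lt;- randomForest(y ~ A + B, data = train)
#Predict one value per cell; the result is a single raster layer
pred &amp;lt;- raster::predict(s, rf)
pred&lt;/code>&lt;/pre>
&lt;p>If the names of the stack and the model do not match, the prediction fails, which is why we rename the layers first.&lt;/p>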
&lt;/div></description></item></channel></rss>