In this tutorial I show how to use the third-stage functions of nichetoolbox. The work was done for GSoC 2016.
The third-stage functions deal with modeling species niches and estimating species distributions. To that end, I developed methods to run algorithms that predict species niches and estimate species potential distributions (ellipsoid models, Bioclim and MaxEnt). There are also methods that convert the potential distribution map into a binary map that attempts to show where the species is distributed. This last part includes methods to evaluate the species distribution maps.
Species Distribution Modeling (SDM), also known as Ecological Niche Modeling (ENM), is a growing field of ecology that aims to estimate the geographical distribution of species. ENM uses a set of mathematical and statistical tools to study the relationship between environmental variables and species occurrences, in order to estimate species niches and predict potential areas where the species can survive. These models have had a large impact on ecology and conservation planning: they are used to find localities where endangered species can be relocated, to study the impacts of climate change on biodiversity, to find biodiversity hotspots and, in other contexts, to identify localities that are vulnerable to invasive species and pathogens (Peterson 2003; Peterson & Vieglais 2001).
In nichetoolbox you can model ecological niches by using one of the following modeling algorithms:
Ellipsoid models use the multivariate normal probability density function (equation 1) to compute the habitat suitability index; the PDF is rescaled so that the suitability index is defined in the interval \([0,1]\).
\[f\,(x_{1},x_{2},x_{3},\ldots,x_{k})=\frac{1}{\sqrt{\left(2\pi\right)^{k}\left|\boldsymbol\Sigma\right|}}\exp\left(-\frac{1}{2}\left(\mathbf{x}-\boldsymbol\mu\right)^{\mathrm{T}}\boldsymbol\Sigma^{-1}\left(\mathbf{x}-\boldsymbol\mu\right)\right)\qquad(1)\]
Replacing the normalizing constant by 1 gives the rescaled suitability index \(f^{*}\), which equals 1 at the centroid:
\[f^{*}(x_{1},x_{2},x_{3},\ldots,x_{k})=\exp\left(-\frac{1}{2}\left(\mathbf{x}-\boldsymbol\mu\right)^{\mathrm{T}}\boldsymbol\Sigma^{-1}\left(\mathbf{x}-\boldsymbol\mu\right)\right)\]
where \(\mathbf{x}\) is the vector of environmental variables, such that each \(x_i\) is an observation of environmental variable \(i\); \(\boldsymbol\Sigma\) is the covariance matrix of the occurrence data; and \(\boldsymbol\mu\) is the vector of means (the centroid).
The quadratic form \(({\mathbf x}-{\boldsymbol\mu})^\mathrm{T}{\boldsymbol\Sigma}^{-1}({\mathbf x}-{\boldsymbol\mu})\) is the squared Mahalanobis distance between \(\mathbf{x}\) and the centroid.
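The rescaled suitability can be sketched in a few lines of base R. This is only an illustration of the formula, not the code nichetoolbox runs internally: the environmental values are simulated, and the names `occ_env`, `bio1`, `bio12` and `ellipsoid_suitability` are hypothetical. It relies on `stats::mahalanobis()`, which returns the squared Mahalanobis distance.

```r
# Hypothetical environmental values at occurrence points (simulated, not real data)
set.seed(1)
occ_env <- cbind(
  bio1  = rnorm(50, mean = 15,  sd = 2),    # stand-in for a temperature layer
  bio12 = rnorm(50, mean = 800, sd = 100)   # stand-in for a precipitation layer
)

mu    <- colMeans(occ_env)  # vector of means (centroid)
sigma <- cov(occ_env)       # covariance matrix of the occurrence data

# Suitability = exp(-D^2 / 2), where D^2 is the squared Mahalanobis distance
ellipsoid_suitability <- function(env, center, cov_mat) {
  d2 <- mahalanobis(env, center = center, cov = cov_mat)
  exp(-0.5 * d2)
}

# Evaluate two sites: the centroid itself and a site far from it
new_sites <- rbind(centroid = mu, far = mu + c(6, 300))
suit <- ellipsoid_suitability(new_sites, mu, sigma)
print(suit)
```

The suitability is 1 at the centroid and decays towards 0 as the Mahalanobis distance grows, which is why the index stays inside \([0,1]\).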
In nichetoolbox, to build an ellipsoid model you just need the environmental information at your occurrence points and a selection of the layers you are going to use to model the niche.
The model can be trained with either all the occurrence data or only the occurrence points that lie inside your polygon of M.
Similarly, you can project the model onto either the full raster extent or the extent of the polygon of M.
Select your niche variables and run your model…
Download ellipsoid metadata
Download ellipsoid raster model
Download distance to the centroid table
Bioclim models are implemented in nichetoolbox in just the same way as ellipsoid models.
You can run MaxEnt within nichetoolbox. nichetoolbox calls the maxent function from the dismo package. In order to use MaxEnt within nichetoolbox you need to install rJava and paste the maxent .jar file into the java folder of dismo. To test whether maxent is available, run the following commands:
jar <- paste(system.file(package="dismo"), "/java/maxent.jar", sep='')
# Ask if necessary files are in java folder of dismo
file.exists(jar)
## [1] TRUE
# test if rJava is installed
"rJava" %in% installed.packages()
## [1] TRUE
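The two checks can also be wrapped in a single helper that reports which prerequisite is missing. This is a minimal sketch; `maxent_ready` is a hypothetical name and is not part of nichetoolbox or dismo. Note that `requireNamespace()` is a slightly stricter test than looking at `installed.packages()`, because it also confirms that rJava can actually be loaded.

```r
# Hypothetical helper: returns TRUE only if both MaxEnt prerequisites are met
maxent_ready <- function() {
  # system.file() returns "" when the package or the file is missing
  jar <- system.file("java", "maxent.jar", package = "dismo")
  has_jar <- nzchar(jar)

  # TRUE only if rJava is installed and its namespace loads
  has_rjava <- requireNamespace("rJava", quietly = TRUE)

  if (!has_jar)   message("maxent.jar was not found in the java folder of dismo")
  if (!has_rjava) message("rJava is not installed or could not be loaded")

  has_jar && has_rjava
}

ready <- maxent_ready()
print(ready)
```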
If everything is OK, you can build MaxEnt models within nichetoolbox by using your own data or the data that you have downloaded from GBIF, and by choosing between the full-extent raster layers or the M layers.
Most MaxEnt features and settings are implemented in the app.
Once you have configured your MaxEnt settings, press the run button. A window with the basic statistics of the MaxEnt run will be displayed.
To download the MaxEnt results, click on the Download complete results link.
Once you have modeled your species niche using one or all of the modeling algorithms, you can explore the models in geographic space by using the model visualizer. The visualizer is interactive (you can zoom in on the map) and uses the leaflet library.
The last part of the project deals with species distribution model evaluation and performance. nichetoolbox has two ways to evaluate models:
The first uses the ENMGadgets package, which performs Partial ROC analysis (Peterson et al. 2008). To run a Partial ROC analysis in nichetoolbox, upload your continuous model map and your validation data.
The validation data must be in the following format:
sp_name | longitude | latitude |
---|---|---|
Ambystoma tigrinum | -107.08333 | 51.08333 |
Ambystoma tigrinum | -102.41667 | 44.41667 |
Ambystoma tigrinum | -99.75000 | 45.91667 |
Ambystoma tigrinum | -85.75000 | 45.25000 |
Ambystoma tigrinum | -91.75000 | 45.75000 |
Ambystoma tigrinum | -91.41667 | 39.75000 |
The Binary maps section has functions to transform continuous models into binary maps of presences and absences. The conversion can be done by using one of the following methods:
1) Confusion matrix optimization: using true presences and absences, the algorithm searches for the cut-off threshold that optimizes the value of the Kappa and/or TSS statistic.
2) Minimum training presence: uses the lowest suitability value at which a presence has occurred as the cut-off threshold.
3) User defined threshold: the user specifies the cut-off threshold.
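The threshold methods above can be sketched in base R as follows. This is an illustrative sketch, not the nichetoolbox implementation: the suitability values and presence/absence labels are made up, and `confusion_stats` is a hypothetical helper. It uses the standard definitions TSS = sensitivity + specificity - 1 and Cohen's Kappa = (p_o - p_e) / (1 - p_e).

```r
# Hypothetical suitability values extracted at presence (1) / absence (0) points
suit <- c(0.12, 0.23, 0.37, 0.42, 0.61, 0.73, 0.81, 0.93)
obs  <- c(0,    0,    1,    0,    1,    1,    1,    1)

# TSS and Kappa of the binary prediction obtained with a given threshold
confusion_stats <- function(threshold, suit, obs) {
  pred <- as.integer(suit >= threshold)
  tp <- sum(pred == 1 & obs == 1)
  fp <- sum(pred == 1 & obs == 0)
  fn <- sum(pred == 0 & obs == 1)
  tn <- sum(pred == 0 & obs == 0)
  n  <- tp + fp + fn + tn
  sens <- tp / (tp + fn)                 # sensitivity
  spec <- tn / (tn + fp)                 # specificity
  po <- (tp + tn) / n                    # observed agreement
  pe <- ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n^2  # chance agreement
  c(threshold = threshold,
    tss   = sens + spec - 1,
    kappa = (po - pe) / (1 - pe))
}

# 1) Confusion matrix optimization over a range of thresholds
thresholds <- seq(0.05, 0.95, by = 0.05)
res <- as.data.frame(t(sapply(thresholds, confusion_stats, suit = suit, obs = obs)))
best_tss   <- res$threshold[which.max(res$tss)]
best_kappa <- res$threshold[which.max(res$kappa)]

# 2) Minimum training presence: lowest suitability at a presence point
mtp <- min(suit[obs == 1])

# 3) User defined threshold: applied directly to the continuous values
binary_user <- as.integer(suit >= 0.5)
```

When several thresholds tie for the best score, `which.max()` keeps the first one; picking the midpoint of the tied range would be an equally defensible choice.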
The user uploads both the continuous map (.asc) and the presence/absence data (.csv). The presence/absence data has to be in the following format:
longitude | latitude | presence_absence |
---|---|---|
-111.25000 | 36.91667 | 0 |
-106.20000 | 35.30000 | 1 |
-98.08000 | 47.74000 | 1 |
-93.27306 | 45.21076 | 1 |
-112.64406 | 36.58329 | 1 |
-101.85097 | 35.18559 | 1 |
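Before uploading, it can help to check that a file matches this layout. The sketch below assumes a comma-separated file with exactly these column names; `check_pa_table` is a hypothetical helper, not a nichetoolbox function, and the checks the app itself performs may differ.

```r
# A few rows from the table above, written as comma-separated text
csv_text <- "longitude,latitude,presence_absence
-111.25000,36.91667,0
-106.20000,35.30000,1
-98.08000,47.74000,1"
pa <- read.csv(text = csv_text)

# Hypothetical helper: stops with an error if the table is malformed
check_pa_table <- function(df) {
  expected <- c("longitude", "latitude", "presence_absence")
  stopifnot(
    identical(names(df)[1:3], expected),       # column names and order
    is.numeric(df$longitude), all(abs(df$longitude) <= 180),
    is.numeric(df$latitude),  all(abs(df$latitude) <= 90),
    all(df$presence_absence %in% c(0, 1))      # only 0 (absence) or 1 (presence)
  )
  invisible(TRUE)
}

ok <- check_pa_table(pa)
```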
Once the files are uploaded, specify the range of thresholds to search and press the Search threshold button. The output looks like this:
Just upload your continuous model (.asc) and your training data (.csv).
The training data must be in the following format:
| | sp_name | longitude | latitude |
|---|---|---|---|
85 | Ambystoma tigrinum | -100.58333 | 31.91667 |
86 | Ambystoma tigrinum | -91.08333 | 38.91667 |
87 | Ambystoma tigrinum | -113.41667 | 42.75000 |
88 | Ambystoma tigrinum | -121.41667 | 39.75000 |
89 | Ambystoma tigrinum | -114.58333 | 42.91667 |
90 | Ambystoma tigrinum | -94.41667 | 45.41667 |
Specify a cut-off threshold
Fielding AH, Bell JF (1997) A review of methods for the assessment of prediction errors in conservation presence/absence models. Environmental Conservation 24(1):38–49
Peterson AT, Vieglais DA (2001) Predicting species invasions using ecological niche modeling: New approaches from bioinformatics attack a pressing problem. Bioscience 51:363–371
Peterson AT (2003) Predicting the geography of species’ invasions via ecological niche modeling. Quarterly Review of Biology 78:419–433
Peterson AT, Papeş M, Soberón J (2008) Rethinking receiver operating characteristic analysis applications in ecological niche modeling. Ecological Modelling 213:63–72
Peterson AT, Soberón J, Pearson R, Anderson R, Martínez-Meyer E, Nakamura M, Araújo M (2011) Ecological Niches and Geographic Distributions. Princeton University Press