1 Course folder structure and data

To follow this guide, you need a folder containing the data and the scripts. The scripts are available in the GitHub repository of the tutorial; clone the repository to your computer.

You should have 9 scripts:

  • 01_Presence_Data.R
  • 02_ProcessRasters.R
  • 03_OptimizePresenceData.R
  • 04_StaticalVariableSelection.R
  • 05_PrepareProjectionVars.R
  • 06_ModelBuiding.R
  • 07_ensembleModels.R
  • 08_projectAll.R
  • 09_prepareFinalMaps.R

Each chapter of this tutorial corresponds to one of the scripts listed above. The order of the scripts matters, because later scripts depend on data generated in earlier steps, but each script is designed to run independently (for example, in a separate R session). At the beginning of each script, you load the data needed to proceed. If you follow the full tutorial sequentially in a single session, some lines of code will be redundant (e.g., loading the same library again, or reloading previously built models for projection when they are already available in the session).
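As a quick sanity check after cloning, the following sketch reports which of the scripts listed above are missing. It assumes the scripts sit directly in the folder you pass in; the helper name missing_scripts is hypothetical and not part of the tutorial's code.

```r
# Script names exactly as listed in this section
expected_scripts <- c(
  "01_Presence_Data.R", "02_ProcessRasters.R", "03_OptimizePresenceData.R",
  "04_StaticalVariableSelection.R", "05_PrepareProjectionVars.R",
  "06_ModelBuiding.R", "07_ensembleModels.R", "08_projectAll.R",
  "09_prepareFinalMaps.R"
)

# Hypothetical helper: return the names of scripts not found in `folder`
missing_scripts <- function(folder = ".") {
  expected_scripts[!file.exists(file.path(folder, expected_scripts))]
}

# Assumption: the scripts are in the current working directory
missing <- missing_scripts(".")
if (length(missing) == 0) {
  message("All 9 scripts found.")
} else {
  message("Missing scripts: ", paste(missing, collapse = ", "))
}
```

If the repository keeps the scripts in a subfolder, pass that folder to missing_scripts() instead of ".".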

You should also have two folders:

  • data: where all the needed data is stored and where the scripts save the data they generate
  • models: where model data from Biomod will be saved

NOTE: the data folder should contain some subfolders named original. These hold the raw data used by the modelling scripts, which is obtained over the course of the tutorial. Do not change these folders.

The data folder should be organised into three subfolders:

  • data
    • other: Accessory data needed for some scripts
      1. World countries shapefile from NaturalEarth
      2. Iberian Peninsula shapefile (processed from above file)
    • rasters: Here are the predictor variables to be used in the modelling process and where the processed rasters will be saved.
      • original: original raw data (obtained in chapters 2 and 5).
        • climate: will store 19 Bioclimatic variables for present and future predictions.
        • evi: Enhanced Vegetation Index obtained from the OpenGeoHub Foundation through OpenLandMap.
    • species: Species data as downloaded from GBIF and where processed presence data will be saved
      • original: Should contain 3 zip files (one for each viper species) as downloaded from GBIF, plus the CSV text file extracted from each one, named after the species (Vaspis, Vlatastei, Vseoanei).
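The layout above can be verified from R before starting. This is a minimal sketch that only checks for the folders named in this section and changes nothing on disk; the helper name check_dirs is hypothetical, and it assumes the course folder is the current working directory.

```r
# Folders described in this section, relative to the course folder
expected_dirs <- c(
  "data/other",
  "data/rasters/original/climate",
  "data/rasters/original/evi",
  "data/species/original",
  "models"
)

# Hypothetical helper: named logical vector, TRUE where the folder exists
check_dirs <- function(root = ".") {
  found <- dir.exists(file.path(root, expected_dirs))
  names(found) <- expected_dirs
  found
}

# Assumption: the course folder is the current working directory
status <- check_dirs(".")
print(status)
```

Any FALSE entry points to a folder that is missing or misnamed relative to the structure described here.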

The bioclimatic variables are derived from monthly climate data (monthly minimum, mean and maximum temperatures and monthly precipitation) and are coded as follows:

  • BIO1: Annual Mean Temperature
  • BIO2: Mean Diurnal Range (Mean of monthly (max temp - min temp))
  • BIO3: Isothermality (BIO2/BIO7) (×100)
  • BIO4: Temperature Seasonality (standard deviation ×100)
  • BIO5: Max Temperature of Warmest Month
  • BIO6: Min Temperature of Coldest Month
  • BIO7: Temperature Annual Range (BIO5-BIO6)
  • BIO8: Mean Temperature of Wettest Quarter
  • BIO9: Mean Temperature of Driest Quarter
  • BIO10: Mean Temperature of Warmest Quarter
  • BIO11: Mean Temperature of Coldest Quarter
  • BIO12: Annual Precipitation
  • BIO13: Precipitation of Wettest Month
  • BIO14: Precipitation of Driest Month
  • BIO15: Precipitation Seasonality (Coefficient of Variation)
  • BIO16: Precipitation of Wettest Quarter
  • BIO17: Precipitation of Driest Quarter
  • BIO18: Precipitation of Warmest Quarter
  • BIO19: Precipitation of Coldest Quarter
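To make these definitions concrete, the sketch below computes several of the variables for a single location in base R. The monthly values are invented for illustration, the mean temperature is approximated as the midpoint of minimum and maximum, and quarters are assumed to be any three consecutive months, wrapping around the end of the year. Published datasets may differ in details such as units or scaling, so treat this as an illustration of the definitions rather than a reproduction of any specific product.

```r
# Hypothetical monthly values for one location (January to December)
tmin <- c(2, 3, 5, 7, 10, 14, 16, 16, 13, 9, 5, 3)          # degrees C
tmax <- c(10, 12, 15, 17, 21, 26, 29, 29, 25, 19, 14, 11)   # degrees C
prec <- c(90, 70, 60, 65, 55, 30, 10, 15, 40, 85, 100, 95)  # mm
tmean <- (tmin + tmax) / 2  # simplifying assumption for this sketch

bio1  <- mean(tmean)                  # Annual Mean Temperature
bio2  <- mean(tmax - tmin)            # Mean Diurnal Range
bio4  <- sd(tmean) * 100              # Temperature Seasonality
bio5  <- max(tmax)                    # Max Temperature of Warmest Month
bio6  <- min(tmin)                    # Min Temperature of Coldest Month
bio7  <- bio5 - bio6                  # Temperature Annual Range
bio3  <- bio2 / bio7 * 100            # Isothermality
bio12 <- sum(prec)                    # Annual Precipitation
bio13 <- max(prec)                    # Precipitation of Wettest Month
bio14 <- min(prec)                    # Precipitation of Driest Month
bio15 <- sd(prec) / mean(prec) * 100  # Precipitation Seasonality (CV, %)

# Month indices of the 12 rolling quarters (3 x 12 matrix, wrapping Dec -> Jan)
idx <- sapply(1:12, function(i) ((i:(i + 2) - 1) %% 12) + 1)
q_prec  <- colSums(matrix(prec[idx], nrow = 3))
q_tmean <- colMeans(matrix(tmean[idx], nrow = 3))

bio16 <- max(q_prec)                  # Precipitation of Wettest Quarter
bio17 <- min(q_prec)                  # Precipitation of Driest Quarter
bio8  <- q_tmean[which.max(q_prec)]   # Mean Temperature of Wettest Quarter
bio10 <- max(q_tmean)                 # Mean Temperature of Warmest Quarter
bio18 <- q_prec[which.max(q_tmean)]   # Precipitation of Warmest Quarter
```

Note that the quarter-based variables combine two series: BIO8, for example, locates the quarter using precipitation and then reports the temperature of that quarter.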

The EVI is a satellite-derived index that reflects landscape productivity by measuring vegetation greenness. The original data are summarized bi-monthly (one variable per pair of months); here they have been further summarized into a single file holding the maximum annual value reached in 2020.
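The annual-maximum summary can be illustrated with a toy example in base R. The values below are random stand-ins for six bi-monthly EVI layers on a 2 x 2 pixel grid, not real data, and the object names are hypothetical.

```r
# Hypothetical EVI stack: 2 x 2 pixels, 6 bi-monthly layers (random values)
set.seed(1)
evi <- array(runif(2 * 2 * 6), dim = c(2, 2, 6))

# Per-pixel maximum across the 6 layers gives the annual maximum
evi_max <- apply(evi, c(1, 2), max)
print(evi_max)
```

With real rasters the same per-pixel reduction would be applied across the layers of a multi-layer raster, for instance with terra::app(r, max) if the terra package is used; how the tutorial's scripts do this may differ.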

Species data were collected from GBIF without any filtering. The DOIs of the data citations are:

All scripts start by setting the working directory. This is important because all paths in the scripts are relative to this directory. For example, if you placed your folder “Practical_EcoMod” on the desktop, the paths would be:

  • Windows: C:\Users\Peter\Desktop\Practical_EcoMod
  • Mac: /Users/Peter/Desktop/Practical_EcoMod
  • Linux: /home/peter/Desktop/Practical_EcoMod

Therefore, the full path to the file data/other/iberia.shp would be:

  • Windows: C:\Users\Peter\Desktop\Practical_EcoMod\data\other\iberia.shp
  • Mac: /Users/Peter/Desktop/Practical_EcoMod/data/other/iberia.shp
  • Linux: /home/peter/Desktop/Practical_EcoMod/data/other/iberia.shp

These full paths can be avoided by setting the working directory as follows:

  • Windows (note the doubled backslashes, \\):
setwd("C:\\Users\\Peter\\Desktop\\Practical_EcoMod")
  • Mac:
setwd("/Users/Peter/Desktop/Practical_EcoMod")
  • Linux:
setwd("/home/peter/Desktop/Practical_EcoMod")

Once the working directory is set, all files can be referenced relative to it. For example, instead of providing the full path to the file, you can simply use the relative path data/other/iberia.shp and R will locate the file.
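The relationship between the working directory and a relative path can be checked directly in R. This short sketch uses only base functions; the variable names are illustrative, and it assumes the working directory has already been set to the course folder.

```r
# Current working directory (assumed to be the course folder)
wd <- getwd()

# Build the relative path in a platform-independent way
rel_path <- file.path("data", "other", "iberia.shp")

# The full path R resolves the relative path to
full_path <- file.path(wd, rel_path)
print(full_path)

# TRUE only if the file is really there relative to the working directory
print(file.exists(rel_path))
```

If file.exists() returns FALSE, the most likely cause is that the working directory does not point to the course folder.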

Each individual script begins by setting this working directory, but in this document, this step is omitted after being set once here.