| Title: | Spatial Datasets for Ecological Modeling |
| Version: | 1.0.0 |
| Description: | Provides spatial datasets ready to use for ecological modelling and raster companion data for prediction: Neanderthal presence during the Last Interglacial (Benito et al. 2017 <doi:10.1111/jbi.12845>); Plant diversity metrics for the World's Ecoregions (Maestre et al. 2021 <doi:10.1111/nph.17398>); tree richness across the Americas (Benito et al. 2013 <doi:10.1111/2041-210X.12022>); plant communities from the Sierra Nevada (Spain) with future climate scenarios (Benito et al. 2013 <doi:10.1111/2041-210X.12022>); butterfly-plant interaction data from Sierra Nevada (Spain) (Benito et al. 2011 <doi:10.1007/s10584-010-0015-3>); plant species occurrences in Andalusia (Spain) (Benito et al. 2014 <doi:10.1111/ddi.12148>); presence of the plant Linaria nigricans and greenhouses (Benito et al. 2009 <doi:10.1007/s10531-009-9604-8>); global NDVI and environmental predictors, and European oak species occurrences. All datasets include pre-processed environmental predictors ready for statistical modelling. |
| URL: | https://blasbenito.github.io/spatialData/ |
| BugReports: | https://github.com/BlasBenito/spatialData/issues |
| License: | CC BY 4.0 |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.3 |
| Imports: | sf, terra |
| Suggests: | spelling, testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| Language: | en-US |
| Depends: | R (≥ 4.1.0) |
| LazyData: | true |
| LazyDataCompression: | xz |
| NeedsCompilation: | no |
| Packaged: | 2026-04-18 19:19:55 UTC; blas |
| Author: | Blas M. Benito [aut, cre] |
| Maintainer: | Blas M. Benito <blasbenito@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-04-21 19:50:32 UTC |
Presence records of 90 plant species and background points from Andalusia, Spain
Description
sf long format data frame with POINT geometry and CRS ETRS89 / UTM zone 30N (EPSG:25830), containing 37,773 presence records for 90 plant species and 8,692 background points (46,465 rows total) from Andalusia, Spain.
The dataset contains 3 columns (species, presence, geometry). Environmental predictors for each point can be extracted from the companion raster returned by andalusia_extra(). Predictor names are stored in andalusia_predictors.
Usage
data(andalusia)
Format
An sf data frame with 46,465 rows (presences and background points) and 3 columns:
- species: Character string (species name or "background"). Suitable for classification models.
- presence: Binary integer (1 = confirmed species presence, 0 = background point).
- geometry: sfc_POINT column with coordinates in EPSG:25830.
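Because the data are in long format, fitting a model for a single species usually means keeping that species' presences together with the background points. The sketch below shows one way to do this; the helper name species_vs_background and the toy species names are hypothetical and not part of the package.

```r
# Keep the presences of one species plus all background points.
# Row subsetting like this also works on sf data frames.
species_vs_background <- function(x, species_name) {
  keep <- x$species %in% c(species_name, "background")
  x[keep, , drop = FALSE]
}

# Toy data mimicking the non-geometry columns of andalusia
# (species names are invented for illustration).
toy <- data.frame(
  species = c("Species a", "Species a", "Species b", "background", "background"),
  presence = c(1L, 1L, 1L, 0L, 0L)
)

subset_a <- species_vs_background(toy, "Species a")
table(subset_a$presence)
```

With the real data, the same call would take andalusia and one of the values in its species column.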
Source
Published study:
Benito, B.M., Lorite, J., Pérez-Pérez, R., Gómez-Aparicio, L., & Peñas, J. (2014). Forecasting plant range collapse in a mediterranean hotspot: when dispersal uncertainties matter. Diversity and Distributions, 20(1), 72–83. doi:10.1111/ddi.12148
Landsat imagery:
Nunes de Lima, M. V. (Ed.) (2005). IMAGE2000 and CLC2000 - Products and methods. Joint Research Centre, Institute for Environment and Sustainability, and European Environment Agency. Publications Office of the European Union. https://op.europa.eu/en/publication-detail/-/publication/84dd2bad-14d9-4a65-9b92-3b4507d09e44/language-en
Climate variables:
Ninyerola, M., Pons, X. & Roure, J.M. (2005). Atlas Climático Digital de la Península Ibérica: Metodología y aplicaciones en bioclimatología y geobotánica. Universidad Autónoma de Barcelona, Bellaterra.
Topography:
Instituto Geográfico Nacional. Modelo Digital del Terreno (MDT25). https://www.ign.es
See Also
Other andalusia:
andalusia_extra(),
andalusia_predictors,
andalusia_responses
Examples
data(andalusia)
colnames(andalusia)
nrow(andalusia)
ncol(andalusia)
Environmental raster for the dataset andalusia
Description
Downloads and reads the 20-band environmental raster associated with the andalusia dataset from the spatialDataExtra repository. The raster covers Andalusia, Spain, at 400 m resolution (EPSG:25830) and includes remote-sensing, climate, and topographic predictors (see andalusia).
Usage
andalusia_extra()
Value
SpatRaster object with 20 layers.
See Also
Other andalusia:
andalusia,
andalusia_predictors,
andalusia_responses
Predictor names for the dataset andalusia
Description
Character vector of 20 predictor variable names corresponding to the layers of the environmental raster returned by andalusia_extra(), covering Landsat reflectance (7), rainfall (2), solar radiation (2), temperature (4), and topography (5). These are not columns in andalusia; use terra::extract() to attach them to the point data.
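A minimal sketch of the extraction step mentioned above, assuming the point data and the raster share the same CRS. The helper name attach_predictors is hypothetical; only terra::extract(), terra::vect(), and the cbind() method for sf objects are existing functions.

```r
# Extract raster values at point locations and bind them to the points.
# `points` is an sf data frame; `env_raster` is a terra SpatRaster.
attach_predictors <- function(points, env_raster) {
  values <- terra::extract(
    env_raster,
    terra::vect(points), # convert sf points to a terra SpatVector
    ID = FALSE           # drop the ID column added by default
  )
  cbind(points, values)
}

# Intended use (requires spatialData, sf, terra, and an internet connection):
# library(spatialData)
# data(andalusia)
# andalusia_env <- andalusia_extra()
# andalusia_full <- attach_predictors(andalusia, andalusia_env)
```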
Usage
data(andalusia_predictors)
Format
Character vector of length 20.
See Also
Other andalusia:
andalusia,
andalusia_extra(),
andalusia_responses
Examples
data(andalusia_predictors)
andalusia_predictors
Response names for the dataset andalusia
Description
Character vector of length 2 containing the names of the response variables in andalusia: "species" (character, species name or "background" for 90 species plus background points) and "presence" (binary integer, 1 = confirmed species presence, 0 = background point).
Usage
data(andalusia_responses)
Format
Character vector of length 2.
See Also
Other andalusia:
andalusia,
andalusia_extra(),
andalusia_predictors
Examples
data(andalusia_responses)
andalusia_responses
Plant Communities of Sierra Nevada (Spain)
Description
sf data frame with POINT geometry containing 6,747 plant community records from the Sierra Nevada mountain range (SE Spain), with 6 response variables (see communities_responses) and 9 numeric predictors (see communities_predictors).
Use communities_extra_2010(), communities_extra_2050(), and communities_extra_2100() to download the associated environmental rasters for the baseline (2010), 2050, and 2100 climate scenarios.
Usage
data(communities)
Format
An sf data frame with 6,747 rows and 16 columns:
Response variables (6):
- community: Factor column with 6 levels: "none" (no presence of target communities), "Pyrenean oak forests", "Juniper-broom shrublands", "Pinus forests", "Alpine pastures", "Holm oak forests".
- pyrenean_oak: Binary integer presence-absence (1/0) for Pyrenean oak forests.
- juniper_shrubland: Binary integer presence-absence (1/0) for juniper-broom shrublands.
- pinus_forest: Binary integer presence-absence (1/0) for Pinus forests.
- alpine_pastures: Binary integer presence-absence (1/0) for alpine pastures.
- holm_oak: Binary integer presence-absence (1/0) for holm oak forests.
Predictor variables:
- max_temperature_summer: Maximum summer temperature (degrees C).
- max_temperature_winter: Maximum winter temperature (degrees C).
- min_temperature_summer: Minimum summer temperature (degrees C).
- min_temperature_winter: Minimum winter temperature (degrees C).
- rainfall_summer: Summer rainfall (mm).
- rainfall_winter: Winter rainfall (mm).
- northness: Northness index (cosine of aspect, -1 to 1).
- slope: Terrain slope (degrees).
- topographic_wetness_index: Topographic wetness index.
Geometry:
- geometry: Point geometry (ETRS89 / UTM zone 30N, EPSG:25830).
Source
Benito, B., Lorite, J., & Peñas, J. (2011). Simulating potential effects of climatic warming on altitudinal patterns of key species in Mediterranean-alpine ecosystems. Climatic Change, 108, 471–483. doi:10.1007/s10584-010-0015-3
See Also
Other communities:
communities_extra_2010(),
communities_extra_2050(),
communities_extra_2100(),
communities_predictors,
communities_responses
Examples
data(communities)
colnames(communities)
nrow(communities)
ncol(communities)
Download Environmental Raster for communities - 2010
Description
Downloads the baseline (2010) environmental raster associated with the communities dataset from the spatialDataExtra repository. Writes the file communities_2010.tif to the working directory and returns it as a SpatRaster object.
Usage
communities_extra_2010()
Value
SpatRaster object.
See Also
Other communities:
communities,
communities_extra_2050(),
communities_extra_2100(),
communities_predictors,
communities_responses
Download Environmental Raster for communities - 2050
Description
Downloads the future climate (2050) raster associated with the communities dataset from the spatialDataExtra repository. Writes the file communities_2050.tif to the working directory and returns it as a SpatRaster object.
Usage
communities_extra_2050()
Value
SpatRaster object.
See Also
Other communities:
communities,
communities_extra_2010(),
communities_extra_2100(),
communities_predictors,
communities_responses
Download Environmental Raster for communities - 2100
Description
Downloads the future climate (2100) raster associated with the communities dataset from the spatialDataExtra repository. Writes the file communities_2100.tif to the working directory and returns it as a SpatRaster object.
Usage
communities_extra_2100()
Value
SpatRaster object.
See Also
Other communities:
communities,
communities_extra_2010(),
communities_extra_2050(),
communities_predictors,
communities_responses
Predictor variable names for the dataset communities
Description
Character vector of 9 predictor variable names from communities.
Usage
data(communities_predictors)
Format
A character vector of length 9.
See Also
Other communities:
communities,
communities_extra_2010(),
communities_extra_2050(),
communities_extra_2100(),
communities_responses
Examples
data(communities_predictors)
communities_predictors
Response variable names for the dataset communities
Description
Character vector of 6 response variable names from communities.
Usage
data(communities_responses)
Format
A character vector of length 6.
See Also
Other communities:
communities,
communities_extra_2010(),
communities_extra_2050(),
communities_extra_2100(),
communities_predictors
Examples
data(communities_responses)
communities_responses
Butterfly and host plant presence in Sierra Nevada (SE Spain)
Description
sf data frame with co-occurrence records for a butterfly and its host plant in Sierra Nevada (SE Spain). Contains 3 response variables (see interaction_responses) and 10 numeric predictors at 100 m resolution (see interaction_predictors). Use interaction_extra() to download the associated environmental raster.
Usage
data(interaction)
Format
An sf data frame with 1,000 rows (presence and background points) and 14 columns:
Response variables (3):
- butterfly: Integer with three possible values: 1 (presence of Agriades zullichi), 0 (background), and NA (host plant observation site, where the butterfly was not surveyed).
- host_plant: Integer with three possible values: 1 (presence of Androsace vitaliana), 0 (background), and NA (butterfly observation site, where the plant was not surveyed).
- class: Factor with three levels: "butterfly", "host_plant", and "background", indicating the observation type of each record.
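The NA coding above means that each response should be modeled only on the rows where it was actually surveyed. A minimal sketch on toy data with the same three response columns (values invented for illustration):

```r
# Toy data mimicking the response columns of interaction.
toy <- data.frame(
  butterfly = c(1L, NA, 0L, 1L, NA, 0L),
  host_plant = c(NA, 1L, 0L, NA, 1L, 0L),
  class = factor(
    c("butterfly", "host_plant", "background",
      "butterfly", "host_plant", "background"),
    levels = c("butterfly", "host_plant", "background")
  )
)

# Rows usable for a butterfly model: butterfly presences plus background.
butterfly_data <- toy[!is.na(toy$butterfly), ]

# Rows usable for a host plant model: plant presences plus background.
host_plant_data <- toy[!is.na(toy$host_plant), ]

table(butterfly_data$class)
```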
Predictor variables:
- landsat_ndvi: Normalized Difference Vegetation Index derived from Landsat imagery.
- landsat_pca_bands_123: First principal component of Landsat bands 1, 2, and 3 (visible).
- landsat_pca_bands_457: First principal component of Landsat bands 4, 5, and 7 (infrared).
- rainfall_annual: Mean annual rainfall (mm).
- solar_radiation: Mean annual solar radiation (MJ m^-2 day^-1).
- temperature_annual_mean: Mean annual temperature (degrees C).
- temperature_summer_max: Maximum summer temperature (degrees C).
- temperature_winter_min: Minimum winter temperature (degrees C).
- topographic_complexity: Index of terrain ruggedness and heterogeneity.
- topographic_position: Relative elevation of a cell compared to its surroundings.
Geometry:
- geometry: Point geometry (ETRS89 / UTM zone 30N, EPSG:25830).
Source
Species occurrences:
Barea-Azcón, J.M., Benito, B.M., Olivares, F.J., Ruiz, H., Martín, J., García, A.L., & López, R. (2014). Distribution and conservation of the relict interaction between the butterfly Agriades zullichi and its larval foodplant (Androsace vitaliana nevadensis). Biodiversity and Conservation, 23(4), 927–944. doi:10.1007/s10531-014-0643-4
Remote sensing data:
Nunes de Lima, M. V. (Ed.) (2005). IMAGE2000 and CLC2000 – Products and methods. Joint Research Centre, Institute for Environment and Sustainability, and European Environment Agency. Publications Office of the European Union. https://op.europa.eu/en/publication-detail/-/publication/84dd2bad-14d9-4a65-9b92-3b4507d09e44/language-en
Climate and topographic variables:
Benito, B., Lorite, J., & Peñas, J. (2011). Simulating potential effects of climatic warming on altitudinal patterns of key species in Mediterranean-alpine ecosystems. Climatic Change, 108, 471–483. doi:10.1007/s10584-010-0015-3
See Also
Other interaction:
interaction_extra(),
interaction_predictors,
interaction_responses
Examples
data(interaction)
colnames(interaction)
nrow(interaction)
ncol(interaction)
Download Environmental Raster for interaction
Description
Downloads and reads the environmental raster associated with the interaction dataset from the spatialDataExtra repository. Requires the terra package. Writes the file sierra_nevada_env.tif to the working directory and returns a SpatRaster object.
Usage
interaction_extra()
Value
SpatRaster object.
See Also
Other interaction:
interaction,
interaction_predictors,
interaction_responses
Predictor variable names for interaction dataset
Description
Character vector of 10 predictor variable names from interaction.
Usage
data(interaction_predictors)
Format
A character vector of length 10.
See Also
Other interaction:
interaction,
interaction_extra(),
interaction_responses
Examples
data(interaction_predictors)
interaction_predictors
Response variable names for the dataset interaction
Description
Character vector of 3 response variable names from interaction.
Usage
data(interaction_responses)
Format
A character vector of length 3.
See Also
Other interaction:
interaction,
interaction_extra(),
interaction_predictors
Examples
data(interaction_responses)
interaction_responses
Presence of Linaria nigricans and greenhouses in Eastern Andalusia
Description
sf data frame with POINT geometry containing presence records of the plant Linaria nigricans, greenhouses, and background points from Eastern Andalusia (Spain). The data frame contains 2 response variables (see linaria_responses) and 20 numeric predictors (see linaria_predictors). Use linaria_extra() to download the associated environmental raster.
The dataset combines species presence records, greenhouse presence records (representing a competing land use), and randomly sampled background points. Species presences and greenhouse presences were spatially thinned at 400 m to remove redundancy at the raster resolution. Background points were randomly sampled within the extent of the presence records. Environmental predictors were extracted from a Landsat/DEM-derived raster at 400 m resolution (EPSG:25830).
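The source does not state which thinning algorithm was used. As an illustration only, a common grid-based approach keeps the first point found in each 400 m cell; the helper name thin_to_grid and the coordinates below are hypothetical.

```r
# Grid-based thinning: flag the first point in each square cell.
# x and y are projected coordinates in meters (for example, EPSG:25830).
thin_to_grid <- function(x, y, cell_size = 400) {
  cell_id <- paste(floor(x / cell_size), floor(y / cell_size), sep = "_")
  !duplicated(cell_id)
}

# Toy coordinates: points 1-2 share a cell, as do points 4-5.
x <- c(100, 250, 450, 900, 950)
y <- c(100, 300, 100, 100, 150)

keep <- thin_to_grid(x, y)
data.frame(x, y, keep)
```

Rows flagged TRUE in keep would be retained, so no two retained points fall in the same cell.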
Usage
data(linaria)
Format
An sf data frame with 7,386 rows (presences and background points) and 25 columns:
Response variables:
- linaria_nigricans: Binary integer (1 = confirmed Linaria nigricans presence, 0 = greenhouse presence or background point).
- greenhouses: Binary integer (1 = greenhouse presence, 0 = Linaria nigricans presence or background point).
Predictor variables:
- landsat_band_1: Landsat TM Band 1, blue (0.45–0.52 µm), surface reflectance.
- landsat_band_2: Landsat TM Band 2, green (0.52–0.60 µm), surface reflectance.
- landsat_band_3: Landsat TM Band 3, red (0.63–0.69 µm), surface reflectance.
- landsat_band_4: Landsat TM Band 4, near-infrared (0.76–0.90 µm), surface reflectance.
- landsat_band_5: Landsat TM Band 5, short-wave infrared 1 (1.55–1.75 µm), surface reflectance.
- landsat_band_6: Landsat TM Band 6, thermal infrared (10.4–12.5 µm), brightness temperature (K).
- landsat_ndvi: Normalized Difference Vegetation Index derived from Landsat bands 3 and 4.
- rainfall_annual: Total annual rainfall (mm).
- rainfall_summer: Total summer rainfall (mm, June–September).
- solar_radiation_summer: Mean daily solar radiation in summer (kJ m^-2 day^-1).
- solar_radiation_winter: Mean daily solar radiation in winter (kJ m^-2 day^-1).
- temperature_summer_max: Mean maximum temperature in summer (degrees C).
- temperature_summer_min: Mean minimum temperature in summer (degrees C).
- temperature_winter_max: Mean maximum temperature in winter (degrees C).
- temperature_winter_min: Mean minimum temperature in winter (degrees C).
- topography_eastness: Eastward component of aspect (sin of aspect in radians).
- topography_elevation: Elevation above sea level (m).
- topography_northness: Northward component of aspect (cos of aspect in radians).
- topography_position: Topographic position index (local elevation relative to neighbourhood mean).
- topography_slope: Slope gradient (degrees).
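Two of the derived predictors above follow standard formulas: NDVI from the red (band 3) and near-infrared (band 4) reflectances, and eastness/northness from aspect. The sketch below assumes aspect is measured in degrees clockwise from north; the function names and input values are hypothetical and only illustrate the arithmetic, not the original processing chain.

```r
# NDVI from red and near-infrared reflectance.
ndvi <- function(red, nir) {
  (nir - red) / (nir + red)
}

# Eastness and northness from aspect in degrees (0 = north, 90 = east).
aspect_components <- function(aspect_degrees) {
  aspect_radians <- aspect_degrees * pi / 180
  data.frame(
    eastness = sin(aspect_radians),
    northness = cos(aspect_radians)
  )
}

ndvi(red = 0.1, nir = 0.5)
round(aspect_components(c(0, 90, 180, 270)), 3)
```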
Geometry:
- geometry: Point geometry (ETRS89 / UTM zone 30N, EPSG:25830).
Source
Published studies:
Benito, B.M., Martínez-Ortega, M.M., Muñoz, L.M., Lorite, J. & Peñas, J. (2009). Assessing extinction-risk of endangered plants using species distribution models: a case study of habitat depletion caused by the spread of greenhouses. Biodiversity and Conservation, 18(9), 2509–2520. doi:10.1007/s10531-009-9604-8
Peñas, J., Benito, B., Lorite, J., et al. (2011). Habitat fragmentation in arid zones: a case study of Linaria nigricans under land use changes (SE Spain). Environmental Management, 48, 168–176. doi:10.1007/s00267-011-9663-y
Landsat imagery:
Nunes de Lima, M. V. (Ed.) (2005). IMAGE2000 and CLC2000 – Products and methods. Joint Research Centre, Institute for Environment and Sustainability, and European Environment Agency. Publications Office of the European Union. https://op.europa.eu/en/publication-detail/-/publication/84dd2bad-14d9-4a65-9b92-3b4507d09e44/language-en
Climate variables:
Ninyerola, M., Pons, X. & Roure, J.M. (2005). Atlas Climático Digital de la Península Ibérica: Metodología y aplicaciones en bioclimatología y geobotánica. Universidad Autónoma de Barcelona, Bellaterra.
Topography:
Instituto Geográfico Nacional. Modelo Digital del Terreno (MDT25). https://www.ign.es
See Also
Other linaria:
linaria_extra(),
linaria_predictors,
linaria_responses
Examples
data(linaria)
colnames(linaria)
nrow(linaria)
ncol(linaria)
Download Environmental Raster for linaria
Description
Downloads and reads the 20-band environmental raster associated with the linaria dataset from the spatialDataExtra repository. Writes the file linaria_env.tif to the working directory and returns it as a SpatRaster object.
Usage
linaria_extra()
Value
SpatRaster object with 20 layers.
See Also
Other linaria:
linaria,
linaria_predictors,
linaria_responses
Predictor variable names for the dataset linaria
Description
Character vector of 20 predictor variable names from linaria, covering Landsat reflectance (7), rainfall (2), solar radiation (2), temperature (4), and topography (5).
Usage
data(linaria_predictors)
Format
A character vector of length 20.
See Also
Other linaria:
linaria,
linaria_extra(),
linaria_responses
Examples
data(linaria_predictors)
linaria_predictors
Response variable names for the dataset linaria
Description
Character vector of length 2 containing the names of the response variables in linaria.
Usage
data(linaria_responses)
Format
A character vector of length 2.
See Also
Other linaria:
linaria,
linaria_extra(),
linaria_predictors
Examples
data(linaria_responses)
linaria_responses
Neanderthal presence in the Last Interglacial
Description
sf data frame with POINT geometry containing 227 records (Neanderthal presence and pseudo-absence sites) from Marine Isotope Stage 5e (Last Interglacial) in Europe and the Near East, with 1 response variable (see neanderthal_response) and 25 predictors (see neanderthal_predictors). Use neanderthal_extra() to download the associated environmental raster.
Usage
data(neanderthal)
Format
An sf data frame with 227 rows (presence and pseudo-absence sites) and 27 columns:
Response variable (1):
- presence: Binary integer (1 = Neanderthal presence site, 0 = pseudo-absence site).
Predictor variables:
Bioclimatic variables derived from a Last Interglacial GCM simulation (Otto-Bliesner et al. 2006), downscaled following the method of Hijmans et al. (2005). These are analogous to the standard WorldClim bioclimatic variables but represent Last Interglacial (MIS 5e) conditions rather than modern climate:
- bio1: Annual mean temperature (degrees C).
- bio2: Mean diurnal range (degrees C).
- bio3: Isothermality (bio2/bio7 * 100).
- bio4: Temperature seasonality (standard deviation * 100).
- bio5: Max temperature of warmest month (degrees C).
- bio6: Min temperature of coldest month (degrees C).
- bio7: Temperature annual range (degrees C).
- bio8: Mean temperature of wettest quarter (degrees C).
- bio9: Mean temperature of driest quarter (degrees C).
- bio10: Mean temperature of warmest quarter (degrees C).
- bio11: Mean temperature of coldest quarter (degrees C).
- bio12: Annual precipitation (mm).
- bio13: Precipitation of wettest month (mm).
- bio14: Precipitation of driest month (mm).
- bio15: Precipitation seasonality (coefficient of variation).
- bio16: Precipitation of wettest quarter (mm).
- bio17: Precipitation of driest quarter (mm).
- bio18: Precipitation of warmest quarter (mm).
- bio19: Precipitation of coldest quarter (mm).
- topo_aspect: Aspect in degrees.
- topo_diversity_local: Local topographic diversity.
- topo_diversity: Regional topographic diversity.
- topo_elev: Elevation in meters.
- topo_slope: Slope in degrees.
- topo_wetness: Topographic wetness index.
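Several bioclimatic variables are arithmetic combinations of others, which matters when screening predictors for collinearity. A small worked example with invented values, using the isothermality formula listed above and the standard WorldClim definition of annual range (bio7 = bio5 - bio6):

```r
# Hypothetical values for one site (degrees C).
bio2 <- 10 # mean diurnal range
bio5 <- 30 # max temperature of warmest month
bio6 <- 5  # min temperature of coldest month

# Temperature annual range.
bio7 <- bio5 - bio6

# Isothermality, as defined for bio3.
bio3 <- bio2 / bio7 * 100

c(bio7 = bio7, bio3 = bio3)
```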
Geometry:
- geometry: Point geometry (WGS84, EPSG:4326).
Source
Presence data:
Benito, B.M., et al. (2017). The ecological niche and distribution of Neanderthals during the Last Interglacial. Journal of Biogeography, 44, 51-61. doi:10.1111/jbi.12845
Nielsen, T.K., Benito, B.M., Svenning, J.-C., Sandel, B., McKerracher, L., Riede, F., & Kjærgaard, P.C. (2017). Investigating Neanderthal dispersal above 55°N in Europe during the Last Interglacial Complex. Quaternary International, 431, 88-103. doi:10.1016/j.quaint.2015.10.039
Palaeoclimatic variables (GCM simulation):
Otto-Bliesner, B.L., Marshall, S.J., Overpeck, J.T., Miller, G.H. & Hu, A. (2006). Simulating arctic climate warmth and icefield retreat in the last interglaciation. Science, 311, 1751-1753.
Palaeoclimatic variables (interpolation):
Hijmans, R.J., Cameron, S.E., Parra, J.L., Jones, P.G. & Jarvis, A. (2005). Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology, 25, 1965-1978.
Elevation and topography:
Jarvis, A., Guevara, E., Reuter, H. I., & Nelson, A. D. (2008). Hole-filled SRTM for the globe: version 4, data grid. Web publication/site, CGIAR Consortium for Spatial Information. https://srtm.csi.cgiar.org
See Also
Other neanderthal:
neanderthal_extra(),
neanderthal_predictors,
neanderthal_response
Examples
data(neanderthal)
colnames(neanderthal)
nrow(neanderthal)
ncol(neanderthal)
Download Environmental Raster for neanderthal
Description
Downloads and reads the environmental raster associated with the neanderthal dataset from the spatialDataExtra repository. Writes the file neanderthal_env.tif to the working directory and returns it as a SpatRaster object.
Usage
neanderthal_extra()
Value
SpatRaster object.
See Also
Other neanderthal:
neanderthal,
neanderthal_predictors,
neanderthal_response
Predictor variable names for the dataset neanderthal
Description
Character vector of 25 predictor variable names from neanderthal.
Usage
data(neanderthal_predictors)
Format
A character vector of length 25.
See Also
Other neanderthal:
neanderthal,
neanderthal_extra(),
neanderthal_response
Examples
data(neanderthal_predictors)
neanderthal_predictors
Response variable name for the dataset neanderthal
Description
Character string with the name of the response variable in neanderthal.
Usage
data(neanderthal_response)
Format
A character string of length 1.
See Also
Other neanderthal:
neanderthal,
neanderthal_extra(),
neanderthal_predictors
Examples
data(neanderthal_response)
neanderthal_response
Plant diversity metrics for the World's Ecoregions
Description
Plant diversity metrics (richness, rarity, beta diversity) obtained from GBIF Plantae records for the World's Ecoregions. Includes metrics for all plants, trees, and grasses at species, genus, and family taxonomic levels. Ecoregion boundaries are derived from Ecoregions 2017. Original polygon geometries have been converted to point centroids to reduce file size while preserving spatial context. Use plantae_extra() to download a full version with original polygon geometries. The datasets plantae_west and plantae_east are subsets of plantae focused on overall plant richness of the western and eastern hemispheres, respectively.
The GBIF download comprised 244,830,168 records from 4,741 datasets, filtered to records with coordinates, no geospatial issues, and occurrence status "present".
Tree species were identified by cross-referencing GBIF records with the BGCI Global Tree Search database (BGCI 2020). Grasses were defined as members of family Poaceae.
Rarity was computed for each taxon as the inverse of its number of spatial presence records in GBIF. Rarity-weighted richness is the sum of these scores per ecoregion, while mean rarity is their mean across the taxa present in an ecoregion.
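A worked example of the two rarity metrics on invented numbers. The taxon names and record counts are hypothetical; only the arithmetic follows the definitions above.

```r
# Hypothetical GBIF presence record counts per taxon.
records <- c(taxon_a = 2, taxon_b = 10, taxon_c = 100)

# Rarity score of each taxon: inverse of its record count.
rarity <- 1 / records

# Taxa present in one hypothetical ecoregion.
ecoregion_taxa <- c("taxon_a", "taxon_b")

# Sum and mean of the rarity scores of the taxa in the ecoregion.
rarity_weighted_richness <- sum(rarity[ecoregion_taxa])
mean_rarity <- mean(rarity[ecoregion_taxa])

c(
  rarity_weighted_richness = rarity_weighted_richness,
  mean_rarity = mean_rarity
)
```

Here the two taxa score 0.5 and 0.1, giving a rarity-weighted richness of 0.6 and a mean rarity of 0.3.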
Beta diversity was computed between each ecoregion and its immediate neighboring ecoregions via Sorensen dissimilarity (Bsor = 1 - 2a/(2a+b+c)) and Simpson dissimilarity (Bsim = min(b,c)/(min(b,c)+a)), following Koleff et al. (2003).
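The two formulas can be sketched for a pair of taxon lists, where a is the number of shared taxa and b and c are the taxa unique to each list. How the neighboring ecoregions were pooled is not specified here, so the example simply compares two sets; the function name beta_dissimilarity and the taxa are hypothetical.

```r
# Sorensen and Simpson dissimilarity between two sets of taxa.
beta_dissimilarity <- function(taxa_1, taxa_2) {
  shared <- length(intersect(taxa_1, taxa_2)) # a
  only_1 <- length(setdiff(taxa_1, taxa_2))   # b
  only_2 <- length(setdiff(taxa_2, taxa_1))   # c
  c(
    sorensen = 1 - 2 * shared / (2 * shared + only_1 + only_2),
    simpson = min(only_1, only_2) / (min(only_1, only_2) + shared)
  )
}

focal <- c("A", "B", "C", "D")
neighbors <- c("C", "D", "E")

# a = 2, b = 2, c = 1, so Bsor = 3/7 and Bsim = 1/3.
beta_dissimilarity(focal, neighbors)
```

Simpson dissimilarity depends only on the smaller of b and c, so it is less sensitive to richness differences than Sorensen dissimilarity.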
Fragmentation metrics were computed with the R package landscapemetrics (Hesselbarth et al. 2019) at 5 km resolution in Lambert Azimuthal Equal-Area projection.
Climate hypervolume was computed using hypervolume::hypervolume_svm() from the climate predictors.
Aridity is computed as 1 minus the aridity index of Trabucco and Zomer (2019), so maximum aridity is coded as 1.
Environmental predictors were extracted as mean pixel values per ecoregion from rasters at 1 km resolution.
Usage
data(plantae)
Format
An sf data frame with 662 rows (ecoregions) and 143 columns:
Identifier columns:
-
ecoregion_id: Unique ecoregion identifier. -
ecoregion_name: Ecoregion name. -
ecoregion_biome: Biome classification. -
ecoregion_realm: Biogeographic realm. -
ecoregion_continent: Continent name.
Response variables - Richness (9):
-
richness_species: Number of plant species. -
richness_genera: Number of plant genera. -
richness_families: Number of plant families. -
richness_classes: Number of plant classes. -
richness_species_trees: Number of tree species. -
richness_genera_trees: Number of tree genera. -
richness_families_trees: Number of tree families. -
richness_species_grasses: Number of grass species. -
richness_genera_grasses: Number of grass genera.
Response variables - Rarity-weighted richness (6):
-
rarity_weighted_richness_species: Rarity-weighted richness for species (sum of inverse spatial presence record counts per taxon; Williams et al. 1996). -
rarity_weighted_richness_genera: Rarity-weighted richness for genera (sum of inverse spatial presence record counts per taxon). -
rarity_weighted_richness_species_trees: Rarity-weighted richness for tree species (sum of inverse spatial presence record counts per taxon). -
rarity_weighted_richness_genera_trees: Rarity-weighted richness for tree genera (sum of inverse spatial presence record counts per taxon). -
rarity_weighted_richness_species_grasses: Rarity-weighted richness for grass species (sum of inverse spatial presence record counts per taxon). -
rarity_weighted_richness_genera_grasses: Rarity-weighted richness for grass genera (sum of inverse spatial presence record counts per taxon).
Response variables - Mean rarity (6):
-
mean_rarity_species: Mean rarity index for species (mean of inverse spatial presence record counts per taxon). -
mean_rarity_genera: Mean rarity index for genera (mean of inverse spatial presence record counts per taxon). -
mean_rarity_species_trees: Mean rarity index for tree species (mean of inverse spatial presence record counts per taxon). -
mean_rarity_genera_trees: Mean rarity index for tree genera (mean of inverse spatial presence record counts per taxon). -
mean_rarity_species_grasses: Mean rarity index for grass species (mean of inverse spatial presence record counts per taxon). -
mean_rarity_genera_grasses: Mean rarity index for grass genera (mean of inverse spatial presence record counts per taxon).
Response variables - Beta diversity R (absolute richness difference) (16):
-
betadiversity_R_species: Absolute richness difference between ecoregion and neighbors for species. -
betadiversity_R_percent_species: Absolute richness difference as percentage for species. -
betadiversity_R_genera: Absolute richness difference between ecoregion and neighbors for genera. -
betadiversity_R_percent_genera: Absolute richness difference as percentage for genera. -
betadiversity_R_families: Absolute richness difference between ecoregion and neighbors for families. -
betadiversity_R_percent_families: Absolute richness difference as percentage for families. -
betadiversity_R_species_trees: Absolute richness difference between ecoregion and neighbors for tree species. -
betadiversity_R_percent_species_trees: Absolute richness difference as percentage for tree species. -
betadiversity_R_genera_trees: Absolute richness difference between ecoregion and neighbors for tree genera. -
betadiversity_R_percent_genera_trees: Absolute richness difference as percentage for tree genera. -
betadiversity_R_families_trees: Absolute richness difference between ecoregion and neighbors for tree families. -
betadiversity_R_percent_families_trees: Absolute richness difference as percentage for tree families. -
betadiversity_R_species_grasses: Absolute richness difference between ecoregion and neighbors for grass species. -
betadiversity_R_percent_species_grasses: Absolute richness difference as percentage for grass species. -
betadiversity_R_genera_grasses: Absolute richness difference between ecoregion and neighbors for grass genera. -
betadiversity_R_percent_genera_grasses: Absolute richness difference as percentage for grass genera.
Response variables - Beta diversity Sorensen (8):
-
betadiversity_sorensen_species: Sorensen dissimilarity for species (Bsor = 1 - 2a/(2a+b+c); Koleff et al. 2003). -
betadiversity_sorensen_genera: Sorensen dissimilarity for genera (Bsor = 1 - 2a/(2a+b+c)). -
betadiversity_sorensen_families: Sorensen dissimilarity for families (Bsor = 1 - 2a/(2a+b+c)). -
betadiversity_sorensen_species_trees: Sorensen dissimilarity for tree species (Bsor = 1 - 2a/(2a+b+c)). -
betadiversity_sorensen_genera_trees: Sorensen dissimilarity for tree genera (Bsor = 1 - 2a/(2a+b+c)). -
betadiversity_sorensen_families_trees: Sorensen dissimilarity for tree families (Bsor = 1 - 2a/(2a+b+c)). -
betadiversity_sorensen_species_grasses: Sorensen dissimilarity for grass species (Bsor = 1 - 2a/(2a+b+c)). -
betadiversity_sorensen_genera_grasses: Sorensen dissimilarity for grass genera (Bsor = 1 - 2a/(2a+b+c)).
Response variables - Beta diversity Simpson (8):
-
betadiversity_simpson_species: Simpson dissimilarity for species (Bsim = min(b,c)/(min(b,c)+a); Koleff et al. 2003). -
betadiversity_simpson_genera: Simpson dissimilarity for genera (Bsim = min(b,c)/(min(b,c)+a)). -
betadiversity_simpson_families: Simpson dissimilarity for families (Bsim = min(b,c)/(min(b,c)+a)). -
betadiversity_simpson_species_trees: Simpson dissimilarity for tree species (Bsim = min(b,c)/(min(b,c)+a)). -
betadiversity_simpson_genera_trees: Simpson dissimilarity for tree genera (Bsim = min(b,c)/(min(b,c)+a)). -
betadiversity_simpson_families_trees: Simpson dissimilarity for tree families (Bsim = min(b,c)/(min(b,c)+a)). -
betadiversity_simpson_species_grasses: Simpson dissimilarity for grass species (Bsim = min(b,c)/(min(b,c)+a)). -
betadiversity_simpson_genera_grasses: Simpson dissimilarity for grass genera (Bsim = min(b,c)/(min(b,c)+a)).
Predictor variables:
- bias_log_records: Logarithm of the total GBIF records in ecoregion.
- geo_neighbors_count: Number of neighboring ecoregions.
- geo_neighbors_area_km2: Total area of neighboring ecoregions in square kilometers.
- geo_neighbors_aridity_mean: Mean aridity of neighboring ecoregions.
- geo_area_km2: Ecoregion area in square kilometers.
- geo_polygons_count: Number of polygons in multipolygon geometry.
- geo_perimeter_km: Ecoregion perimeter in kilometers.
- geo_shared_perimeter_km: Shared perimeter with neighbors in kilometers.
- geo_shared_perimeter_fraction: Fraction of perimeter shared with neighbors.
- geo_distance_to_ocean: Distance to nearest ocean in kilometers.
- geo_elevation_mean: Mean elevation in meters.
- human_population: Total human population in ecoregion.
- human_population_density: Human population density per square kilometer.
- human_footprint_mean: Mean human footprint index.
- climate_velocity_lgm_mean: Mean climate velocity since Last Glacial Maximum.
- climate_hypervolume: Climate hypervolume (niche space size), computed with hypervolume::hypervolume_svm().
- air_humidity_max: Maximum near-surface relative humidity (%).
- air_humidity_mean: Mean near-surface relative humidity (%).
- air_humidity_min: Minimum near-surface relative humidity (%).
- air_humidity_range: Near-surface relative humidity range (%).
- aridity_mean: Mean aridity (1 minus aridity index; higher values indicate greater aridity).
- cloud_cover_max: Maximum cloud cover (%).
- cloud_cover_mean: Mean cloud cover (%).
- cloud_cover_min: Minimum cloud cover (%).
- cloud_cover_range: Cloud cover range (%).
- evapotranspiration_max: Maximum potential evapotranspiration (kg m-2 month-1; Penman-Monteith).
- evapotranspiration_mean: Mean potential evapotranspiration (kg m-2 month-1; Penman-Monteith).
- evapotranspiration_min: Minimum potential evapotranspiration (kg m-2 month-1; Penman-Monteith).
- evapotranspiration_range: Potential evapotranspiration range (kg m-2 month-1; Penman-Monteith).
- precipitation_seasonality: Precipitation seasonality (coefficient of variation of monthly estimates; CHELSA bio15).
- precipitation_total: Total annual precipitation (kg m-2 year-1; CHELSA bio12).
- precipitation_coldest_quarter: Precipitation of coldest quarter (kg m-2; CHELSA bio19).
- precipitation_driest_month: Precipitation of driest month (kg m-2; CHELSA bio14).
- precipitation_driest_quarter: Precipitation of driest quarter (kg m-2; CHELSA bio17).
- precipitation_warmest_quarter: Precipitation of warmest quarter (kg m-2; CHELSA bio18).
- precipitation_wettest_month: Precipitation of wettest month (kg m-2; CHELSA bio13).
- precipitation_wettest_quarter: Precipitation of wettest quarter (kg m-2; CHELSA bio16).
- temperature_isothermality: Isothermality: ratio of diurnal to annual temperature variation (degrees C; CHELSA bio3).
- temperature_mean_daily_range: Mean diurnal temperature range (degrees C; CHELSA bio2).
- temperature_mean: Mean annual temperature (degrees C; CHELSA bio1).
- temperature_range: Annual temperature range (degrees C; CHELSA bio7).
- temperature_seasonality: Temperature seasonality as standard deviation of monthly means (degrees C; CHELSA bio4).
- temperature_coldest_month: Minimum temperature of coldest month (degrees C; CHELSA bio6).
- temperature_coldest_quarter: Mean temperature of coldest quarter (degrees C; CHELSA bio11).
- temperature_driest_quarter: Mean temperature of driest quarter (degrees C; CHELSA bio9).
- temperature_warmest_month: Maximum temperature of warmest month (degrees C; CHELSA bio5).
- temperature_warmest_quarter: Mean temperature of warmest quarter (degrees C; CHELSA bio10).
- temperature_wettest_quarter: Mean temperature of wettest quarter (degrees C; CHELSA bio8).
- landcover_bare_percent_mean: Mean percentage of bare ground.
- landcover_herbs_percent_mean: Mean percentage of herbaceous vegetation.
- landcover_trees_percent_mean: Mean percentage of tree cover.
- fragmentation_ai: Aggregation index.
- fragmentation_area_mn: Mean patch area.
- fragmentation_ca: Total class area.
- fragmentation_clumpy: Clumpiness index.
- fragmentation_cohesion: Patch cohesion index.
- fragmentation_contig_mn: Mean contiguity index.
- fragmentation_core_mn: Mean core area.
- fragmentation_cpland: Core area percentage of landscape.
- fragmentation_dcore_mn: Mean disjunct core area.
- fragmentation_division: Landscape division index.
- fragmentation_ed: Edge density.
- fragmentation_lsi: Landscape shape index.
- fragmentation_mesh: Effective mesh size.
- fragmentation_ndca: Number of disjunct core areas.
- fragmentation_nlsi: Normalized landscape shape index.
- fragmentation_np: Number of patches.
- fragmentation_shape_mn: Mean shape index.
- fragmentation_tca: Total core area.
- fragmentation_te: Total edge.
- soil_clay: Soil clay content (%).
- soil_nitrogen: Soil nitrogen content (%).
- soil_organic_carbon: Soil organic carbon content (%).
- soil_ph: Soil pH.
- soil_sand: Soil sand content (%).
- soil_silt: Soil silt content (%).
- soil_temperature_max: Maximum soil temperature (degrees C).
- soil_temperature_mean: Mean soil temperature (degrees C).
- soil_temperature_min: Minimum soil temperature (degrees C).
- soil_temperature_range: Soil temperature range (degrees C).
- ndvi_max: Maximum NDVI (1999-2019).
- ndvi_mean: Mean NDVI (1999-2019).
- ndvi_min: Minimum NDVI (1999-2019).
- ndvi_range: NDVI range (1999-2019).
Geometry:
- geometry: Ecoregion centroids, POINT geometry (WGS84, EPSG:4326).
Source
Associated publications:
Maestre, F.T., Benito, B.M., Berdugo, M., Concostrina-Zubiri, L., Delgado-Baquerizo, M., Eldridge, D.J., Guirado, E., Gross, N., Kefi, S., Le Bagousse-Pinguet, Y., et al. (2021). Biogeography of global drylands. New Phytologist, 231(2), 540–558. doi:10.1111/nph.17398
GBIF Plantae Dataset (September 15, 2020). doi:10.15468/dl.xh5y5g
Dinerstein, E., et al. (2017). An Ecoregion-Based Approach to Protecting Half the Terrestrial Realm. BioScience, 67(6), 534-545. doi:10.1093/biosci/bix014
Karger, D.N., et al. (2021). Climatologies at high resolution for the earth's land surface areas. EnviDat. doi:10.16904/envidat.228.v2.1
Hengl, T., et al. (2017). SoilGrids250m: Global gridded soil information based on machine learning. PLOS ONE, 12(2), e0169748. doi:10.1371/journal.pone.0169748
Lembrechts, J.J., et al. (2021). Mismatches between soil and air temperature. Global Change Biology. doi:10.1111/gcb.16060
Copernicus Global Land Service: NDVI Long Term Statistics v3 (1999-2019). https://land.copernicus.eu/en/products/vegetation
Buchhorn, M., et al. (2020). Copernicus Global Land Service: Land Cover 100m: collection 3: epoch 2019: Globe. Zenodo. doi:10.5281/zenodo.3939050
CGIAR-CSI SRTM 90m Digital Elevation Database. https://srtm.csi.cgiar.org/
Trabucco, A. & Zomer, R.J. (2019). Global Aridity Index and Potential Evapotranspiration Climate Database v2. CGIAR-CSI. doi:10.6084/m9.figshare.7504448.v3
BGCI (2020). GlobalTreeSearch online database. Botanic Gardens Conservation International. https://tools.bgci.org/global_tree_search.php
Hesselbarth, M.H.K., et al. (2019). landscapemetrics: an open-source R tool to calculate landscape metrics. Ecography, 42(10), 1648-1657. doi:10.1111/ecog.04617
Koleff, P., Gaston, K.J. & Lennon, J.J. (2003). Measuring beta diversity for presence-absence data. Journal of Animal Ecology, 72(3), 367-382. doi:10.1046/j.1365-2656.2003.00710.x
Williams, P.H., et al. (1996). A comparison of richness hotspots, rarity hotspots, and complementary areas for conserving diversity of British birds. Conservation Biology, 10(1), 155-174.
Venter, O., et al. (2016). Global terrestrial Human Footprint maps for 1993 and 2009. Scientific Data, 3, 160067. doi:10.1038/sdata.2016.67
See Also
Other plantae:
plantae_east,
plantae_extra(),
plantae_predictors,
plantae_responses,
plantae_west
Examples
data(plantae)
colnames(plantae)
nrow(plantae)
ncol(plantae)
Eastern Hemisphere subset of plantae
Description
Subset of the plantae dataset filtered to non-American ecoregions (ecoregion_continent != "Americas") with richness_species (overall plant species richness) as the only response variable. All 84 predictor variables and identifier columns in plantae are retained.
Usage
data(plantae_east)
Format
An sf data frame with 434 rows and 91 columns.
See Also
Other plantae:
plantae,
plantae_extra(),
plantae_predictors,
plantae_responses,
plantae_west
Examples
data(plantae_east)
colnames(plantae_east)
nrow(plantae_east)
ncol(plantae_east)
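The hemisphere filter described above can be illustrated on a toy table. This is a sketch only: the data frame below is a hypothetical stand-in with three columns and made-up values, whereas the real plantae object is an sf data frame with many more columns. Only the column names and the "Americas" value come from this documentation; the other continent names are assumptions.

```r
# Toy stand-in for plantae (hypothetical values)
toy <- data.frame(
  ecoregion_continent = c("Americas", "Europe", "Asia", "Americas", "Africa"),
  richness_species = c(1200, 800, 950, 2100, 600),
  temperature_mean = c(24.1, 9.8, 15.2, 26.0, 22.3)
)

# Same logical conditions as in the descriptions of plantae_east and plantae_west
toy_east <- toy[toy$ecoregion_continent != "Americas", ]
toy_west <- toy[toy$ecoregion_continent == "Americas", ]

# The two subsets partition the rows of the full table
nrow(toy_east) + nrow(toy_west) == nrow(toy)
```

The same partition holds for the row counts reported in this manual: 434 rows in plantae_east plus 228 rows in plantae_west give 662 rows.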
Download Extended plantae Dataset
Description
Downloads and reads the extended version of the plantae dataset, with the original polygon geometries instead of point centroids, from the spatialDataExtra repository. Writes the file plantae.gpkg to the working directory and returns it as an sf data frame.
See plantae for details on the response variables, predictors, and data sources.
Usage
plantae_extra()
Value
sf data frame with 662 rows and 143 columns (MULTIPOLYGON geometry, WGS84).
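Because the function writes a file to the working directory, it may be convenient to call it from a temporary directory. The helper below is a hypothetical sketch, not part of the package: run_in_tempdir() and demo_writer() are illustrative names, and the commented call to plantae_extra() assumes the package is installed and a network connection is available.

```r
# Hypothetical helper: run a function with tempdir() as the working
# directory, then restore the previous working directory on exit.
run_in_tempdir <- function(fun) {
  old_wd <- setwd(tempdir())
  on.exit(setwd(old_wd), add = TRUE)
  fun()
}

# Stand-in for a function that writes a file to the working directory
demo_writer <- function() {
  writeLines("demo", "demo.txt")
  file.exists("demo.txt")
}

wd_before <- getwd()
ok <- run_in_tempdir(demo_writer)

# Possible usage with the package (not run here):
# plantae_polygons <- run_in_tempdir(spatialData::plantae_extra)
```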
See Also
Other plantae:
plantae,
plantae_east,
plantae_predictors,
plantae_responses,
plantae_west
Predictor variable names for the dataset plantae
Description
Character vector of 84 predictor variable names from plantae.
Usage
data(plantae_predictors)
Format
A character vector of length 84.
See Also
Other plantae:
plantae,
plantae_east,
plantae_extra(),
plantae_responses,
plantae_west
Examples
data(plantae_predictors)
plantae_predictors
Response variable names for the dataset plantae
Description
Character vector containing the names of the 53 response variables in plantae.
Usage
data(plantae_responses)
Format
A character vector of length 53.
See Also
Other plantae:
plantae,
plantae_east,
plantae_extra(),
plantae_predictors,
plantae_west
Examples
data(plantae_responses)
plantae_responses
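Name vectors such as plantae_responses and plantae_predictors are convenient for building model formulas programmatically. The sketch below uses stats::reformulate() on short hypothetical subsets of both vectors; the variable names are taken from this manual, but the selection is arbitrary and no model is fitted.

```r
# Hypothetical subsets of plantae_responses and plantae_predictors
responses <- c("richness_species", "betadiversity_sorensen_species")
predictors <- c("temperature_mean", "precipitation_total", "soil_ph")

# One formula per response, all sharing the same predictors
formulas <- lapply(
  responses,
  function(y) stats::reformulate(termlabels = predictors, response = y)
)
names(formulas) <- responses

formulas[["richness_species"]]
```

With the package loaded, the full vectors could replace the two subsets, although fitting all 84 predictors at once is likely to require variable selection first.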
Western Hemisphere subset of plantae
Description
Subset of the plantae dataset filtered to American ecoregions (ecoregion_continent == "Americas") with richness_species (overall plant species richness) as the only response variable. All 84 predictor variables and identifier columns in plantae are retained.
Usage
data(plantae_west)
Format
An sf data frame with 228 rows and 91 columns.
See Also
Other plantae:
plantae,
plantae_east,
plantae_extra(),
plantae_predictors,
plantae_responses
Examples
data(plantae_west)
colnames(plantae_west)
nrow(plantae_west)
ncol(plantae_west)
European Quercus (Oak) Species Distribution with Environmental Predictors
Description
sf data frame with POINT geometry containing 6,728 records of eight European Quercus (oak) species and absence points, 1 response variable (see quercus_response), and 31 numeric predictors (see quercus_predictors). Use quercus_extra() to download the associated environmental raster.
Usage
data(quercus)
Format
An sf data frame with 6728 rows (species occurrences and absences) and 33 columns:
Response variable:
- species: Character column with 9 levels: "absence" (background absence points), "Quercus robur" (English oak), "Quercus petraea" (Sessile oak), "Quercus ilex" (Holm oak), "Quercus cerris" (Turkey oak), "Quercus faginea" (Portuguese oak), "Quercus pubescens" (Downy oak), "Quercus pyrenaica" (Pyrenean oak), "Quercus suber" (Cork oak).
Predictor variables:
WorldClim v2 bioclimatic variables (excludes bio8 and bio9):
- bio1: Annual mean temperature (degrees C).
- bio2: Mean diurnal range (degrees C).
- bio3: Isothermality (bio2/bio7 * 100).
- bio4: Temperature seasonality (standard deviation * 100).
- bio5: Max temperature of warmest month (degrees C).
- bio6: Min temperature of coldest month (degrees C).
- bio7: Temperature annual range (degrees C).
- bio10: Mean temperature of warmest quarter (degrees C).
- bio11: Mean temperature of coldest quarter (degrees C).
- bio12: Annual precipitation (mm).
- bio13: Precipitation of wettest month (mm).
- bio14: Precipitation of driest month (mm).
- bio15: Precipitation seasonality (coefficient of variation).
- bio16: Precipitation of wettest quarter (mm).
- bio17: Precipitation of driest quarter (mm).
- bio18: Precipitation of warmest quarter (mm).
- bio19: Precipitation of coldest quarter (mm).
- ndvi_average: Average NDVI.
- ndvi_maximum: Maximum NDVI.
- ndvi_minimum: Minimum NDVI.
- ndvi_range: NDVI range.
- sun_rad_average: Average solar radiation (kJ m-2 day-1).
- sun_rad_maximum: Maximum solar radiation (kJ m-2 day-1).
- sun_rad_minimum: Minimum solar radiation (kJ m-2 day-1).
- sun_rad_range: Solar radiation range (kJ m-2 day-1).
- landcover_veg_bare: Percentage of bare ground.
- landcover_veg_herb: Percentage of herbaceous vegetation.
- landcover_veg_tree: Percentage of tree cover.
- topographic_diversity: Number of unique combinations of elevation, slope, and aspect classes within a neighborhood.
- topo_slope: Topographic slope in degrees.
- human_footprint: Human footprint index.
Geometry:
- geometry: Point geometry (WGS84, EPSG:4326).
Source
Species occurrences:
GBIF.org. Global Biodiversity Information Facility. https://www.gbif.org/
Bioclimatic variables and solar radiation:
Fick, S.E. & Hijmans, R.J. (2017). WorldClim 2: new 1-km spatial resolution climate surfaces for global land areas. International Journal of Climatology, 37(12), 4302-4315. doi:10.1002/joc.5086
NDVI:
Didan, K. (2015). MOD13A2 MODIS/Terra Vegetation Indices 16-Day L3 Global 1km SIN Grid V006. NASA EOSDIS LP DAAC. doi:10.5067/MODIS/MOD13A2.006
Land cover:
DiMiceli, C., et al. (2015). MOD44B MODIS/Terra Vegetation Continuous Fields Yearly L3 Global 250m SIN Grid V006. NASA EOSDIS LP DAAC. doi:10.5067/MODIS/MOD44B.006
Elevation and topography:
Jarvis, A., Guevara, E., Reuter, H. I., & Nelson, A. D. (2008). Hole-filled SRTM for the globe: version 4, data grid. Web publication/site, CGIAR Consortium for Spatial Information. https://srtm.csi.cgiar.org
Human footprint:
Venter, O., et al. (2016). Global terrestrial Human Footprint maps for 1993 and 2009. Scientific Data, 3, 160067. doi:10.1038/sdata.2016.67
See Also
Other quercus:
quercus_extra(),
quercus_predictors,
quercus_response
Examples
data(quercus)
colnames(quercus)
nrow(quercus)
ncol(quercus)
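Since the species column mixes eight species with "absence" points, a common first step is to derive a binary presence/absence response for a single species. The sketch below shows one way to do this on a toy data frame; the rows and bio1 values are hypothetical, and only the column names and species labels come from this manual.

```r
# Toy stand-in for quercus (hypothetical rows and values)
toy <- data.frame(
  species = c("absence", "Quercus ilex", "Quercus robur", "absence", "Quercus ilex"),
  bio1 = c(8.2, 15.1, 9.7, 11.3, 16.4)
)

target <- "Quercus ilex"

# Keep the target species and the absence points, drop other species
toy_target <- toy[toy$species %in% c(target, "absence"), ]

# 1 for records of the target species, 0 for absence points
toy_target$presence <- as.integer(toy_target$species == target)

toy_target
```

The resulting presence column could serve as the response of a binomial model, with the names in quercus_predictors as candidate predictors.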
Download Environmental Raster for quercus
Description
Downloads and reads the environmental raster associated with the quercus dataset from the spatialDataExtra repository. Writes the file quercus_env.tif to the working directory and returns it as a SpatRaster object.
Usage
quercus_extra()
Value
SpatRaster object.
See Also
Other quercus:
quercus,
quercus_predictors,
quercus_response
Predictor variable names for the dataset quercus
Description
Character vector of 31 predictor variable names from quercus.
Usage
data(quercus_predictors)
Format
A character vector of length 31.
See Also
Other quercus:
quercus,
quercus_extra(),
quercus_response
Examples
data(quercus_predictors)
quercus_predictors
Response variable name for the dataset quercus
Description
Character string with the name of the response variable in quercus.
Usage
data(quercus_response)
Format
A character string of length 1.
See Also
Other quercus:
quercus,
quercus_extra(),
quercus_predictors
Examples
data(quercus_response)
quercus_response
Mesoamerican tree species richness
Description
sf data frame with POLYGON geometry representing 3,373 hexagonal grid cells across the Americas, with 1 response variable encoding tree species richness and 50 numeric environmental predictors.
Tree species richness in this dataset does NOT represent total tree species counts! The dataset focuses on the tree species found in Mesoamerica according to the Tree Biodiversity Network (BIOTREE-NET; Cayuela et al. 2012). These tree species were later used as input for a search query at the Global Biodiversity Information Facility (GBIF). The resulting presence data and environmental data at 1 km resolution were aggregated to a hexagonal grid.
The hexagonal grid was constructed using sf::st_make_grid(..., cellsize = 1, square = FALSE) at 1-degree resolution (WGS84, EPSG:4326), covering longitudes -125.3° to -34.3° and latitudes -34.4° to 49.9°.
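The grid construction described above can be sketched with sf as follows. This is a minimal illustration assuming the bounding box and cell size given in the text. It builds hexagons over the whole extent, ocean included, so it will not reproduce the 3,373 cells of the dataset, and any filtering applied by the author is not shown here. The object names are hypothetical.

```r
library(sf)

# Bounding box from the description (WGS84, EPSG:4326)
extent <- sf::st_bbox(
  c(xmin = -125.3, ymin = -34.4, xmax = -34.3, ymax = 49.9),
  crs = sf::st_crs(4326)
)

# Hexagonal grid with 1-degree cell size; square = FALSE requests hexagons
hex_grid <- sf::st_make_grid(
  sf::st_as_sfc(extent),
  cellsize = 1,
  square = FALSE
)

# Attach an integer identifier, similar in spirit to the cellid column
hex_sf <- sf::st_sf(
  cellid = seq_along(hex_grid),
  geometry = hex_grid
)
```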
Usage
data(trees)
Format
An sf data frame with 3373 rows (hexagonal cells) and 53 columns:
Identifier (1):
- cellid: Integer row number identifying each hexagonal cell.
Response variable (1):
- trees: Integer count of tree species richness per hexagonal cell.
Predictor variables:
- air_humidity_max: Maximum monthly near-surface relative humidity (%).
- air_humidity: Mean annual near-surface relative humidity (%).
- air_humidity_min: Minimum monthly near-surface relative humidity (%).
- air_humidity_range: Annual near-surface relative humidity range (%).
- aridity: Mean aridity index (unitless ratio; higher values indicate wetter conditions).
- cloud_cover_max: Maximum monthly total cloud cover (%).
- cloud_cover: Mean annual total cloud cover (%).
- cloud_cover_min: Minimum monthly total cloud cover (%).
- cloud_cover_range: Annual total cloud cover range (%).
- evapotranspiration_max: Maximum monthly potential evapotranspiration (kg m-2 month-1; Penman-Monteith).
- evapotranspiration: Mean annual potential evapotranspiration (kg m-2 month-1; Penman-Monteith).
- evapotranspiration_min: Minimum monthly potential evapotranspiration (kg m-2 month-1; Penman-Monteith).
- evapotranspiration_range: Annual potential evapotranspiration range (kg m-2 month-1; Penman-Monteith).
- rainfall_seasonality: Precipitation seasonality as coefficient of variation of monthly totals (CHELSA bio15).
- rainfall: Total annual precipitation (kg m-2; CHELSA bio12).
- rainfall_coldest_quarter: Precipitation of coldest quarter (kg m-2; CHELSA bio19).
- rainfall_driest_month: Precipitation of driest month (kg m-2; CHELSA bio14).
- rainfall_driest_quarter: Precipitation of driest quarter (kg m-2; CHELSA bio17).
- rainfall_warmest_quarter: Precipitation of warmest quarter (kg m-2; CHELSA bio18).
- rainfall_wettest_month: Precipitation of wettest month (kg m-2; CHELSA bio13).
- rainfall_wettest_quarter: Precipitation of wettest quarter (kg m-2; CHELSA bio16).
- temperature_isothermality: Isothermality as ratio of mean daily range to annual range (unitless; CHELSA bio3).
- temperature_mean_daily_range: Mean of monthly temperature ranges (degrees C; CHELSA bio2).
- temperature: Mean annual air temperature (degrees C; CHELSA bio1).
- temperature_range: Annual air temperature range (degrees C; CHELSA bio7).
- temperature_seasonality: Temperature seasonality as standard deviation of monthly means (degrees C; CHELSA bio4).
- temperature_coldest_month_min: Minimum temperature of coldest month (degrees C; CHELSA bio6).
- temperature_coldest_quarter: Mean temperature of coldest quarter (degrees C; CHELSA bio11).
- temperature_driest_quarter: Mean temperature of driest quarter (degrees C; CHELSA bio9).
- temperature_warmest_month_max: Maximum temperature of warmest month (degrees C; CHELSA bio5).
- temperature_warmest_quarter: Mean temperature of warmest quarter (degrees C; CHELSA bio10).
- temperature_wettest_quarter: Mean temperature of wettest quarter (degrees C; CHELSA bio8).
- distance_to_ocean: Distance to nearest ocean coastline (km).
- elevation: Elevation above sea level (m).
- latitude: Latitude of cell centroid (degrees).
- longitude: Longitude of cell centroid (degrees).
- soil_clay: Soil clay content (%).
- soil_nitrogen: Soil nitrogen content (g kg-1).
- soil_organic_carbon: Soil organic carbon content (g kg-1).
- soil_ph: Soil pH in water.
- soil_sand: Soil sand content (%).
- soil_silt: Soil silt content (%).
- soil_temperature_max: Maximum annual land surface temperature (degrees C).
- soil_temperature: Mean annual land surface temperature (degrees C).
- soil_temperature_min: Minimum annual land surface temperature (degrees C).
- soil_temperature_range: Annual land surface temperature range (degrees C).
- ndvi_max: Maximum annual NDVI (unitless, 0-1).
- ndvi: Mean annual NDVI (unitless, 0-1).
- ndvi_min: Minimum annual NDVI (unitless, 0-1).
- ndvi_range: Annual NDVI range (unitless, 0-1).
Geometry:
- geometry: Hexagonal polygon geometry (WGS84, EPSG:4326).
Source
Dataset publication:
Benito, B.M., Cayuela, L., & Albuquerque, F.S. (2013). The impact of modelling choices in the predictive performance of richness maps derived from species-distribution models: Guidelines to build better diversity models. Methods in Ecology and Evolution, 4(4), 327–335. doi:10.1111/2041-210X.12022
Response variable (tree species richness):
Cayuela, L., Gálvez-Bravo, L., Pérez Pérez, R., de Albuquerque, F.S., Golicher, D.J., Zahawi, R.A., et al. (2012). The Tree Biodiversity Network (BIOTREE-NET): prospects for biodiversity research and conservation in the Neotropics. Biodiversity & Ecology, 4, 211–224. doi:10.7809/b-e.00078
GBIF: Global Biodiversity Information Facility. https://www.gbif.org
Climate predictors (temperature, precipitation, air humidity, cloud cover, evapotranspiration):
Brun, P., Zimmermann, N.E., Hari, C., Pellissier, L., & Karger, D.N. (2022). CHELSA-BIOCLIM+ A novel set of global climate-related predictors at kilometre-resolution. EnviDat. doi:10.16904/envidat.332
Aridity:
Zomer, R.J., Xu, J., & Trabucco, A. (2022). Version 3 of the Global Aridity Index and Potential Evapotranspiration Database. Scientific Data, 9, 409. doi:10.1038/s41597-022-01493-1
Soil properties:
Hengl, T., et al. (2017). SoilGrids250m: Global gridded soil information based on machine learning. PLOS ONE, 12(2), e0169748. doi:10.1371/journal.pone.0169748
Soil temperature:
Wan, Z., Hook, S., & Hulley, G. (2015). MOD11A2 MODIS/Terra Land Surface Temperature/Emissivity 8-Day L3 Global 1km SIN Grid V006. NASA EOSDIS LP DAAC. doi:10.5067/MODIS/MOD11A2.006
NDVI:
Copernicus Land Monitoring Service. (2019). Normalised Difference Vegetation Index Statistics (Long Term 1999-2019), raster 1 km, global, version 3. European Commission, Joint Research Centre. doi:10.2909/290e81fb-4c84-42ad-ae12-f663312b0eda
Elevation and geography:
Jarvis, A., Guevara, E., Reuter, H. I., & Nelson, A. D. (2008). Hole-filled SRTM for the globe: version 4, data grid. Web publication/site, CGIAR Consortium for Spatial Information. https://srtm.csi.cgiar.org
See Also
Other trees:
trees_extra(),
trees_predictors,
trees_response
Examples
data(trees)
colnames(trees)
nrow(trees)
ncol(trees)
Download Presence Records for trees
Description
Downloads and reads an sf data frame with the tree species presence records associated with the trees dataset from the spatialDataExtra repository. Writes the file trees_presence.gpkg to the working directory and returns it as an sf data frame.
Usage
trees_extra()
Value
sf data frame with POINT geometry (WGS84, EPSG:4326) and columns species and source.
See Also
Other trees:
trees,
trees_predictors,
trees_response
Predictor variable names for the dataset trees
Description
Character vector of 50 predictor variable names from trees.
Usage
data(trees_predictors)
Format
A character vector of length 50.
See Also
Other trees:
trees,
trees_extra(),
trees_response
Examples
data(trees_predictors)
trees_predictors
Response variable name for the dataset trees
Description
Character vector of length 1 containing the name of the response variable in trees.
Usage
data(trees_response)
Format
A character vector of length 1.
See Also
Other trees:
trees,
trees_extra(),
trees_predictors
Examples
data(trees_response)
trees_response
Global long-term NDVI records and environmental predictors
Description
sf data frame with POINT geometry representing 9,265 global locations. It contains one response variable, the long-term average (1999-2019) of the Normalized Difference Vegetation Index (NDVI), in five different encodings, and 58 environmental predictors (47 numeric, 11 categorical). Use vi_extra() to download an extended version with 30,000 rows. A smaller version of this dataset (580 rows) is available as vi_smol.
NDVI values are derived from the Copernicus Global Land Service Long Term Statistics product (1999-2019) at 1 km resolution. Locations were spatially thinned to reduce spatial autocorrelation.
Environmental predictors were extracted as pixel values from normalized raster data at 1 km resolution.
Usage
data(vi)
Format
An sf data frame with 9265 rows (locations) and 64 columns:
Response variables (5):
- vi_numeric: Continuous NDVI value (0-1).
- vi_counts: Integer count encoding of NDVI (vi_numeric * 1000).
- vi_binomial: Binary encoding of NDVI (1 if vi_numeric > 0.5, else 0).
- vi_categorical: Categorical encoding of NDVI ("very_low", "low", "medium", "high", "very_high").
- vi_factor: Factor encoding of NDVI (vi_categorical as factor).
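The relationship between the five encodings can be sketched from a toy vector of NDVI values. The count and binary rules follow the definitions above. The class breaks used for the categorical encoding are NOT documented here, so the equal-width breaks below are a labeled assumption, as are the rounding step and the factor level order.

```r
# Hypothetical NDVI values in the 0-1 range
vi_numeric <- c(0.12, 0.48, 0.50, 0.73, 0.91)

# Integer count encoding: vi_numeric * 1000 (rounding is an assumption)
vi_counts <- as.integer(round(vi_numeric * 1000))

# Binary encoding: 1 if vi_numeric > 0.5, else 0
vi_binomial <- ifelse(vi_numeric > 0.5, 1L, 0L)

# Categorical encoding: equal-width breaks are an ASSUMPTION,
# the breaks used in the dataset may differ
class_labels <- c("very_low", "low", "medium", "high", "very_high")
vi_categorical <- as.character(cut(
  vi_numeric,
  breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1),
  labels = class_labels,
  include.lowest = TRUE
))

# Factor encoding of the categorical variable (level order assumed)
vi_factor <- factor(vi_categorical, levels = class_labels)
```

Each encoding suits a different model family, for example Gaussian or beta regression for vi_numeric, Poisson for vi_counts, and binomial for vi_binomial.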
Predictor variables:
- koppen_zone: Koppen climate zone code (Beck et al. 2018).
- koppen_group: Koppen climate group name.
- koppen_description: Koppen climate description.
- soil_type: Soil classification type.
- topo_slope: Topographic slope in degrees.
- topo_diversity: Number of combinations of different elevations, slopes, and aspects in a 5 km radius around each 1 km cell.
- topo_elevation: Elevation in meters.
- swi_mean: Mean annual soil water index (unitless, 0-100 cm depth).
- swi_max: Maximum annual soil water index (unitless, 0-100 cm depth).
- swi_min: Minimum annual soil water index (unitless, 0-100 cm depth).
- swi_range: Annual soil water index range (unitless, 0-100 cm depth).
- soil_temperature_mean: Mean annual land surface temperature (degrees C).
- soil_temperature_max: Maximum annual land surface temperature (degrees C).
- soil_temperature_min: Minimum annual land surface temperature (degrees C).
- soil_temperature_range: Annual land surface temperature range (degrees C).
- soil_sand: Soil sand content (%).
- soil_clay: Soil clay content (%).
- soil_silt: Soil silt content (%).
- soil_ph: Soil pH.
- soil_soc: Soil organic carbon content (%).
- soil_nitrogen: Soil nitrogen content (%).
- solar_rad_mean: Mean annual solar radiation (kJ m-2).
- solar_rad_max: Maximum annual solar radiation (kJ m-2).
- solar_rad_min: Minimum annual solar radiation (kJ m-2).
- solar_rad_range: Annual solar radiation range (kJ m-2).
- growing_season_length: Length of the growing season (days).
- growing_season_temperature: Mean temperature of the growing season (degrees C).
- growing_season_rainfall: Accumulated precipitation of the growing season (kg m-2).
- growing_degree_days: Growing degree days above 0 degrees C accumulated over one year (degree-days).
- temperature_mean: Mean annual air temperature (degrees C; CHELSA bio1).
- temperature_max: Maximum temperature of warmest month (degrees C; CHELSA bio5).
- temperature_min: Minimum temperature of coldest month (degrees C; CHELSA bio6).
- temperature_range: Annual air temperature range (degrees C; CHELSA bio7).
- temperature_seasonality: Temperature seasonality as standard deviation of monthly means (degrees C; CHELSA bio4).
- rainfall_mean: Mean annual rainfall (kg m-2).
- rainfall_min: Minimum monthly rainfall (kg m-2).
- rainfall_max: Maximum monthly rainfall (kg m-2).
- rainfall_range: Annual rainfall range (kg m-2).
- evapotranspiration_mean: Mean annual potential evapotranspiration (kg m-2 month-1; Penman-Monteith).
- evapotranspiration_max: Maximum monthly potential evapotranspiration (kg m-2 month-1; Penman-Monteith).
- evapotranspiration_min: Minimum monthly potential evapotranspiration (kg m-2 month-1; Penman-Monteith).
- evapotranspiration_range: Annual potential evapotranspiration range (kg m-2 month-1; Penman-Monteith).
- cloud_cover_mean: Mean annual total cloud cover (%).
- cloud_cover_max: Maximum monthly total cloud cover (%).
- cloud_cover_min: Minimum monthly total cloud cover (%).
- cloud_cover_range: Annual total cloud cover range (%).
- aridity_index: Mean aridity index (unitless ratio; higher values indicate wetter conditions).
- humidity_mean: Mean annual near-surface relative humidity (%).
- humidity_max: Maximum monthly near-surface relative humidity (%).
- humidity_min: Minimum monthly near-surface relative humidity (%).
- humidity_range: Annual near-surface relative humidity range (%).
- biogeo_ecoregion: Ecoregion name.
- biogeo_biome: Biome name.
- biogeo_realm: Ecological realm name.
- country_name: Country name.
- continent: Continent name.
- region: UN region name.
- subregion: UN sub-region name.
Geometry:
- geometry: Point geometry (WGS84, EPSG:4326).
Source
Response variables (NDVI):
Copernicus Land Monitoring Service. (2019). Normalised Difference Vegetation Index Statistics (Long Term 1999-2019), raster 1 km, global, version 3. European Commission, Joint Research Centre. doi:10.2909/290e81fb-4c84-42ad-ae12-f663312b0eda
Climate classification:
Beck, H.E., et al. (2018). Present and future Koppen-Geiger climate classification maps at 1-km resolution. Scientific Data, 5, 180214. doi:10.1038/sdata.2018.214
Soil water index:
Copernicus Land Monitoring Service: Soil Water Index. doi:10.2909/290e81fb-4c84-42ad-ae12-f663312b0eda
Climate predictors (temperature, rainfall, solar radiation, growing season, evapotranspiration, cloud cover, humidity):
Brun, P., Zimmermann, N.E., Hari, C., Pellissier, L., & Karger, D.N. (2022). CHELSA-BIOCLIM+ A novel set of global climate-related predictors at kilometre-resolution. EnviDat. doi:10.16904/envidat.332
Soil type and properties:
Hengl, T., et al. (2017). SoilGrids250m: Global gridded soil information based on machine learning. PLOS ONE, 12(2), e0169748. doi:10.1371/journal.pone.0169748
Soil temperature:
Wan, Z., Hook, S., & Hulley, G. (2015). MOD11A2 MODIS/Terra Land Surface Temperature/Emissivity 8-Day L3 Global 1km SIN Grid V006. NASA EOSDIS LP DAAC. doi:10.5067/MODIS/MOD11A2.006
Ecoregions and biogeography:
Dinerstein, E., et al. (2017). An Ecoregion-Based Approach to Protecting Half the Terrestrial Realm. BioScience, 67(6), 534-545. doi:10.1093/biosci/bix014
Elevation and topography:
Jarvis, A., Guevara, E., Reuter, H. I., & Nelson, A. D. (2008). Hole-filled SRTM for the globe: version 4, data grid. Web publication/site, CGIAR Consortium for Spatial Information. https://srtm.csi.cgiar.org
Aridity index:
Zomer, R.J., Xu, J., & Trabucco, A. (2022). Version 3 of the Global Aridity Index and Potential Evapotranspiration Database. Scientific Data, 9, 409. doi:10.1038/s41597-022-01493-1
Country, continent, region, and subregion:
Natural Earth. Free vector and raster map data. https://www.naturalearthdata.com/
See Also
Other vi:
vi_extra(),
vi_predictors,
vi_responses,
vi_smol
Examples
data(vi)
colnames(vi)
nrow(vi)
ncol(vi)
Download extended vi dataset
Description
Downloads and reads the extended version of the vi dataset (30,000 rows) from the spatialDataExtra repository. Writes the file vi.gpkg to the working directory and returns it as an sf data frame. See vi for details on the response variables, predictors, and data sources.
Usage
vi_extra()
Value
sf data frame with 30,000 rows and 64 columns (POINT geometry, WGS84).
See Also
Other vi:
vi,
vi_predictors,
vi_responses,
vi_smol
Predictor variable names for the dataset vi
Description
Character vector of 58 predictor variable names from vi.
Usage
data(vi_predictors)
Format
A character vector of length 58.
See Also
Other vi:
vi,
vi_extra(),
vi_responses,
vi_smol
Examples
data(vi_predictors)
vi_predictors
Response variable names for the dataset vi
Description
Character vector containing the names of the 5 response variables in vi.
Usage
data(vi_responses)
Format
A character vector of length 5.
See Also
Other vi:
vi,
vi_extra(),
vi_predictors,
vi_smol
Examples
data(vi_responses)
vi_responses
Small version of vi
Description
Same as dataset vi, but with only 580 rows.
Usage
data(vi_smol)
Format
A data frame with 580 rows and 65 columns.
See Also
Other vi:
vi,
vi_extra(),
vi_predictors,
vi_responses
Examples
data(vi_smol)
colnames(vi_smol)
nrow(vi_smol)
ncol(vi_smol)