2  Get data from GBIF

To get biodiversity data from the Global Biodiversity Information Facility (GBIF), you can use the rgbif package. This package allows you to search for species occurrence data, get species information, and download data from GBIF.

But before diving into the rgbif package, you’ll first need to create an account on the GBIF website to authorize your queries and downloads. This will also allow you to store a query history and manage your data downloads, including citations.

All data retrieved from GBIF need to be cited properly. You can find the citation information on your downloads page on the GBIF website.

Usually, downloads that are not cited in a publication within 6 months of their creation are eligible for deletion. Read more about GBIF’s deletion policy here.

2.0.1 Install the rgbif package and configure credentials

Now that you have a GBIF account, you can install and load the rgbif package:

install.packages("rgbif")
library(rgbif)

With the package installed, you can configure your R session to use your GBIF account credentials. This will allow you to access your GBIF account and download data directly from R. You will need to supply your GBIF username, password and email. To do that, store your credentials in your .Renviron file. This file is located in your home directory and holds environment variables that R reads at startup.

Warning

NEVER store your credentials in shared files or scripts. Always use environment variables to store sensitive information.

To add your GBIF credentials to your .Renviron file, you can use the following code:

install.packages("usethis")
usethis::edit_r_environ()

Then, edit your .Renviron file to include your GBIF credentials:

GBIF_USER="Your GBIF username"
GBIF_PWD="Your GBIF password"
GBIF_EMAIL="Your GBIF email"
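
After saving the file, restart R so that the new variables are picked up. As a quick sanity check, the minimal sketch below reports whether each variable is set without printing its value. It assumes the variable names rgbif looks up by default (GBIF_USER, GBIF_PWD and GBIF_EMAIL); the object names are only illustrative.

```r
# Names of the environment variables rgbif reads by default
gbif_vars <- c("GBIF_USER", "GBIF_PWD", "GBIF_EMAIL")

# TRUE if the variable is set to a non-empty value, FALSE otherwise.
# Only the status is shown, never the credential itself.
is_set <- nzchar(Sys.getenv(gbif_vars))
names(is_set) <- gbif_vars
is_set
```

If any entry is FALSE, check the spelling in .Renviron and make sure you restarted your R session.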

2.1 Starting a query

To start a query in R, you can use the function occ_search. This function allows you to search for species occurrence data in GBIF. You can specify the species name, country, and other parameters to filter the data you want to retrieve.

For example, to search for occurrences of the species Ursus arctos (Brown Bear) in Canada, you can use the following code:

# Search for occurrences of Ursus arctos in Canada
occ_search(scientificName = "Ursus arctos", country = "CA")
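
To explore what a search returns, it can help to store the result in an object. The sketch below assumes an internet connection and uses the limit argument to keep the request small; the object name res is only illustrative.

```r
library(rgbif)

# Ask for the first 20 records only, to keep the request light
res <- occ_search(scientificName = "Ursus arctos", country = "CA", limit = 20)

# Total number of records matching the query on GBIF
res$meta$count

# The occurrence records themselves, returned as a data frame
head(res$data)
```

A search like this is handy for previewing a query; for an analysis you intend to publish, a citable download request is the better route.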

To limit the geographic extent of the search, you can use the geometry parameter to specify a bounding box or a polygon in well-known text (WKT) format. For example, to search for occurrences of Ursus arctos in a bounding box around Quebec, you can use the following code:

# Define the bounding box points for Quebec
# The order of the points is important: GBIF expects them counter-clockwise,
# i.e. lower left, lower right, upper right, upper left, and back to lower left
# POLYGON((A_longitude A_latitude, B_longitude B_latitude, C_longitude C_latitude, D_longitude D_latitude, A_longitude A_latitude))

quebec_bbox <- "POLYGON((-79.767 44.999, -57.071 44.999, -57.071 62.591, -79.767 62.591, -79.767 44.999))"


# Search for occurrences of Ursus arctos in the bounding box around Quebec

occ_search(scientificName = "Ursus arctos", geometry = quebec_bbox)
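
Writing the WKT string by hand is error-prone, so one option is to build it from the four bounding coordinates. The helper below is a hypothetical convenience function (bbox_to_wkt is not part of rgbif) that writes the corners in counter-clockwise order and closes the ring on the starting point.

```r
# Hypothetical helper, not part of rgbif: build a WKT bounding box
# from the minimum and maximum longitude (x) and latitude (y)
bbox_to_wkt <- function(xmin, ymin, xmax, ymax) {
  stopifnot(xmin < xmax, ymin < ymax)
  # Corners in counter-clockwise order:
  # lower left, lower right, upper right, upper left, lower left
  sprintf(
    "POLYGON((%s %s, %s %s, %s %s, %s %s, %s %s))",
    xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax, xmin, ymin
  )
}

# Same bounding box as above, built from its four limits
quebec_bbox <- bbox_to_wkt(-79.767, 44.999, -57.071, 62.591)
quebec_bbox
```

The resulting string can be passed to the geometry parameter exactly like the hand-written one.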

Once you have designed your query, you can then use the rgbif function occ_download to start a download request:

data_download <- occ_download(
  type = "and",
  pred_in("scientificName", "Ursus arctos"),
  pred_in("country", "CA")
)

This will create a download request for your GBIF account.
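
GBIF prepares downloads asynchronously, so the file is not available immediately. As a sketch, and assuming data_download is the object returned by occ_download above and that your credentials are configured, you can follow the request from R:

```r
library(rgbif)

# data_download is the object returned by occ_download() above

# Block the session and poll GBIF until the request succeeds or fails
occ_download_wait(data_download)

# Print the request metadata, including the download key and status
occ_download_meta(data_download)
```

Large requests can take a while to prepare, so it may be more convenient to note the download key and come back later.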

Creating your query on GBIF

Alternatively, you can log in to your GBIF account and create your query on the GBIF website. There, you can specify the parameters of your query and either download the data directly from the website or copy the download key and fetch the data with the rgbif function occ_download_get (recommended), as shown below.

2.2 Downloading the data

Once your download request is ready, you can download the dataset using the occ_download_get function:

occ_dataset <-
  occ_download_get("YOUR_KEY_HERE") |>
  occ_download_import()
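
Since every GBIF download needs to be cited, it can be convenient to retrieve the citation from R as well. The sketch below assumes that gbif_citation accepts the output of occ_download_meta and returns the download citation in its download element; check the documentation of your installed rgbif version, and treat the citation shown on your GBIF downloads page as the reference.

```r
library(rgbif)

# Replace the placeholder with your own download key
download_meta <- occ_download_meta("YOUR_KEY_HERE")

# Assumption: the citation text for the download is stored in $download
gbif_citation(download_meta)$download
```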

Now you’re ready to clean your dataset and start your analysis!
