This function identifies and removes invalid geographic coordinates, including non-numeric values, NA or empty values, and coordinates outside the valid range for Earth (latitude > 90 or < -90, and longitude > 180 or < -180).
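The validity rules described above can be sketched in base R. This is only an illustration of the stated criteria, not the package's actual implementation; the helper name `is_valid_coordinate` is hypothetical, and the sketch assumes `occ` is a plain data.frame.

```r
# Hypothetical helper, not part of the package: flags rows whose coordinates
# are numeric, non-missing, and within the valid range for Earth.
is_valid_coordinate <- function(occ,
                                long = "decimalLongitude",
                                lat = "decimalLatitude") {
  # Coerce through character so factors and strings are handled uniformly;
  # non-numeric or empty values become NA.
  x <- suppressWarnings(as.numeric(as.character(occ[[long]])))
  y <- suppressWarnings(as.numeric(as.character(occ[[lat]])))
  # Valid only if both values are present and inside [-180, 180] / [-90, 90].
  !is.na(x) & !is.na(y) & abs(x) <= 180 & abs(y) <= 90
}

# Made-up records covering each kind of invalid value
occ <- data.frame(
  species = "spp",
  decimalLongitude = c("10", "-190", "20", "east", ""),
  decimalLatitude = c(20, 20, 240, 50, NA)
)

ok <- is_valid_coordinate(occ)
valid <- occ[ok, , drop = FALSE]     # rows that pass every check
invalid <- occ[!ok, , drop = FALSE]  # rows that fail at least one check
```

Only the first record passes here: the others fail because of an out-of-range longitude, an out-of-range latitude, a non-numeric value, and an empty or NA value, respectively.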

Usage

remove_invalid_coordinates(
  occ,
  long = "decimalLongitude",
  lat = "decimalLatitude",
  return_invalid = TRUE,
  save_invalid = FALSE,
  output_dir = NULL,
  overwrite = FALSE,
  output_format = ".gz",
  verbose = FALSE
)

Arguments

occ

(data.frame or data.table) a dataset with occurrence records.

long

(character) column name in occ with the longitude.

lat

(character) column name in occ with the latitude.

return_invalid

(logical) whether to return a list containing the valid and invalid coordinates. Default is TRUE.

save_invalid

(logical) whether to save the invalid (removed) records. If TRUE, an output_dir must be provided. Default is FALSE.

output_dir

(character) path to an existing directory where records with invalid coordinates will be saved. Only used when save_invalid = TRUE.

overwrite

(logical) whether to overwrite existing files in output_dir. Only used when save_invalid = TRUE. Default is FALSE.

output_format

(character) output format for saving removed records. Options are ".csv" or ".gz". Only used when save_invalid = TRUE. Default is ".gz".

verbose

(logical) whether to print messages about function progress. Default is FALSE.

Value

If return_invalid = FALSE, returns the occurrence dataset containing only valid coordinates. If return_invalid = TRUE (default), returns a list with two elements:

  • valid – the dataset with valid coordinates.

  • invalid – the records with invalid coordinates that were removed.

Examples

# Create fake data example
occ <- data.frame("species" = "spp",
                  "decimalLongitude" = c(10, -190, 20, 50, NA),
                  "decimalLatitude" = c(20, 20, 240, 50, NA))
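# Split valid and invalid coordinates (returns a list by default)
occ_valid <- remove_invalid_coordinates(occ)
# Records with valid coordinates
occ_valid$valid
# Records with invalid coordinates (removed)
occ_valid$invalid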