This function uses the generalized Levenshtein (edit) distance to identify possible issues with taxonomic names.
refdb_check_tax_typo(x, tol = 1)
A list of two-column tibbles reporting, for each taxonomic level,
the pairs of taxonomic names that share the same upstream taxonomy and
whose generalized Levenshtein (edit) distance does not exceed
the tol value.
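The selection criterion can be illustrated with base R alone. The following is only a minimal sketch, not the package's implementation: find_close_names() is a hypothetical helper, the species names are invented toy data, and it assumes a pair is flagged when its distance is at most tol. It relies on utils::adist(), which computes the generalized Levenshtein distance.

```r
# Hypothetical helper (not part of refdb): within one group of names that
# share the same upstream taxonomy, return the pairs of distinct names
# whose edit distance is at most `tol`.
find_close_names <- function(names, tol = 1) {
  names <- unique(names)
  d <- utils::adist(names)  # pairwise generalized Levenshtein distances
  # Keep each unordered pair once (upper triangle excludes the diagonal).
  idx <- which(d <= tol & upper.tri(d), arr.ind = TRUE)
  data.frame(taxon_1 = names[idx[, "row"]],
             taxon_2 = names[idx[, "col"]])
}

# Toy data invented for illustration: species names within one genus,
# where the second name is a misspelling of the first.
species <- c("Baetis rhodani", "Baetis rodani", "Baetis alpinus")
close_pairs <- find_close_names(species)
close_pairs
```

Raising tol in this sketch flags more distant pairs, at the cost of more false positives among names that are legitimately similar.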
lib <- read.csv(system.file("extdata", "ephem.csv", package = "refdb"))
lib <- refdb_set_fields(lib,
                        taxonomy = c(family = "family_name",
                                     genus = "genus_name",
                                     species = "species_name"),
                        sequence = "DNA_seq",
                        marker = "marker")
refdb_check_tax_typo(lib)
#> $family_name
#> NULL
#>
#> $genus_name
#> NULL
#>
#> $species_name
#> # A tibble: 2 × 2
#>   `Taxon 1`               `Taxon 2`
#>   <chr>                   <chr>
#> 1 Baetis tricaudatus B BG Baetis tricaudatus D BG
#> 2 Cinygmula sp. B BG      Cinygmula sp. C BG
#>
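To see why these two species-level pairs are reported, their edit distances can be checked directly with utils::adist() from base R. This is a small sketch for illustration only: the two pairs are copied from the output above, and the variable names are arbitrary.

```r
# The two pairs reported at the species level in the output above.
taxon_1 <- c("Baetis tricaudatus B BG", "Cinygmula sp. B BG")
taxon_2 <- c("Baetis tricaudatus D BG", "Cinygmula sp. C BG")

# adist() returns a 1 x 1 matrix for a single pair; drop() reduces it
# to a plain number.
pair_dist <- mapply(function(a, b) drop(utils::adist(a, b)),
                    taxon_1, taxon_2)
unname(pair_dist)
```

In each pair the names differ by a single substituted character, so both distances equal the default tol = 1. Pairs like these may be distinct provisional taxa rather than typos, so the reported names should be reviewed rather than corrected automatically.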