This function uses the generalized Levenshtein (edit) distance to identify possible issues, such as typos, in taxonomic names.

refdb_check_tax_typo(x, tol = 1)

Arguments

x

a reference database.

tol

the maximum edit distance at which two taxonomic names are reported (with the default of 1, names differing by a single edit are reported).
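
To get a feel for the scale of tol, the edit distance between two names can be computed directly with base R's utils::adist(), which also returns a generalized Levenshtein distance. This is only an illustration: whether refdb_check_tax_typo() calls adist() internally is an assumption, not something stated on this page. The first pair of names is taken from the example output further down; the misspelled names in the other two pairs are made up.

```r
# Two names from the example output: a single substitution (B -> D)
d1 <- drop(utils::adist("Baetis tricaudatus B BG", "Baetis tricaudatus D BG"))

# Hypothetical typo: one missing letter
d2 <- drop(utils::adist("Baetis", "Baets"))

# Hypothetical typo: one substitution (y -> i) plus one extra letter
d3 <- drop(utils::adist("Cinygmula", "Cinigmulla"))

c(d1, d2, d3)
```

The first two pairs are at distance 1 and the third at distance 2, so only the first two would fall within the default tolerance.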

Value

A list of two-column tibbles, one element per taxonomic level, reporting the pairs of taxonomic names that share the same upstream taxonomy and whose generalized Levenshtein (edit) distance does not exceed tol. Levels for which no such pair is found are returned as NULL.
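
The idea behind the returned pairs can be sketched in base R for a single taxonomic level. This is a simplified illustration under stated assumptions, not the package's implementation: find_close_pairs() is a hypothetical helper, the toy table and its names are invented, and the threshold is treated as inclusive because the example output reports pairs at distance 1 with tol = 1.

```r
# Hypothetical toy table; the names are invented for illustration
tax <- data.frame(
  genus   = c("Baetis", "Baetis", "Baetis", "Cinygmula"),
  species = c("Baetis rhodani", "Baetis rhodanii", "Baetis alpinus",
              "Cinygmula mimus"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: within each `parent` group, report the pairs of
# distinct `child` names whose edit distance is at most `tol`
find_close_pairs <- function(parent, child, tol = 1) {
  pairs <- lapply(split(child, parent), function(nm) {
    nm <- unique(nm)
    if (length(nm) < 2) return(NULL)
    d <- utils::adist(nm)
    # keep each unordered pair once by looking at the upper triangle only
    idx <- which(d <= tol & upper.tri(d), arr.ind = TRUE)
    if (nrow(idx) == 0) return(NULL)
    data.frame(taxon_1 = nm[idx[, "row"]],
               taxon_2 = nm[idx[, "col"]],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, pairs)
}

res <- find_close_pairs(tax$genus, tax$species, tol = 1)
res
```

Here only "Baetis rhodani" and "Baetis rhodanii" are paired: they belong to the same genus and differ by one inserted letter. Restricting the comparison to names with the same parent keeps similar names from unrelated groups out of the report.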

Examples

lib <- read.csv(system.file("extdata", "ephem.csv", package = "refdb"))
lib <- refdb_set_fields(lib,
                        taxonomy = c(family = "family_name",
                                     genus = "genus_name",
                                     species = "species_name"),
                        sequence = "DNA_seq",
                        marker = "marker")
refdb_check_tax_typo(lib)
#> $family_name
#> NULL
#> 
#> $genus_name
#> NULL
#> 
#> $species_name
#> # A tibble: 2 × 2
#>   `Taxon 1`               `Taxon 2`              
#>   <chr>                   <chr>                  
#> 1 Baetis tricaudatus B BG Baetis tricaudatus D BG
#> 2 Cinygmula sp. B BG      Cinygmula sp. C BG     
#>