Taxon name resolution service (tnrs) applied to a vector of names by batches
Source: R/opentree_taxonomy_general.R
tnrs_match.Rd
Usage
tnrs_match(input, reference_taxonomy, tip, ...)
# S3 method for default
tnrs_match(input, reference_taxonomy = "ott", ...)
# S3 method for phylo
tnrs_match(input, reference_taxonomy = "ott", tip = NULL, ...)
Arguments
- input
A character vector of taxon names, or a phylo object with tip names, to be matched to taxonomy.
- reference_taxonomy
A character vector specifying the reference taxonomy to use for TNRS. Options are "ott", "ncbi", "gbif" or "irmng". The function defaults to "ott".
- tip
A vector of mode numeric or character specifying the tips to match. If NULL (the default), all tips are matched.
- ...
Arguments passed on to rotl::tnrs_match_names:
  - context_name
  Name of the taxonomic context to be searched (length-one character vector or NULL). Must match (case sensitive) one of the values returned by tnrs_contexts. Defaults to "All life".
  - do_approximate_matching
  A logical indicating whether or not to perform approximate string (a.k.a. “fuzzy”) matching. Using FALSE will greatly improve speed. The default, however, is TRUE.
  - ids
  A vector of ids to use for identifying names. These will be assigned to each name in the names array. If ids is provided, then ids and names must be identical in length.
  - include_suppressed
  Ordinarily, some quasi-taxa, such as incertae sedis buckets and other non-OTUs, are suppressed from TNRS results. If this parameter is true, these quasi-taxa are allowed as possible TNRS results.
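The tip argument accepts either numeric positions or character labels. As a rough illustration of those two selection modes and of the NULL default, here is a minimal sketch in base R. The helper select_tips is hypothetical and is not part of the package; how tnrs_match handles tip internally is not shown in this page.

```r
# Hypothetical helper (not part of the package): turn a `tip` selector
# into the tip labels that would be sent for matching.
select_tips <- function(tip_labels, tip = NULL) {
  if (is.null(tip)) {
    return(tip_labels)                    # nothing specified: use every tip
  }
  if (is.numeric(tip)) {
    # Numeric selectors are treated as positions in the label vector.
    stopifnot(all(tip >= 1), all(tip <= length(tip_labels)))
    return(tip_labels[tip])
  }
  if (is.character(tip)) {
    # Character selectors must be existing labels.
    unknown <- setdiff(tip, tip_labels)
    if (length(unknown) > 0) {
      stop("Unknown tip labels: ", paste(unknown, collapse = ", "))
    }
    return(tip)
  }
  stop("`tip` must be NULL, numeric, or character")
}

labels <- c("Mus", "Echinus", "Hommo")
all_tips  <- select_tips(labels)                   # all three labels
by_index  <- select_tips(labels, tip = c(1, 3))    # "Mus" "Hommo"
by_label  <- select_tips(labels, tip = "Echinus")  # "Echinus"
```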
Details
There is no limit to the number of names that can be queried and matched.
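One way a batched approach can handle an arbitrarily long vector is to split the names into fixed-size groups, resolve each group, and bind the results. The sketch below assumes a hypothetical batch_size argument and a mock_resolver stand-in for the remote service; neither is taken from the package, and the actual batch size used by tnrs_match is not stated here.

```r
# Hypothetical stand-in for a remote TNRS call; it only lowercases the names.
mock_resolver <- function(names) {
  data.frame(search_string = tolower(names), stringsAsFactors = FALSE)
}

# Sketch: resolve `names` in batches of `batch_size` and combine the results.
match_in_batches <- function(names, batch_size = 2, resolver = mock_resolver) {
  if (length(names) == 0) {
    return(resolver(character(0)))
  }
  # Assign each name to a batch: 1, 1, 2, 2, 3, ...
  batch_id <- ceiling(seq_along(names) / batch_size)
  batches <- split(names, batch_id)
  # One resolver call per batch, then stack the per-batch data frames.
  results <- lapply(batches, resolver)
  out <- do.call(rbind, unname(results))
  rownames(out) <- NULL
  out
}

batched <- match_in_batches(c("Mus", "Echinus", "Hommo", "Homo", "Felis"))
```

With five names and a batch size of two, the mock resolver is called three times and the combined result keeps the input order.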
The output preserves all elements of the original input phylo object and adds:
- phy$mapped
A character vector indicating the state of mapping of phy$tip.labels:
- phy$original.tip.label
A character vector preserving all original labels.
- phy$ott_ids
A numeric vector with ott id numbers of matched tips. Unmatched and original tips will be NaN.
If tips are duplicated, TNRS is run only once per distinct label (avoiding increases in running time), and the result is applied to all duplicated tip labels.
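The run-once-and-reuse behaviour for duplicated labels can be sketched in base R with unique() and match(). This is an illustration under assumptions, not the package's implementation: mock_resolver and match_unique_once are hypothetical names, and the resolver does no real matching.

```r
# Hypothetical stand-in for a single-name TNRS query.
mock_resolver <- function(name) {
  data.frame(search_string = tolower(name), stringsAsFactors = FALSE)
}

# Sketch: query each distinct name once, then expand back to the input order.
match_unique_once <- function(names, resolver = mock_resolver) {
  unique_names <- unique(names)
  # One resolver call per distinct name.
  resolved <- do.call(rbind, lapply(unique_names, resolver))
  # Map every input position to its row in the de-duplicated results.
  resolved[match(names, unique_names), , drop = FALSE]
}

res <- match_unique_once(c("Mus", "Echinus", "Hommo", "Mus"))
rownames(res)
# "1" "2" "3" "1.1"
```

Indexing a data frame with a repeated row makes R generate a suffixed row name such as 1.1, which is consistent with the 1.1 row label shown for the repeated "Mus" in the last example below.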
Examples
tnrs_match(input = c("Mus"))
#> ---> Runnning TNRS to match input names to reference taxonomy (OTT).
#>
#>   search_string                  unique_name approximate_match score  ott_id
#> 1           mus Mus (genus in Deuterostomia)             FALSE     1 1068778
#>   is_synonym          flags number_matches
#> 1      FALSE SIBLING_HIGHER              3
tnrs_match(input = c("Mus", "Mus musculus"))
#> ---> Runnning TNRS to match input names to reference taxonomy (OTT).
#>
#>   search_string                  unique_name approximate_match score  ott_id
#> 1           mus Mus (genus in Deuterostomia)             FALSE     1 1068778
#> 2  mus musculus                 Mus musculus             FALSE     1  542509
#>   is_synonym          flags number_matches
#> 1      FALSE SIBLING_HIGHER              3
#> 2      FALSE                             1
tnrs_match(input = c("Mus", "Echinus", "Hommo", "Mus"))
#> ---> Runnning TNRS to match input names to reference taxonomy (OTT).
#>
#>     search_string                     unique_name approximate_match score
#> 1             mus    Mus (genus in Deuterostomia)             FALSE  1.00
#> 2         echinus Echinus (genus in Opisthokonta)             FALSE  1.00
#> 3           hommo                            Homo              TRUE  0.75
#> 1.1           mus    Mus (genus in Deuterostomia)             FALSE  1.00
#>      ott_id is_synonym          flags number_matches
#> 1   1068778      FALSE SIBLING_HIGHER              3
#> 2    616574      FALSE sibling_higher              4
#> 3    770309      FALSE sibling_higher              1
#> 1.1 1068778      FALSE SIBLING_HIGHER              3