This function searches and downloads data from the NCBI Nucleotide database. It also uses the NCBI Taxonomy database to retrieve the taxonomic classification of each sequence.
refdb_import_NCBI(
query,
full = FALSE,
max_seq_length = 10000,
seq_bin = 200,
verbose = TRUE,
start = 0L
)
query: a character string with the query.
full: a logical. If FALSE (the default), only a subset of the most important fields is included in the result.
max_seq_length: a numeric giving the maximum length of sequences to retrieve. Useful to exclude complete genomes.
seq_bin: number of sequences to download at once.
verbose: print information in the console.
start: an integer giving the index at which to start downloading. Mainly for debugging purposes.
A tibble.
This function uses several functions of the rentrez package to interface with NCBI's EUtils API.
Error in curl::curl_fetch_memory(url, handle = handle) :
transfer closed with outstanding read data remaining
This error seems to occur with long sequences.
You can try decreasing max_seq_length to exclude them.
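One way to apply this advice is to retry the import with progressively smaller values of max_seq_length. The sketch below is only an illustration, not part of the package: import_with_fallback, its lengths argument, and its import_fun argument are hypothetical names introduced here, and the candidate lengths are arbitrary.

```r
# Hypothetical helper (not part of refdb): retry an import with
# progressively smaller max_seq_length values until one call succeeds.
import_with_fallback <- function(query,
                                 lengths = c(10000, 5000, 2000),
                                 import_fun = refdb::refdb_import_NCBI) {
  for (len in lengths) {
    attempt <- tryCatch(
      # Wrap the result so that any returned value counts as success.
      list(ok = TRUE, value = import_fun(query, max_seq_length = len)),
      error = function(e) {
        message("Import failed with max_seq_length = ", len, ": ",
                conditionMessage(e))
        list(ok = FALSE)
      }
    )
    if (attempt$ok) {
      return(attempt$value)
    }
  }
  stop("All attempts failed; consider a narrower query.")
}
```

With refdb installed, this could be called as import_with_fallback("Silo COI"). Note that a lower max_seq_length excludes longer sequences from the result, so the returned tibble may contain fewer records than a successful run with the default.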
if (FALSE) { # \dontrun{
silo_ncbi <- refdb_import_NCBI("Silo COI")
} # }