IDTAXA Classifier - Frequently Asked Questions:
A Murali et al. (2018) "IDTAXA: a novel approach for accurate taxonomic classification of microbiome sequences." Microbiome, doi:10.1186/s40168-018-0521-5.
Yes. Install the DECIPHER package for R, then follow the example on the code page.
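As a rough illustration of what a local run can look like, the R sketch below installs DECIPHER from Bioconductor and classifies the sequences in a FASTA file with IdTaxa. The file names my_sequences.fasta and training_set.RData are placeholders rather than files from this page, and the sketch assumes the downloaded training set file loads an object named trainingSet. The code page remains the authoritative example.

```r
# Install DECIPHER from Bioconductor (only needed once).
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("DECIPHER")

library(DECIPHER)

# Read the query sequences; "my_sequences.fasta" is a placeholder path.
seqs <- readDNAStringSet("my_sequences.fasta")
seqs <- RemoveGaps(seqs)  # drop any alignment gap characters

# Load a downloaded training set; "training_set.RData" is a placeholder,
# and the file is assumed to define an object called trainingSet.
load("training_set.RData")

# Classify the sequences. strand = "both" also tests the reverse complement;
# threshold is the minimum confidence (in percent) for reporting a rank.
ids <- IdTaxa(seqs,
              trainingSet,
              strand = "both",
              threshold = 60,
              processors = 1)

print(ids)
```

Lowering threshold reports deeper ranks at lower confidence, so the value here is a starting point to tune for your data.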
It depends on the training set. For 16S training sets, classification goes down to the genus level because the training sequences have no species labels. The 16S gene is too conserved to support species-level identification, even when the full-length sequence is available. A number of published studies have shown this, and we have confirmed it in our own work (see here). For example, strains with identical 16S sequences can have as little as 40% gene content similarity, so nothing resembling a species-level classification can be established even from full-length, error-free sequences.
For 16S, we recommend the GTDB training set because it is the most recently published reference taxonomy. The SILVA training set has the most breadth, so it is likely to yield taxonomic names for the most sequences, although some names might be esoteric. Please be aware of the licensing terms that apply to use of the SILVA dataset. The Contax training set is based on agreement among multiple reference taxonomies, so it probably contains the least labeling error (training sequences assigned to the wrong taxon) but also has the least breadth.