IDTAXA Classify Functions - Code example:
Below is an example of using DECIPHER's IdTaxa function to classify sequences.
Instructions
First, install DECIPHER and load the library in R. Next, set the "fas" variable to the path of the FASTA file of sequences (e.g., "~/mySeqs.fas").
> # load the DECIPHER library in R
> library(DECIPHER)
>
> # specify the path to the FASTA file (in quotes)
> fas <- "<<REPLACE WITH PATH TO FASTA FILE>>"
>
> # load the sequences from the file
> seqs <- readAAStringSet(fas)
>
> # remove any gaps (if needed)
> seqs <- RemoveGaps(seqs)
>
> # for help, see the IdTaxa help page (optional)
> ?IdTaxa
>
> # load a training set object (trainingSet)
> # see http://DECIPHER.codes/Downloads.html
> load("<<REPLACE WITH PATH TO RData file>>")
>
> # classify the sequences
> ids <- IdTaxa(seqs,
+    trainingSet,
+    threshold=50, # 60 (cautious) or 50 (sensible)
+    processors=NULL) # use all available processors
|============================================| 100%
Time difference of 135.83 secs
>
> # look at the results
> print(ids)
  A test set of class 'Taxa' with length 1000
       confidence name                 taxon
   [1]      78.0% ENA|OBRS01158965|... Root; Bacter...
   [2]      44.7% ENA|OBRS01551965|... Root; unclas...
   [3]      74.8% ENA|OBRS01920881|... Root; Bacter...
   [4]      15.9% ENA|OBRS01851995|... Root; unclas...
   [5]      19.7% ENA|OBRS01760119|... Root; unclas...
   ...        ... ...                  ...
 [996]      54.0% ENA|OBRS01119407|... Root; unclas...
 [997]      56.0% ENA|OBRS01447422|... Root; unclas...
 [998]      51.5% ENA|OBRS01883532|... Root; unclas...
 [999]      64.7% ENA|OBRS01350537|... Root; Bacter...
[1000]      47.5% ENA|OBRS01488581|... Root; unclas...
> plot(ids)
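
Beyond print(ids) and plot(ids), it is often useful to flatten the classifications into a table for export or filtering. The following is a minimal sketch, not part of DECIPHER itself. It assumes that each element of a 'Taxa' object behaves like a list holding a character vector "taxon" and a numeric vector "confidence" with one entry per rank, and that names(ids) returns the sequence names. The names taxa_to_table and mock_ids are hypothetical and used only for illustration; the mock input stands in for real IdTaxa output so that the sketch runs without a training set.

```r
# Hypothetical helper (not a DECIPHER function): flatten classifications
# into a data frame with one row per sequence.
# Assumption: each element of `ids` is a list with a character vector
# `taxon` and a numeric vector `confidence`, one entry per rank.
taxa_to_table <- function(ids) {
  data.frame(
    # sequence names as stored on the object
    name = names(ids),
    # full lineage, joined from the root to the deepest reported rank
    taxon = vapply(ids,
                   function(x) paste(x$taxon, collapse = "; "),
                   character(1), USE.NAMES = FALSE),
    # confidence at the deepest reported rank
    confidence = vapply(ids,
                        function(x) x$confidence[length(x$confidence)],
                        numeric(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Mock input mimicking the assumed structure (illustrative values only)
mock_ids <- list(
  seqA = list(taxon = c("Root", "Bacteria"),
              confidence = c(78.0, 78.0)),
  seqB = list(taxon = c("Root", "unclassified_Root"),
              confidence = c(44.7, 44.7))
)

tab <- taxa_to_table(mock_ids)
print(tab)
```

With real results, the same helper would be called as taxa_to_table(ids), and the resulting data frame could be saved with write.csv(). Checking one element first, for example with str(ids[[1]]), is a reasonable way to confirm that the assumed structure matches the installed DECIPHER version.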