Detect Repeats
The DetectRepeats function can find tandem or interspersed repeats in gene, protein, or genome sequences.

How do I detect tandem repeats?
First, install DECIPHER and load the library in R. Next, load the sequences from a FASTA file and run DetectRepeats to identify any tandem repeats.
> options(width=50)
>
> # load the DECIPHER library in R
> library(DECIPHER)
>
> # specify the path to the FASTA file (in quotes)
> fas <- "<<REPLACE WITH PATH TO FASTA FILE>>"
>
> # load the sequences from the file
> # change "DNA" to "RNA" or "AA" as needed
> seqs <- readDNAStringSet(fas)
>
> # look at some of the sequences (optional)
> seqs
DNAStringSet object of length 11:
width seq names
[1] 107976 ATGACAA...CATTTAA titin isoform IC
[2] 26778 ATGGATC...GCGCTGA obscurin isoform c
[3] 16908 ATGATTT...ATACTAA hemicentin-1 prec...
[4] 5847 ATGGCGC...AACCTAA receptor-type tyr...
[5] 5745 ATGGGGG...AGAGTGA myosin light chai...
... ... ...
[7] 3825 ATGCCTG...TCAGTGA myosin-binding pr...
[8] 1497 ATGTTTA...ACTTTAA myotilin isoform a
[9] 1263 ATGCTCA...CCTCTGA mannose-1-phospha...
[10] 1101 ATGGTGC...TATCTAA inhibin alpha cha...
[11] 717 ATGTCTA...GGCCTGA SPEG neighbor pro...
>
> # detect tandem repeats
> TRs <- DetectRepeats(seqs)
|========================================| 100%
Time difference of 68.85 secs
>
> # view the first tandem repeats
> head(TRs)
Index Begin End Left Right
1 1 844 972 844, 865.... 864, 897....
2 1 1287 2114 1287, 14.... 1436, 15....
3 1 4232 4332 4232, 42.... 4246, 42....
4 1 8390 9719 8390, 86.... 8650, 89....
5 1 16194 17591 16194, 1.... 16472, 1....
6 1 24926 28615 24926, 2.... 25213, 2....
Score
1 21.46971
2 81.77987
3 39.66848
4 33.95958
5 25.50887
6 37.54636
>
> # view the last tandem repeats
> tail(TRs)
Index Begin End Left Right
19 2 24506 24589 24506, 24548 24547, 24589
20 3 5500 12591 5500, 57.... 5781, 60....
21 3 13581 14606 13581, 1.... 13751, 1....
22 3 15270 15644 15270, 1.... 15389, 1....
23 5 2677 2844 2677, 2761 2760, 2844
24 5 3037 3234 3037, 30.... 3072, 31....
Score
19 10.95497
20 250.98565
21 82.65828
22 16.39137
23 15.51914
24 45.80374
>
> # see the help page for more examples
> ?DetectRepeats
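
The Left and Right columns in the output above appear to hold one start and one end position per repeat unit, which is why most rows are truncated when printed. The following is a minimal sketch of how such a result might be summarized, assuming Left and Right are list columns of inclusive positions. It uses base R only and rebuilds the two rows of the tail(TRs) output that are shown in full (rows 19 and 23). The name TRs_example and the added columns Copies, MeanUnitWidth, and Span are illustrative and are not part of DECIPHER.

```r
# Two complete rows copied from the tail(TRs) output (rows 19 and 23).
# Assumption: Left and Right are list columns with one entry per repeat unit.
TRs_example <- data.frame(
  Index = c(2L, 5L),
  Begin = c(24506L, 2677L),
  End   = c(24589L, 2844L),
  Score = c(10.95497, 15.51914)
)
TRs_example$Left  <- list(c(24506L, 24548L), c(2677L, 2761L))
TRs_example$Right <- list(c(24547L, 24589L), c(2760L, 2844L))

# Number of repeat units in each tandem repeat
TRs_example$Copies <- lengths(TRs_example$Left)

# Width of each unit: end - start + 1, treating positions as inclusive
unit_widths <- Map(function(l, r) r - l + 1L,
                   TRs_example$Left, TRs_example$Right)
TRs_example$MeanUnitWidth <- vapply(unit_widths, mean, numeric(1))

# Total span of each repeat region
TRs_example$Span <- TRs_example$End - TRs_example$Begin + 1L

print(TRs_example[, c("Index", "Begin", "End",
                      "Copies", "MeanUnitWidth", "Span")])
```

The same three summary steps should apply unchanged to the full TRs object returned by DetectRepeats, provided its columns have the structure assumed here.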