Gallery - Example 7
Below is the R code to reproduce the example:library(DECIPHER)
Loading required package: Biostrings
Loading required package: BiocGenerics
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, aperm, append,
as.data.frame, basename, cbind,
colnames, dirname, do.call, duplicated,
eval, evalq, Filter, Find, get, grep,
grepl, intersect, is.unsorted, lapply,
Map, mapply, match, mget, order, paste,
pmax, pmax.int, pmin, pmin.int,
Position, rank, rbind, Reduce, rownames,
sapply, setdiff, table, tapply, union,
unique, unsplit, which.max, which.min
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: ‘S4Vectors’
The following object is masked from ‘package:utils’:
findMatches
The following objects are masked from ‘package:base’:
expand.grid, I, unname
Loading required package: IRanges
Loading required package: XVector
Loading required package: GenomeInfoDb
Attaching package: ‘Biostrings’
The following object is masked from ‘package:base’:
strsplit
>
> # load the H5N1 HA protein coding gene sequence
> dna <- readDNAStringSet("InfluenzaA_hemagglutinin_CDS.fas.gz")
>
> # align sequences
> ali <- AlignPairs(dna, dna,
+ pairs=data.frame(Pattern=1, Subject=seq_along(dna)),
+ type="both",
+ processors=NULL)
|========================================| 100%
Time difference of 13.85 secs
>
> # remove sequences with internal indels
> dna <- ali$SubjectAligned[lengths(ali$Result$PatternGapPosition) == 0L]
>
> x <- InferSelection(dna, windowSize=1, showPlot=TRUE)
LnL = -39816.395
Time difference of 116.48 secs
>
> head(x) # parameter estimates
LogLikelihood theta kappa
-3.981640e+04 4.376473e+00 1.555070e+00
omega 1 omega 2 omega 3
4.539993e-05 4.034462e-02 5.905828e+02
>
> tail(x) # statistical significance
pvalue 556 pvalue 557 pvalue 558
1.722166e-06 3.547689e-06 1.940913e-03
pvalue 559 pvalue 560 pvalue 561
1.427480e-06 3.295784e-09 1.000000e+00
>
> # codons under strong positive selection
> which(x[startsWith(names(x), "omega")] > 2 &
+ x[startsWith(names(x), "pvalue")] < 0.05)
omega 540
540