DECIPHER - R Lesson #3

R Lesson #3 - Logical operations

Both integers and logicals require 32 bits per element in R. The primary difference is that logicals can only be TRUE, FALSE, or NA. Logicals are ubiquitous, as they are the output of many comparisons and other functions, as demonstrated below.

> # coercing a numeric to integer
> as.integer(1.2) # truncated toward zero
[1] 1
> as.integer(-1.2) # truncated toward zero
[1] -1
> 
> # several useful functions that return doubles
> floor(1.2)
[1] 1
> ceiling(1.2)
[1] 2
> round(1.2, digits=2)
[1] 1.2
> 
> # is.numeric is different from is.double
> is.numeric(1.2) # interpretable as number
[1] TRUE
> is.numeric(1L) # integers are also TRUE
[1] TRUE
> is.double(1.2)
[1] TRUE
> is.double(1L)
[1] FALSE
> 
> # coercion to logical: 0 = FALSE, otherwise TRUE
> as.logical(0)
[1] FALSE
> is.logical(1.2)
[1] FALSE
> as.logical(1.2)
[1] TRUE
> is.logical(as.logical(1.2))
[1] TRUE
> 
> # comparisons return a logical vector
> x <- 1
> y <- 2
> x > y
[1] FALSE
> x >= y
[1] FALSE
> x == y
[1] FALSE
> 2*x == y # multiplication before equality
[1] TRUE
> 
> # accessing help for any function
> ?Syntax # order of operations
> ?round # obtain help for any function
> ?`+` # some functions require back ticks
> 
> # initialize a logical vector
> z <- logical(100)
> object.size(z)/length(z)
4.4 bytes
> # Note that each logical element occupies 4 bytes; the remainder is fixed vector overhead
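
Because a logical can also be NA, it helps to see how a missing value behaves in logical operations. The following is a minimal sketch; the variable names are chosen here for illustration only.

```r
# a logical vector can hold TRUE, FALSE, or NA (missing)
v <- c(TRUE, FALSE, NA)
missing <- is.na(v)           # FALSE FALSE TRUE

# NA propagates unless the other operand already decides the result
na_and_false <- NA & FALSE    # FALSE, whatever the missing value is
na_or_true <- NA | TRUE       # TRUE, whatever the missing value is
na_and_true <- NA & TRUE      # NA, the result depends on the missing value

# logicals count as 0 (FALSE) and 1 (TRUE) in arithmetic
n_true <- sum(v, na.rm=TRUE)  # number of TRUE elements, ignoring NA
```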

Several special names, such as "TRUE", are reserved in R and cannot be used as variable names. Other common names, such as "c", "T", and "F", are not reserved. R allows them to be reassigned, but it is good practice not to use them as variable names.

> z[2:3] <- TRUE
> head(z)
[1] FALSE  TRUE  TRUE FALSE FALSE FALSE
> z[5] <- T # T is shorthand for TRUE, but not a reserved word
> head(z)
[1] FALSE  TRUE  TRUE FALSE  TRUE FALSE
> 
> # common names should not be changed
> T <- "hello" # Never do this!
> z[5] <- T
> head(z) # characters now!
[1] "FALSE" "TRUE"  "TRUE"  "FALSE" "hello" "FALSE"
> rm(T) # remove the variable `T`
> T # everything is back to normal
[1] TRUE
> 
> # Reserved names do not permit assignment
> ?Reserved
> TRUE <- "hello" # Error
Error in TRUE <- "hello" : invalid (do_set) left-hand side to assignment
> # common names to avoid overwriting:
> c # function names
function (..., recursive = FALSE)  .Primitive("c")
> T # TRUE shorthand
[1] TRUE
> F # FALSE shorthand
[1] FALSE
> letters # lower-case letters
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l"
[13] "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x"
[25] "y" "z"
> LETTERS # upper-case letters
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L"
[13] "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X"
[25] "Y" "Z"
> pi # 3.14...
[1] 3.141593
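
One practical consequence is that code relying on the shorthand T can change behavior if T is reassigned elsewhere. The sketch below illustrates this with two hypothetical helper functions, count_true and safe_count, which are not part of R.

```r
# hypothetical helper that relies on the shorthand `T`
count_true <- function(x) sum(x == T)

flags <- c(TRUE, TRUE, FALSE)
before <- count_true(flags)  # counts the TRUE elements

T <- 0                       # masks the built-in `T` (never do this!)
after <- count_true(flags)   # now counts the elements equal to 0
rm(T)                        # remove the mask; `T` is TRUE again

# spelling out TRUE is safe because TRUE cannot be reassigned
safe_count <- function(x) sum(x == TRUE)
safe <- safe_count(flags)
```

Writing TRUE and FALSE in full avoids the problem entirely, at the cost of a few extra characters.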

When performing comparisons, it is important to keep in mind that doubles are stored with finite precision. Therefore, comparisons between seemingly identical numbers may give an unexpected result. This is rarely encountered, but can be a nuisance when it happens. One solution is to use the all.equal function, which tests for near equality. Another work-around is to use integers in comparisons, because integers are stored exactly.

> x1 <- 0.5 - 0.3
> x2 <- 0.3 - 0.1
> x1 == x2 # comparison of numerics
[1] FALSE
> x1 != x2
[1] TRUE
> print(x1, digits=22)
[1] 0.2000000000000000111022
> print(x2, digits=22)
[1] 0.1999999999999999833467
> all.equal(x1, x2) # TRUE
[1] TRUE
> # but most of the time equality works as expected
> 2/3 == 2*(1/3) # TRUE
[1] TRUE
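
When the values differ, all.equal returns a character description of the difference rather than FALSE, so it is usually wrapped in isTRUE when a single logical is needed. The sketch below shows this alongside a manual tolerance check and the integer work-around; the threshold `tol` is an assumed value, not a recommendation.

```r
x1 <- 0.5 - 0.3
x2 <- 0.3 - 0.1

# isTRUE() turns the result of all.equal() into a single logical
same <- isTRUE(all.equal(x1, x2))      # nearly equal
differ <- isTRUE(all.equal(1, 1.1))    # not nearly equal

# manual check against an assumed tolerance
tol <- 1e-8
near_equal <- abs(x1 - x2) < tol

# integer arithmetic is exact, so == is reliable
i1 <- 5L - 3L
i2 <- 3L - 1L
int_equal <- i1 == i2
```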

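The single and double logical operators are illustrated below. The single operators (& and |) work element-wise, comparing the inputs in pairs of elements and returning a logical vector. The double operators (&& and ||) return a single logical, and therefore are preferred in situations where only one logical is required, such as the condition of an if statement. In versions of R before 4.3.0, the double operators silently used only the first element of each input, as in the output below. Since R 4.3.0, giving them an input longer than one element is an error.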

> a <- c(T, T, F, F)
> a
[1]  TRUE  TRUE FALSE FALSE
> !a
[1] FALSE FALSE  TRUE  TRUE
> b <- c(T, F, T, F)
> a & b # Truth table for AND operation
[1]  TRUE FALSE FALSE FALSE
> a | b # Truth table for OR operation
[1]  TRUE  TRUE  TRUE FALSE
> a && b # first elements only, no warning (an error since R 4.3.0)
[1] TRUE
> a || b # first elements only, no warning (an error since R 4.3.0)
[1] TRUE
> a[2] && b[2]
[1] FALSE
> a[2] || b[2]
[1] TRUE
> xor(a, b) # Exclusive OR (XOR)
[1] FALSE  TRUE  TRUE FALSE
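
The double operators also evaluate their right-hand side only when it is needed to determine the result. The sketch below demonstrates this with a call to stop() that is never reached, and shows one way to reduce vectors to a single logical with all() and any() before applying && or ||. The variable names are illustrative.

```r
# && skips the right side when the left side is FALSE
skipped_and <- FALSE && stop("never evaluated")

# || skips the right side when the left side is TRUE
skipped_or <- TRUE || stop("never evaluated")

a <- c(TRUE, TRUE, FALSE, FALSE)
b <- c(TRUE, FALSE, TRUE, FALSE)

# reduce each vector to one logical before using && or ||
both_all <- all(a) && all(b)    # are all elements of a and of b TRUE?
either_any <- any(a) || any(b)  # does a or b contain any TRUE?
```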

The which function takes a logical input and returns the indices of the elements that are TRUE. The which.max and which.min functions are variants that return the index of the first maximum or minimum element. The examples below also introduce two functions for generating random numbers: runif for drawing from a uniform distribution, and rnorm for drawing from a normal distribution.

> r <- runif(100) # 100 random numbers
> r # roughly uniformly distributed between 0 and 1
  [1] 0.85667156 0.34026964 0.44424143 0.48037833
  [5] 0.87667115 0.65841114 0.92110802 0.21177947
  [9] 0.30524428 0.12875139 0.46554814 0.73656323
 [13] 0.78056957 0.35762074 0.07626947 0.03351947
 [17] 0.77606616 0.06700093 0.52778631 0.48712420
 [21] 0.66335777 0.56610838 0.44025036 0.63570422
 [25] 0.99940558 0.20624952 0.31441187 0.77470678
 [29] 0.63578651 0.65580460 0.86104108 0.37896363
 [33] 0.76623888 0.22145695 0.25397960 0.07959553
 [37] 0.72126320 0.69776325 0.09309901 0.56508622
 [41] 0.25661917 0.82514489 0.56554523 0.06238280
 [45] 0.96626279 0.74489289 0.13250879 0.46242389
 [49] 0.96865750 0.74094259 0.38335198 0.66505184
 [53] 0.47497121 0.03714927 0.69987399 0.13834235
 [57] 0.68731832 0.18809398 0.60494391 0.33094660
 [61] 0.80798206 0.44554323 0.07992541 0.93656027
 [65] 0.06502416 0.78303341 0.94759327 0.06641417
 [69] 0.48330766 0.24743624 0.96726025 0.52037367
 [73] 0.17429884 0.33773523 0.02584235 0.13105126
 [77] 0.23764974 0.98281393 0.77911002 0.19064267
 [81] 0.58837599 0.94003876 0.39637532 0.38549449
 [85] 0.36373678 0.61239700 0.72690764 0.37245687
 [89] 0.92261563 0.46138683 0.89940080 0.87923886
 [93] 0.30293246 0.57432447 0.68368253 0.75951094
 [97] 0.99441555 0.13772023 0.65303211 0.92292185
> r < 0.5
  [1] FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE
  [9]  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE
 [17] FALSE  TRUE FALSE  TRUE FALSE FALSE  TRUE FALSE
 [25] FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE
 [33] FALSE  TRUE  TRUE  TRUE FALSE FALSE  TRUE FALSE
 [41]  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE  TRUE
 [49] FALSE FALSE  TRUE FALSE  TRUE  TRUE FALSE  TRUE
 [57] FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE FALSE
 [65]  TRUE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE
 [73]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE
 [81] FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE  TRUE
 [89] FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE
 [97] FALSE  TRUE FALSE FALSE
> # `which` takes a logical input
> w <- which(r < 0.5)
> w # indices that are TRUE, where r < 0.5
 [1]  2  3  4  8  9 10 11 14 15 16 18 20 23 26 27 32
[17] 34 35 36 39 41 44 47 48 51 53 54 56 58 60 62 63
[33] 65 68 69 70 73 74 75 76 77 80 83 84 85 88 90 93
[49] 98
> r[w]
 [1] 0.34026964 0.44424143 0.48037833 0.21177947
 [5] 0.30524428 0.12875139 0.46554814 0.35762074
 [9] 0.07626947 0.03351947 0.06700093 0.48712420
[13] 0.44025036 0.20624952 0.31441187 0.37896363
[17] 0.22145695 0.25397960 0.07959553 0.09309901
[21] 0.25661917 0.06238280 0.13250879 0.46242389
[25] 0.38335198 0.47497121 0.03714927 0.13834235
[29] 0.18809398 0.33094660 0.44554323 0.07992541
[33] 0.06502416 0.06641417 0.48330766 0.24743624
[37] 0.17429884 0.33773523 0.02584235 0.13105126
[41] 0.23764974 0.19064267 0.39637532 0.38549449
[45] 0.36373678 0.37245687 0.46138683 0.30293246
[49] 0.13772023
> 
> # all returns TRUE if all inputs are TRUE
> all(r[w]==r[r < 0.5])
[1] TRUE
> # any returns TRUE if any input is TRUE
> any(r > 1)
[1] FALSE
> 
> min(r)
[1] 0.02584235
> max(r)
[1] 0.9994056
> mean(r)
[1] 0.5178385
> which.min(r)
[1] 75
> which(r==min(r))
[1] 75
> r <- c(r, r) # repeat `r` twice
> which.max(r) # first occurrence only
[1] 25
> which(r==max(r)) # all occurrences
[1]  25 125
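
Two related idioms are worth a short sketch. Logicals can be summed or averaged to count TRUE elements, and which() differs from direct logical indexing when NA is present. The variable names below are illustrative, and no output is shown for the random values because they change on every run.

```r
r <- runif(100)
below <- r < 0.5

n_below <- sum(below)      # number of elements below 0.5
frac_below <- mean(below)  # fraction of elements below 0.5

# which() and logical indexing select the same elements here
same_selection <- identical(r[which(below)], r[below])

# which() drops NA, whereas logical indexing returns NA for it
v <- c(3, NA, 1)
by_which <- v[which(v > 2)]  # 3
by_logical <- v[v > 2]       # 3 NA
```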

It is also possible to set the random number generator's seed. This results in the same series of random numbers every time, which is useful for making results reproducible.

> set.seed(123L)
> r1 <- rnorm(100) # 100 normally distributed random numbers
> set.seed(123L)
> r2 <- rnorm(100) # restart from the same point
> all(r1 == r2) # the same set of random numbers
[1] TRUE
> r3 <- rnorm(100) # continue without restarting
> any(r1 == r3) # all different random numbers
[1] FALSE
> length(which(r1 > r3))/length(r3) # about half
[1] 0.58
> set.seed(NULL) # re-initializes the seed
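
One way to apply this is to set the seed inside a function so that each call is repeatable. The sketch below uses a hypothetical helper, draw, which is not part of R. The seed values are arbitrary.

```r
# hypothetical helper: draw n normal values starting from a given seed
draw <- function(n, seed) {
  set.seed(seed)
  rnorm(n)
}

d1 <- draw(5, 42L)
d2 <- draw(5, 42L)  # same seed, same numbers
d3 <- draw(5, 43L)  # different seed, different numbers

same_seed <- identical(d1, d2)
other_seed <- identical(d1, d3)

set.seed(NULL)      # re-initialize the seed afterwards
```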
