Basic R

Ranae Dietzel & Andee Kaplan

R as a huge calculator

#addition, powers, division, multiplication
2 + 3
## [1] 5
3^3
## [1] 27
2/7
## [1] 0.2857143
3*108
## [1] 324
#modulo (remainder)
33%%4 
## [1] 1

Using functions

Functions are reusable pieces of code. R has many functions built in, and many more are available in packages that you load separately.

#pass values to functions as arguments
log(2, base = 2)
## [1] 1
exp(1)
## [1] 2.718282
cos(2*pi)
## [1] 1
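
Arguments can be matched by position or by name. The sketch below reuses log from above; the value 8 is only an illustration, and the default base (e) is the one documented in ?log.

```r
# named argument, as in the example above
log(8, base = 2)
## [1] 3
# the same call with both arguments matched by position
log(8, 2)
## [1] 3
# named arguments can be given in any order
log(base = 2, x = 8)
## [1] 3
# leaving base out falls back to the default, the natural log
log(8)
## [1] 2.079442
```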

Storing values

We can store numbers or the results of calling functions for later use in variables using the assignment operator <-.

#assignment
x <- 10
y <- cos(2*pi)

We can then use these variables in later calculations.

#calculation on variable values
exp(x)
## [1] 22026.47
x^y
## [1] 10

Why the funny arrow assignment?

Why don’t we just use = (which also works)?
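
One practical difference, offered here as a sketch rather than as the slide's full answer: at the top level both operators assign, but inside a function call = only names an argument, while <- still creates a variable. The names a, b and n_copies are made up for this example.

```r
# at the top level, = and <- do the same thing
a = 5
b <- 5
identical(a, b)
## [1] TRUE

# inside a call, = matches the value to the argument named times
rep("a", times = 3)
## [1] "a" "a" "a"

# inside a call, <- creates the variable n_copies and passes its value along
rep("a", times = n_copies <- 3)
## [1] "a" "a" "a"
n_copies
## [1] 3
```

Using <- for assignment and = for arguments keeps the two jobs visually separate.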

Variable creation tips

Vectors

We can store multiple values in a variable using vectors. To create a vector, we can combine values with c() or use sequencing functions such as : and seq(). Each element in a vector must be of the same type.

#combine values
(nightstand_books <- c("Harry Potter and the Goblet of Fire", "A Storm of Swords", "On Writing Well", "Advanced R", "R Packages"))
## [1] "Harry Potter and the Goblet of Fire"
## [2] "A Storm of Swords"                  
## [3] "On Writing Well"                    
## [4] "Advanced R"                         
## [5] "R Packages"
#sequences
(nums <- 1:10)
##  [1]  1  2  3  4  5  6  7  8  9 10
(nums_2 <- seq(1, 10, by = 2))
## [1] 1 3 5 7 9
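
Because every element must share one type, R silently converts mixed input to the most general type present. A small sketch, with the names mixed and flags chosen only for illustration:

```r
# numbers mixed with a string: everything becomes character
mixed <- c(1, "two", 3)
class(mixed)
## [1] "character"

# logicals mixed with a number: TRUE/FALSE become 1/0
flags <- c(TRUE, FALSE, 2)
flags
## [1] 1 0 2
```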

Your turn

helpful functions

  1. Use the rep function to create and store a vector containing the values 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5
  2. Use the rep function to create and store a vector containing the values 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
  3. What is the third power of each integer from 1 to 50?
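
As a warm-up for these exercises, here is a sketch of the two rep arguments and of element-wise arithmetic, using different values so the answers are left to you.

```r
# each = repeats every element before moving on to the next
rep(c("a", "b"), each = 2)
## [1] "a" "a" "b" "b"
# times = repeats the whole vector
rep(c("a", "b"), times = 2)
## [1] "a" "b" "a" "b"
# arithmetic on a vector is applied to every element
(1:5)^2
## [1]  1  4  9 16 25
```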

Data frames

head(mtcars)
##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
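
A data frame is a table in which each column is a vector of the same length, and columns may differ in type. We can build a small one ourselves; the pets data below is invented for illustration.

```r
# each argument becomes a column; all columns have 3 elements
pets <- data.frame(
  name = c("Rex", "Tom", "Nemo"),
  legs = c(4, 4, 0),
  indoor = c(FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)
# rows, then columns
dim(pets)
## [1] 3 3
pets$legs
## [1] 4 4 0
```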

Accessing parts of data ($)

Depending on what we have (data frame vs. vector), there are multiple ways to access part of the data.

For data frames, we can access columns using the $ operator.

#accessing data using $
head(mtcars$mpg)
## [1] 21.0 21.0 22.8 21.4 18.7 18.1

Accessing parts of data (indices)

We can also use numeric indexing, which starts at 1. For data frames, we write [rows, columns] and can refer to columns by their string names.

#accessing data using numeric indexing
mtcars[1:6, 1]
## [1] 21.0 21.0 22.8 21.4 18.7 18.1
mtcars$mpg[1]
## [1] 21
#accessing data using string names of columns
mtcars[c(1, 3, 5), "mpg"]
## [1] 21.0 22.8 18.7
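
A few more indexing forms that follow the same pattern, as a sketch; nums is recreated here so the block runs on its own.

```r
# negative indices drop elements instead of selecting them
nums <- 1:10
nums[-(1:7)]
## [1]  8  9 10
# several rows and several named columns at once; the result is a data frame
mtcars[1:2, c("mpg", "hp")]
# [[ ]] extracts a single column as a vector, like $
identical(mtcars[["mpg"]], mtcars$mpg)
## [1] TRUE
```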

Logicals

R supports logicals, with TRUE and FALSE as built-in Boolean values. We can create logicals with <, >, <=, >=, ==, !=, is.na, etc.

#logicals
3 < 5
## [1] TRUE
x <- c(2, 4, 5, NA, 100)
is.na(x)
## [1] FALSE FALSE FALSE  TRUE FALSE

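We can combine logical statements element by element using and (&) and or (|). The doubled forms && and || are meant for single TRUE/FALSE values, such as the condition of an if statement.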
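
A short sketch of the difference, with p and q as made-up logical vectors:

```r
p <- c(TRUE, TRUE, FALSE)
q <- c(TRUE, FALSE, FALSE)
# & and | compare the vectors position by position
p & q
## [1]  TRUE FALSE FALSE
p | q
## [1]  TRUE  TRUE FALSE
# && and || take one TRUE/FALSE on each side
(3 < 5) && (2 > 1)
## [1] TRUE
```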

Indexing with logicals

We can also use logical statements to access parts of our data (think filtering).

#indexing with logicals
x[!is.na(x)]
## [1]   2   4   5 100
x[x > 10]
## [1]  NA 100
x[x > 10 & !is.na(x)]
## [1] 100
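
The same idea filters the rows of a data frame: put the logical vector in the row slot and leave the column slot empty. The scores data frame below is invented for illustration.

```r
scores <- data.frame(
  student = c("Ana", "Ben", "Cal", "Dee"),
  points = c(88, 52, NA, 95),
  stringsAsFactors = FALSE
)
# keep rows where points is known and above 80; all columns are kept
passed <- scores[!is.na(scores$points) & scores$points > 80, ]
passed$student
## [1] "Ana" "Dee"
# which() turns a logical vector into positions and skips NA
which(scores$points > 80)
## [1] 1 4
```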

Your turn

Using the mtcars dataset,

  1. Which car has greater than 30 mpg and over 100 horsepower?
  2. How many cars have 4 cylinders? Hint: Using the sum function on logicals returns how many TRUE values there are.
  3. Calculate the average horsepower for the cars with 4 cylinders.

Modifying data

We can use indexing to change values in our data or to add columns to a data frame.

#using indexing to change certain values
hp_per_cyl <- mtcars$hp/mtcars$cyl
hp_per_cyl[1] <- NA
#assigning a string converts the whole vector to character
hp_per_cyl[2] <- "???"
hp_per_cyl[1:10]
##  [1] NA                 "???"              "23.25"           
##  [4] "18.3333333333333" "21.875"           "17.5"            
##  [7] "30.625"           "15.5"             "23.75"           
## [10] "20.5"
#add column
mtcars$hp_per_cyl <- hp_per_cyl
head(mtcars, 1)
##           mpg cyl disp  hp drat   wt  qsec vs am gear carb hp_per_cyl
## Mazda RX4  21   6  160 110  3.9 2.62 16.46  0  1    4    4       <NA>

Data types

#structure
str(mtcars[, 1:3])
## 'data.frame':    32 obs. of  3 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
#type conversion
as.character(mtcars$mpg)[1:6]
## [1] "21"   "21"   "22.8" "21.4" "18.7" "18.1"
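
Conversion also works in the other direction. A sketch using class and as.numeric; the name bad is made up for this example.

```r
# class() reports the type of a single column
class(mtcars$mpg)
## [1] "numeric"
# strings that look like numbers convert cleanly
as.numeric(c("21", "22.8"))
## [1] 21.0 22.8
# anything else becomes NA; without suppressWarnings() R also warns about it
bad <- suppressWarnings(as.numeric("???"))
is.na(bad)
## [1] TRUE
```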

Moar functions

There are a lot more functions. Here are some examples.

#moar functions
x <- mtcars$mpg
length(x)
## [1] 32
sum(x)
## [1] 642.9

Stats

#basic statistical functions
mean(x)
## [1] 20.09062
sd(x)
## [1] 6.026948
summary(x)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   10.40   15.42   19.20   20.09   22.80   33.90
quantile(x, probs = 0.5)
##  50% 
## 19.2
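
These functions return NA as soon as the data contains a missing value. Most of them accept na.rm = TRUE to drop missing values first, as this sketch with a made-up vector shows.

```r
with_gap <- c(2, 4, 5, NA, 100)
# one NA makes the result NA
mean(with_gap)
## [1] NA
# na.rm = TRUE removes missing values before calculating
mean(with_gap, na.rm = TRUE)
## [1] 27.75
sum(with_gap, na.rm = TRUE)
## [1] 111
```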