In the beginning...

Posted by Rafael Martinez-Feria

Today, the R programming language is being adopted by an ever greater number of academics and enterprise users for a wide array of applications. From big data analytics to database management (through integration with SQL) and interactive web data apps, the R community is a growing force, challenging more traditional statistical and analytics platforms. But what makes R so special? Where does R come from anyway?

A good place to start searching for answers is Ihaka and Gentleman’s 1996 paper “R: A Language for Data Analysis and Graphics”. This paper gives some interesting insights into how the R language was developed.

R starts with S

R is a programming language that was developed specifically for statistical analysis and graphics. As such, much of its functionality is geared toward manipulating and storing data.

Ihaka and Gentleman explain that the key to R was to combine useful features of the Scheme language with an S-like syntax. So we can think of R as essentially a different implementation of the S language and environment. For its part, S is a 40-year-old language developed at Bell Laboratories. The video below contains a lecture by Richard Becker describing a bit of the history of how the S language came about.

Of course, there are some important differences from both S and Scheme. But the syntax is similar enough that much S code runs in R. This happy marriage provided advantages in key areas such as portability, computational efficiency, memory management, and scoping.

The important differences

One difference worth seeing in action is scoping. R follows Scheme in using lexical scoping, so a function can see, and modify, the variables of the environment in which it was defined. The counter below keeps its own private n, separate from the global n.

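n <- 10  # a global n, which the counter never touches

# count() returns a function that remembers the n it was created with
count <- function(n = 0) {
  function() {
    n <<- n + 1  # <<- updates the n captured from count(), not the global n
    return(n)
  }
}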

counter <- count()
counter()
## [1] 1
counter()
## [1] 2
counter()
## [1] 3
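
To check that the state really lives inside the returned function and not in the global environment, here is a minimal sketch that rebuilds the same pattern and creates two counters side by side. The names make_counter, a, and b are hypothetical and chosen only for illustration; they do not come from the paper.

```r
n <- 10  # global variable, deliberately named like the counter's variable

# make_counter is a hypothetical name for the same pattern as count() above
make_counter <- function(n = 0) {
  function() {
    n <<- n + 1  # modifies the n in the enclosing function's environment
    n
  }
}

a <- make_counter()     # starts from the default, 0
b <- make_counter(100)  # starts from 100, with its own separate n

a()
## [1] 1
a()
## [1] 2
b()
## [1] 101
n
## [1] 10
```

Each call to make_counter() creates a fresh environment, so a and b count independently, and the global n stays at 10 throughout.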

Summary

With its S-like syntax and its Scheme legacy, R offers users a relatively straightforward, yet flexible language for developing a wide array of applications.