R Basics



1. Installing Packages

R packages can be installed in (at least) three ways. The most common source is CRAN, shown below.

Package Source   Method
CRAN             install.packages("tidyverse")

Load Installed Packages

Use the “library” command to load an installed package. You install a package once, but you must run “library” every time you start a new R session.

library("tidyverse")
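One possible way to combine the two steps, sketched here as an assumption rather than something this post prescribes, is to check whether a package is present before deciding to install it. The helper name `needs_install` is hypothetical; `requireNamespace()` is a base R function.

```r
# Hypothetical helper: TRUE when a package is not installed yet.
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

pkg <- "tidyverse"
if (needs_install(pkg)) {
  # Installing needs an internet connection, so it is only suggested here.
  message("Run install.packages(\"", pkg, "\") once, then library(", pkg, ").")
} else {
  message(pkg, " is installed; load it with library(", pkg, ").")
}
```

The check never installs anything by itself, so it is safe to leave at the top of a script.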

2. R as a Calculator

R as an Ordinary Calculator

Name      Action                          Example
“+”       Add                             5+2
“-”       Subtract                        5-2
“*”       Multiply                        5*2
“/”       Divide                          5/2
“^”       Power                           5^2
“%%”      Remainder in integer division   5%%2
“%/%”     Quotient in integer division    5%/%2
“pi”      Constant “pi”                   2*pi
“exp(1)”  Constant “e”                    exp(1)
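The two integer-division operators are easy to mix up: in R, `%/%` returns the integer quotient and `%%` returns the remainder. A short sketch (the variable names `a` and `b` are only for illustration):

```r
a <- 5
b <- 2

a %/% b   # integer quotient: 2
a %% b    # remainder: 1

# The two results always recombine to the original number.
(a %/% b) * b + (a %% b) == a   # TRUE

# With a negative number the quotient is rounded down,
# so the remainder takes the sign of the divisor.
(-5) %/% 2   # -3
(-5) %% 2    # 1
```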

R as a Scientific Calculator

Now that you have R installed, let us use the software for data analysis. This section covers simple mathematical operations so that R can replace your scientific calculator.

Name    Action                       Example
abs     Absolute value of a number   x=-7; abs(x)
sqrt    Square root of a number      sqrt(2)
exp     Exponential function         exp(2.7)
log     Natural logarithm            log(2.7)
log10   Logarithm with base 10       log10(2.7)
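As a quick check of how these functions relate to each other, the sketch below uses round numbers whose results are easy to verify by hand. The `base` argument of `log()` is part of base R; the particular inputs are only illustrative.

```r
x <- -7
abs(x)              # 7
sqrt(16)            # 4

# log() is the natural logarithm, so it undoes exp().
log(exp(2.7))       # 2.7

# Other bases go through the 'base' argument.
log(8, base = 2)    # 3
log10(1000)         # 3
```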

3. Piping

The pipe operator %>% comes from the “magrittr” package and becomes available when you load “tidyverse”. Make sure you load the package first.

Name   Action                                                Example
%>%    Passes the value on its left to the function on its right   2 %>% sqrt %>% log

The following two commands are equivalent.

sqrt(2)

## [1] 1.414214

2 %>% sqrt

## [1] 1.414214

Multistep piping -

22 %>% exp %>% log

## [1] 22

You can read it as ‘take 22, apply the exponential, then apply the log’. Because log is the inverse of exp, these two steps, one after another, give you back the original number.
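Piping also works when a step needs extra arguments: the piped value fills the first argument, and the rest go in parentheses. The sketch below loads `magrittr` directly (an assumption made to keep the example small; loading “tidyverse” provides the same operator) and compares a nested call with its piped form.

```r
library(magrittr)  # provides %>%; library("tidyverse") makes it available too

# Nested call: read from the inside out.
nested <- round(log(sqrt(2)), 3)

# Piped call: read from left to right.
# round(3) means round(<piped value>, 3).
piped <- 2 %>% sqrt %>% log %>% round(3)

nested                     # 0.347
identical(nested, piped)   # TRUE
```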

4. Vectors

The R programming language is built around vectors. Here we show five ways to create them.

Different methods for creating vectors

Method              Description
Function ‘c’        Vector with given values
Function ‘:’        Vector with a range of consecutive integers
Function ‘seq’      Vector with equal spacing
Function ‘rep’      Vector with repeated values
Function ‘sample’   Random vector

(i) The function ‘c’

Use the function ‘c’ to create an arbitrary vector. After a vector is created, it can be accessed as a whole or by position. Remember that the index of the first position is 1, not 0 as in many other programming languages (C, Java, Python).

x=c(22,33,44,54,1,2,97)
x[1]

## [1] 22

Character vector -

c("John", "Juan", "Jason")

## [1] "John"  "Juan"  "Jason"

Logical vector -

c(TRUE, FALSE, TRUE)

## [1]  TRUE FALSE  TRUE
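Access by position goes beyond a single index. The following sketch reuses the vector `x` from the start of this subsection and shows a few common forms of indexing; the particular positions chosen are only illustrative.

```r
x <- c(22, 33, 44, 54, 1, 2, 97)

x            # the whole vector
length(x)    # 7
x[2:4]       # positions 2 to 4: 33 44 54
x[c(1, 7)]   # first and last: 22 97
x[-1]        # everything except position 1
x[x > 40]    # elements larger than 40: 44 54 97
```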

(ii) The function ‘:’

Increasing integers -

2:10

## [1]  2  3  4  5  6  7  8  9 10

Decreasing integers -

7:3

## [1] 7 6 5 4 3
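One detail worth knowing is that ‘:’ binds more tightly than ordinary arithmetic, so parentheses matter when an end point is computed. A small sketch, with `n` as an illustrative variable:

```r
n <- 5

1:n - 1     # evaluated as (1:n) - 1, giving 0 1 2 3 4
1:(n - 1)   # gives 1 2 3 4
```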

(iii) The function ‘seq’

seq(3,10,2)

## [1] 3 5 7 9
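The three arguments in the call above are the start, the end and the step. Naming them makes the call easier to read, and `length.out` can be used in place of a step when the number of points is what matters. The values below are only illustrative.

```r
seq(from = 3, to = 10, by = 2)   # 3 5 7 9
seq(0, 1, length.out = 5)        # 0.00 0.25 0.50 0.75 1.00
seq_len(4)                       # 1 2 3 4
```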

(iv) The function ‘rep’

rep(5,10)

##  [1] 5 5 5 5 5 5 5 5 5 5
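‘rep’ can also repeat a whole vector, either as a block (`times`) or element by element (`each`). A brief sketch with made-up values:

```r
rep(5, times = 10)            # same as rep(5, 10)
rep(c("a", "b"), times = 3)   # "a" "b" "a" "b" "a" "b"
rep(c("a", "b"), each = 2)    # "a" "a" "b" "b"
```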

(v) The function ‘sample’

sample(c('H','T'),10,replace=TRUE)

##  [1] "T" "T" "H" "T" "H" "T" "H" "H" "H" "T"

There are additional functions to create random vectors with binomial, Gaussian (bell curve) and other distributions.
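As a rough illustration of those additional functions, the sketch below draws from a binomial, a Gaussian and a uniform distribution using `rbinom`, `rnorm` and `runif` from base R. The seed value and the parameters are arbitrary assumptions; `set.seed` only makes the random draws repeatable.

```r
set.seed(123)   # arbitrary seed, so the same numbers come out on every run

coin  <- sample(c("H", "T"), 10, replace = TRUE)   # 10 fair coin flips
heads <- rbinom(5, size = 10, prob = 0.5)          # number of heads in 10 flips, 5 times
bell  <- rnorm(5, mean = 0, sd = 1)                # 5 draws from a standard Gaussian
unif  <- runif(5)                                  # 5 draws between 0 and 1
```

Without the `set.seed` call, each run produces a different random vector.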

Concatenating Vectors

You can also combine vectors generated by the above methods. The function ‘c’ merges vectors; when their types differ, all elements are converted to one common type.

v1=1:10
v2=c(1,2,7,11)

c(v1,v2)

##  [1]  1  2  3  4  5  6  7  8  9 10  1  2  7 11
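A vector holds a single type, so ‘c’ converts its inputs when they differ. The sketch below shows the conversion for the two vectors used above and for two small made-up cases.

```r
v1 <- 1:10             # integer vector
v2 <- c(1, 2, 7, 11)   # double (numeric) vector

class(v1)              # "integer"
class(c(v1, v2))       # "numeric"

c(1, TRUE)             # 1 1
c(1, "a", TRUE)        # "1" "a" "TRUE"
```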

5. Functions Operating on Vectors

Some functions have no counterpart on a scientific calculator because they operate on a whole vector rather than on a single number. The sum of a vector is a good example.

Name      Action
head      First few elements of a vector
tail      Last few elements of a vector
sum       Sum of the elements of a numeric vector
mean      Mean of the elements of a numeric vector
median    Median
sd        Standard deviation
var       Variance
summary   Summary statistics
table     Counts of each distinct value in a vector

vec = c(1, 22, 33, 44, 54, 1, 2, 97, 22)
vec %>% head

## [1]  1 22 33 44 54  1

vec %>% head(1)

## [1] 1

vec %>% tail(3)

## [1]  2 97 22

vec %>% sum

## [1] 276

vec %>% mean

## [1] 30.66667

vec %>% median

## [1] 22

vec %>% sd

## [1] 31.34486

vec %>% var

## [1] 982.5

vec %>% summary

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.00    2.00   22.00   30.67   44.00   97.00

vec %>% table

## .
##  1  2 22 33 44 54 97 
##  2  1  2  1  1  1  1
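Several of these functions are related to one another, which gives a simple way to sanity-check results. The sketch below reuses `vec`; the vector `with_na` is a hypothetical addition to show how a missing value (NA) is handled.

```r
vec <- c(1, 22, 33, 44, 54, 1, 2, 97, 22)

sum(vec) / length(vec)        # same as mean(vec)
sqrt(var(vec))                # same as sd(vec)

# A missing value propagates unless na.rm = TRUE is given.
with_na <- c(vec, NA)
mean(with_na)                 # NA
mean(with_na, na.rm = TRUE)   # same as mean(vec)
```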

6. Data Visualization

Base R comes with a number of plotting functions, but here we cover only two: hist (to draw histograms) and plot (to draw scatterplots). For more extensive plotting tasks, we recommend that readers learn and use the powerful ggplot2 package.

Name   Action
hist   Draw a histogram
plot   Draw a scatterplot

Using hist()

x=c(rep(1,10),rep(2,10),rep(3,10))
hist(x)
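The bins that hist() picks by default may not line up with the values in the data. A minimal sketch, assuming one bin per integer value is wanted: the `breaks` vector below is my own choice, while `breaks`, `plot`, `main` and `xlab` are standard arguments of hist().

```r
x <- c(rep(1, 10), rep(2, 10), rep(3, 10))

# One bin per value: break points halfway between the integers.
breaks <- seq(0.5, 3.5, by = 1)

# plot = FALSE returns the binning without drawing anything.
h <- hist(x, breaks = breaks, plot = FALSE)
h$counts    # 10 10 10

# The same histogram, drawn with a title and an axis label.
hist(x, breaks = breaks, main = "Histogram of x", xlab = "Value")
```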

Using plot()

The plot() function can be used to draw scatterplots. It takes two vectors of equal length as input and draws each pair of corresponding elements as a point (x, y).

v1=c(1,3,8,9,12)
v2=c(3,4,5,1,2)
plot(v1,v2)
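The default plot has no title and uses the variable names as axis labels. The sketch below adds labels and a point style through standard plot() arguments; the title text, symbol and colour are arbitrary choices.

```r
v1 <- c(1, 3, 8, 9, 12)
v2 <- c(3, 4, 5, 1, 2)

# plot() needs vectors of equal length, so check before drawing.
stopifnot(length(v1) == length(v2))

plot(v1, v2,
     main = "v2 against v1",     # title
     xlab = "v1", ylab = "v2",   # axis labels
     pch = 19,                   # filled circles
     col = "blue")               # point colour
```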
