R Basics
By admin on October 21, 2019
1. Installing Packages
R packages can be installed in (at least) three ways. The most common source, CRAN, is shown here.
Package Source | Method |
---|---|
CRAN | install.packages("tidyverse") |
Load Installed Packages
Use the “library” command to load an installed package. You install a package once, but you must run “library” in every new R session.
library("tidyverse")
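The install-once, load-every-time pattern can be wrapped in a small helper. This is a minimal sketch, not an official API: the function name `load_or_install` is hypothetical, and it assumes the package is available on CRAN.

```r
# Hypothetical helper (not part of R or tidyverse): install a CRAN package
# only when it is missing, then load it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # runs only when the package is not yet installed
  }
  library(pkg, character.only = TRUE)  # needed because pkg is a string variable
}

# "stats" ships with R, so this call loads it without installing anything.
load_or_install("stats")
```

Replacing "stats" with "tidyverse" would install the package on first use and simply load it afterwards.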
2. R as a Calculator
R as an Ordinary Calculator
Name | Action | Example |
---|---|---|
“+” | Add | 5+2 |
“-” | Subtract | 5-2 |
“*” | Multiply | 5*2 |
“/” | Divide | 5/2 |
“^” | Power | 5^2 |
“%%” | Remainder in integer division | 5%%2 |
“%/%” | Quotient in integer division | 5%/%2 |
“pi” | Constant “pi” | 2*pi |
“exp(1)” | Constant “e” | exp(pi) |
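The two integer-division operators are easy to mix up, so here is a short check of how they fit together. The variable names are only for illustration.

```r
a <- 5
b <- 2

q <- a %/% b  # integer quotient
r <- a %% b   # remainder
q
## [1] 2
r
## [1] 1

# Quotient and remainder always rebuild the original number.
q * b + r == a
## [1] TRUE

# With a negative number the quotient is rounded down, not toward zero.
(-5) %/% 2
## [1] -3
(-5) %% 2
## [1] 1
```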
R as a Scientific Calculator
Now that you have R installed, let us use the software for data analysis. This section covers simple mathematical operations so that R can replace your scientific calculator.
Name | Action | Example |
---|---|---|
abs | Absolute value | x=-7; abs(x) |
sqrt | Square root of a number | sqrt(2) |
exp | Exponential function | exp(2.7) |
log | Natural logarithm | log(2.7) |
log10 | Logarithm with base 10 | log10(2.7) |
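The functions in the table can be nested, and ‘log’ also accepts a ‘base’ argument for logarithms other than the natural one. A brief sketch with values chosen so the results are easy to verify by hand:

```r
# Nesting: absolute value first, then square root.
root <- sqrt(abs(-16))
root
## [1] 4

# log() undoes exp(), so this returns the constant e's exponent, 1.
log(exp(1))
## [1] 1

# Logarithm with an arbitrary base: 2^3 = 8.
log2_of_8 <- log(8, base = 2)
log2_of_8
## [1] 3

log10(1000)
## [1] 3
```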
3. Piping
The piping operator “%>%” is defined in the “magrittr” package and becomes available when you load “tidyverse”. Make sure you load the package before using it.
Name | Action | Example |
---|---|---|
%>% | Passes the value on its left as the first argument of the function on its right | 2 %>% sqrt %>% log |
The following two commands are equivalent.
sqrt(2)
## [1] 1.414214
2 %>% sqrt
## [1] 1.414214
Multistep piping -
22 %>% exp %>% log
## [1] 22
You can read it as ‘take 22, apply the exponential function, then take the log’. Because the natural logarithm is the inverse of the exponential function, these two steps, one after another, give you back the original number.
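Piping also works with functions that need extra arguments: the piped value fills the first argument, and the remaining ones go inside the parentheses. A minimal sketch, assuming the “magrittr” package is installed (loading “tidyverse” makes the same operator available):

```r
library(magrittr)  # provides %>%; also available after library("tidyverse")

# The piped value becomes the first argument; base = 2 is passed as usual.
a <- 16 %>% log(base = 2)
b <- log(16, base = 2)
a
## [1] 4
all.equal(a, b)
## [1] TRUE

# A whole vector can be piped through several functions in turn.
total <- c(1, 4, 9) %>% sqrt %>% sum
total
## [1] 6
```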
4. Vectors
The R programming language is built on top of vectors. Here we show five ways to create them.
Method | Description |
---|---|
Function ‘c’ | Vector with given values |
Function ‘:’ | Vector with a range of numbers |
Function ‘seq’ | Vector with equal spacing |
Function ‘rep’ | Vector with identical numbers |
Function ‘sample’ | Random vector |
(i) The function ‘c’
Use the function ‘c’ to create an arbitrary vector. Once a vector is created, it can be accessed as a whole or by position. Remember that the index of the first position is 1, not 0 as in many other programming languages (C, Java, Python).
x=c(22,33,44,54,1,2,97)
x[1]
## [1] 22
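Beyond a single position, a vector can be indexed with a range, a set of positions, negative positions (to drop elements), or a logical condition. A short sketch using the same vector:

```r
x <- c(22, 33, 44, 54, 1, 2, 97)

length(x)    # number of elements
## [1] 7
x[2:4]       # positions 2 through 4
## [1] 33 44 54
x[c(1, 7)]   # first and last positions
## [1] 22 97
x[-1]        # everything except the first element
## [1] 33 44 54  1  2 97
x[x > 40]    # elements that satisfy a condition
## [1] 44 54 97
```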
Character Vector
c("John", "Juan", "Jason")
## [1] "John" "Juan" "Jason"
Logical Vector
c(TRUE, FALSE, TRUE)
## [1] TRUE FALSE TRUE
(ii) The function ‘:’
Increasing integers -
2:10
## [1] 2 3 4 5 6 7 8 9 10
Decreasing integers -
7:3
## [1] 7 6 5 4 3
(iii) The function ‘seq’
seq(3,10,2)
## [1] 3 5 7 9
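The three positional arguments above are the start, the end, and the step. Naming them makes the call easier to read, and ‘length.out’ can be used in place of a step when you know how many points you want. A brief sketch:

```r
s1 <- seq(3, 10, by = 2)          # same call as above, with the step named
s1
## [1] 3 5 7 9

s2 <- seq(0, 1, length.out = 5)   # five equally spaced points from 0 to 1
s2
## [1] 0.00 0.25 0.50 0.75 1.00

s3 <- seq(10, 1, by = -3)         # a negative step counts down
s3
## [1] 10  7  4  1
```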
(iv) The function ‘rep’
rep(5,10)
## [1] 5 5 5 5 5 5 5 5 5 5
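The function ‘rep’ can also repeat a whole vector, and the ‘times’ and ‘each’ arguments control whether the vector or its individual elements are repeated. A short sketch:

```r
r1 <- rep(c(1, 2), times = 3)  # repeat the whole vector three times
r1
## [1] 1 2 1 2 1 2

r2 <- rep(c(1, 2), each = 3)   # repeat each element three times
r2
## [1] 1 1 1 2 2 2

r3 <- rep("a", 3)              # works for character values too
r3
## [1] "a" "a" "a"
```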
(v) The function ‘sample’
sample(c('H','T'),10,replace=TRUE)
## [1] "T" "T" "H" "T" "H" "T" "H" "H" "H" "T"
There are additional functions to create random vectors with binomial, Gaussian (bell curve) and other distributions.
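As an illustration of those additional functions, base R provides ‘rbinom’, ‘rnorm’ and ‘runif’, among others. The sketch below fixes a seed with ‘set.seed’ so the draws are reproducible; the seed value and the parameters are arbitrary choices, and no output is shown because it depends on them.

```r
set.seed(42)  # arbitrary seed, so repeated runs give the same draws

coin <- rbinom(10, size = 1, prob = 0.5)  # 10 binomial draws, each 0 or 1
bell <- rnorm(5, mean = 0, sd = 1)        # 5 draws from a Gaussian
unif <- runif(3, min = 0, max = 1)        # 3 draws, uniform between 0 and 1

length(coin)
## [1] 10
```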
Concatenating Vectors
You can also combine vectors generated by the above methods. The function ‘c’ concatenates vectors, and if their types differ, it converts all elements to a common type.
v1=1:10
v2=c(1,2,7,11)
c(v1,v2)
## [1] 1 2 3 4 5 6 7 8 9 10 1 2 7 11
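When the vectors passed to ‘c’ hold different types, the result is still a single-type vector: R converts every element to the most general type involved. A short sketch of that behavior:

```r
# Logical values become numbers: TRUE is 1, FALSE is 0.
m1 <- c(1, TRUE, FALSE)
m1
## [1] 1 1 0

# If any element is a character string, everything becomes character.
m2 <- c(1, "a", TRUE)
m2
## [1] "1"    "a"    "TRUE"
class(m2)
## [1] "character"
```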
5. Functions Operating on Vectors
A number of functions do not exist on scientific calculators because they apply only to vectors. The sum of a vector is a good example.
Name | Action |
---|---|
head | First few elements of a vector |
tail | Last few elements of a vector |
sum | Sum of elements of a numeric vector |
mean | Mean of elements of a numeric vector |
median | Median |
sd | Standard Deviation |
var | Variance |
summary | Summary statistics |
table | Counts the elements of a vector |
vec = c(1, 22,33,44,54,1,2,97, 22)
vec %>% head
## [1] 1 22 33 44 54 1
vec %>% head(1)
## [1] 1
vec %>% tail(3)
## [1] 2 97 22
vec %>% sum
## [1] 276
vec %>% mean
## [1] 30.66667
vec %>% median
## [1] 22
vec %>% sd
## [1] 31.34486
vec %>% var
## [1] 982.5
vec %>% summary
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 2.00 22.00 30.67 44.00 97.00
vec %>% table
## .
## 1 2 22 33 44 54 97
## 2 1 2 1 1 1 1
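Some of these functions are related to each other, which gives a quick way to sanity-check results: the mean is the sum divided by the number of elements, and the standard deviation is the square root of the variance. A sketch using the same vector, written without the pipe so it runs in plain R:

```r
vec <- c(1, 22, 33, 44, 54, 1, 2, 97, 22)

manual_mean <- sum(vec) / length(vec)
manual_mean
## [1] 30.66667
all.equal(manual_mean, mean(vec))
## [1] TRUE

manual_sd <- sqrt(var(vec))
manual_sd
## [1] 31.34486
all.equal(manual_sd, sd(vec))
## [1] TRUE
```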
6. Data Visualization
Core R comes with a number of plotting functions, but here we cover only two: hist (to draw histograms) and plot (to draw scatterplots). For more extensive plotting tasks, we recommend that readers learn and use the powerful ggplot2 package.
Name | Action |
---|---|
hist | Draw histogram |
plot | Draw scatterplot |
Using hist()
x=c(rep(1,10),rep(2,10),rep(3,10))
hist(x)
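By default ‘hist’ picks the bin boundaries itself. One way to get exactly one bar per value is to pass the boundaries through the ‘breaks’ argument; the title and axis label below are illustrative choices. The function also returns the bin counts, which can be inspected directly.

```r
x <- c(rep(1, 10), rep(2, 10), rep(3, 10))

# Bins (0.5, 1.5], (1.5, 2.5], (2.5, 3.5] put each value in its own bar.
h <- hist(x,
          breaks = seq(0.5, 3.5, by = 1),
          main = "Histogram of x",
          xlab = "Value")

h$counts  # number of elements that fell into each bin
## [1] 10 10 10
```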
Using plot()
The plot() function can be used to draw scatterplots. It takes two equal-sized vectors as input and draws each pair of corresponding elements as a point (x,y).
v1=c(1,3,8,9,12)
v2=c(3,4,5,1,2)
plot(v1,v2)
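The same call accepts optional arguments for a title, axis labels, and point style. A minimal sketch with illustrative labels; ‘pch = 19’ selects filled circles:

```r
v1 <- c(1, 3, 8, 9, 12)
v2 <- c(3, 4, 5, 1, 2)

# Both vectors must have the same length: one x and one y per point.
stopifnot(length(v1) == length(v2))

plot(v1, v2,
     main = "v2 against v1",  # plot title
     xlab = "v1",             # x-axis label
     ylab = "v2",             # y-axis label
     pch = 19)                # filled circles instead of open ones
```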