In R/RStudio you have several ways to write your own functions:
myfun <- function(x) {
out <- x^2
out
}
name <- function(variables) {
}
Change the name to myfun and save the file as myfun.R with File > Save. Now type in the code. When done, click the Source button.
fix(myfun)
Now a window with an editor pops up and you can type in the code. When you are done, click Save. If there is a syntax error, DON’T run fix again; instead run
myfun <- edit()
myfun will exist only until you close R/RStudio unless you save the project file.
source('../some.folder/myfun.R')
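To try out source without touching any project files, one can first write a small script to a temporary location. The sketch below does this with tempdir(); the file content is the myfun example from above, and the variable name path is chosen only for this illustration.

```r
# write a small script to a temporary folder
path <- file.path(tempdir(), "myfun.R")
writeLines(c("myfun <- function(x) {",
             "  out <- x^2",
             "  out",
             "}"), path)
# read the file and run its code, which defines myfun
source(path)
myfun(3)
## [1] 9
```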
Which of these is best? In large part that depends on your preferences. In my case, if I expect to need that function just for a bit I use the fix option. If I expect to need that function again later I start with the first method, but likely soon open the .R file outside RStudio because most code editors have many useful features not available in RStudio.
If myfun is open in RStudio there are some useful keyboard shortcuts. If the cursor is on some line in the RStudio editor you can hit
As always you can test whether an object is a function:
x <- 1
f <- function(x) x
is.function(x)
## [1] FALSE
is.function(f)
## [1] TRUE
There are several ways to specify arguments in a function:
calc.power <- function(x, y, n=2) x^n + y^n
Here n has a default value; x and y do not.
If the arguments are not named, they are matched in order:
calc.power(2, 3)
## [1] 13
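Arguments can also be matched by name, in which case their order no longer matters, and a default can be overridden in the call. A short sketch using the calc.power function from above (repeated here so the block runs on its own):

```r
calc.power <- function(x, y, n=2) x^n + y^n

# named arguments may appear in any order
calc.power(y=3, x=2)
## [1] 13
# override the default of n, by name or by position
calc.power(2, 3, n=3)
## [1] 35
calc.power(2, 3, 3)
## [1] 35
```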
If an argument does not have a default, the function can test whether it was supplied by using missing:
f <- function(first, second) {
if(!missing(second))
out <- first + second
else out <- first
out
}
f(1)
## [1] 1
f(1, s=3)
## [1] 4
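The call f(1, s=3) works because R allows partial matching of argument names: s is enough to identify second. An alternative to missing, shown here only as a sketch, is to give the argument a default of NULL and test for it with is.null. The name f.null is made up for this illustration.

```r
f.null <- function(first, second = NULL) {
  # no second argument supplied: return the first one unchanged
  if(is.null(second)) return(first)
  first + second
}
f.null(1)
## [1] 1
f.null(1, second=3)
## [1] 4
```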
There is a special argument, written as three dots (...), which is used to pass arguments on to other functions:
f <- function(x, which, ...) {
f1 <- function(x, mult) mult*x
f2 <- function(x, pow) x^pow
if(which==1)
out <- f1(x, ...)
else
out <- f2(x, ...)
out
}
f(1:3, 1, mult=2)
## [1] 2 4 6
f(1:3, 2, pow=3)
## [1] 1 8 27
This is one of the most useful programming structures in R!
Note that this example also shows that in R functions can call other functions. Many programming languages have so-called subroutines. In R this concept does not exist; functions are just functions.
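The ... argument works the same way when passing arguments on to built-in functions. In the sketch below, the hypothetical wrapper my.mean hands anything it receives through ... to mean, so options such as na.rm need not appear in its own argument list.

```r
my.mean <- function(x, ...) {
  # everything in ... is handed to mean() unchanged
  mean(x, ...)
}
x <- c(1, 2, NA)
my.mean(x)
## [1] NA
my.mean(x, na.rm=TRUE)
## [1] 1.5
```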
A function can either return nothing or exactly one thing. It will automatically return the last object evaluated:
f <- function(x) {
x^2
}
f(1:3)
## [1] 1 4 9
However, it is better programming style to have an explicit return object:
f <- function(x) {
out <- x^2
out
}
f(1:3)
## [1] 1 4 9
There is another way to specify what is returned:
f <- function(x) {
return(x^2)
}
f(1:3)
## [1] 1 4 9
This form, however, is mostly used to return something early in the function:
f <- function(x) {
if(!is.numeric(x))
return("Works only for numeric!")
out <- sum(x^2)
out
}
f(1:3)
## [1] 14
f(letters[1:3])
## [1] "Works only for numeric!"
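Returning a character string means the caller has to inspect the result to find out that something went wrong. A common alternative, sketched here under the assumption that a hard failure is acceptable, is to signal an error with stop. The name f.strict is made up for this illustration, and tryCatch is used only so that the example can show the message without halting.

```r
f.strict <- function(x) {
  if(!is.numeric(x))
    stop("Works only for numeric!")
  sum(x^2)
}
f.strict(1:3)
## [1] 14
# catch the error and extract its message
msg <- tryCatch(f.strict(letters[1:3]),
                error = function(e) conditionMessage(e))
msg
## [1] "Works only for numeric!"
```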
If you want to return more than one item use a list:
f <- function(x) {
sq <- x^2
sm <- sum(x)
list(sq=sq, sum=sm)
}
f(1:3)
## $sq
## [1] 1 4 9
##
## $sum
## [1] 6
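The individual items can then be pulled out of the returned list by name. A short sketch, repeating the function above so the block is self-contained (the variable name res is chosen only for this illustration):

```r
f <- function(x) {
  sq <- x^2
  sm <- sum(x)
  list(sq=sq, sum=sm)
}
res <- f(1:3)
# extract elements with $ or with [[ ]]
res$sq
## [1] 1 4 9
res[["sum"]]
## [1] 6
```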
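R has all the standard programming structures. Note that if tests a single condition, not a vector. In the example below the argument has length 2; older versions of R used only the first element of the condition and issued a warning, while R 4.2.0 and later stop with an error: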
f <- function(x) {
if(x>0) y <- log(x)
else y <- NA
y
}
f(c(2, -2))
## [1] 0.6931472 NaN
A useful variation on the if statement is switch:
centre <- function(x, type) {
switch(type,
mean = mean(x),
median = median(x),
trimmed = mean(x, trim = .1))
}
x <- rcauchy(10)
centre(x, "mean")
## [1] 0.4507672
centre(x, "median")
## [1] -0.1362633
centre(x, "trimmed")
## [1] 0.007100938
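If type matches none of the names, switch returns NULL invisibly, so a misspelled type goes unnoticed. One way to guard against this, given as a sketch, is an unnamed final argument, which switch uses as the default. The function name centre2 and the error message are made up for this illustration, and a fixed vector is used so the results are reproducible.

```r
centre2 <- function(x, type) {
  switch(type,
         mean = mean(x),
         median = median(x),
         trimmed = mean(x, trim = .1),
         stop("unknown type: ", type))  # default branch
}
x <- c(1, 2, 3, 100)
centre2(x, "mean")
## [1] 26.5
centre2(x, "median")
## [1] 2.5
# centre2(x, "mode") would stop with an error
```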
A special R construct is ifelse, a vectorized version of if:
x <- sample(1:10, size=7, replace = TRUE)
x
## [1] 6 1 10 9 7 4 3
ifelse(x<5, "Yes", "No")
## [1] "No" "Yes" "No" "No" "No" "Yes" "Yes"
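The earlier log example can be made to work for vectors as well. One caveat: ifelse(x>0, log(x), NA) evaluates log(x) for every element, including the negative ones, which produces a warning. The sketch below therefore fills in the result by logical indexing instead; the name safe.log is made up for this illustration.

```r
safe.log <- function(x) {
  y <- rep(NA_real_, length(x))   # start with all NA
  pos <- !is.na(x) & x > 0        # elements where log is defined
  y[pos] <- log(x[pos])           # log is computed only for those
  y
}
# first element is log(2), second is NA
safe.log(c(2, -2))
```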
There are two standard loops in R. The first is the for loop:
y <- rep(0, 10)
for(i in 1:10) y[i] <- i*(i+1)/2
y
## [1] 1 3 6 10 15 21 28 36 45 55
Sometimes we don’t know the length of y ahead of time. Then we can use seq_along:
for(i in seq_along(y)) y[i] <- i*(i+1)/2
y
## [1] 1 3 6 10 15 21 28 36 45 55
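seq_along is also the safer choice when the vector might be empty. The sketch below compares it with the common 1:length(z) idiom for a vector of length zero; the names z and n.runs are chosen only for this illustration.

```r
z <- numeric(0)
1:length(z)      # counts down from 1 to 0, so a loop would run twice
## [1] 1 0
seq_along(z)     # empty, so a loop would not run at all
## integer(0)
n.runs <- 0
for(i in seq_along(z)) n.runs <- n.runs + 1
n.runs
## [1] 0
```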
If there is more than one statement inside a loop use curly braces:
for(i in seq_along(y)) {
y[i] <- i*(i+1)/2
if(y[i]>40) y[i] <- (-1)
}
y
## [1] 1 3 6 10 15 21 28 36 -1 -1
You can nest loops:
A <- matrix(0, 4, 4)
for(i in 1:4) {
for(j in 1:4)
A[i, j] <- i*j
}
A
## [,1] [,2] [,3] [,4]
## [1,] 1 2 3 4
## [2,] 2 4 6 8
## [3,] 3 6 9 12
## [4,] 4 8 12 16
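For this particular matrix the double loop can be replaced by a single call to outer, which by default multiplies every pair of elements from its two arguments. A short sketch that checks that the two approaches agree (the name B is chosen only for this illustration):

```r
A <- matrix(0, 4, 4)
for(i in 1:4) {
  for(j in 1:4)
    A[i, j] <- i*j
}
B <- outer(1:4, 1:4)   # B[i, j] is i*j
all(A == B)
## [1] TRUE
```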
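The second standard loop is repeat, which runs until break is called. Here three dice are rolled until all of them show the same number, and k counts the rolls:
k <- 0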
repeat {
k <- k+1
x <- sample(1:6, size=3, replace=TRUE)
if(length(table(x))==1) break
}
k
## [1] 65
Notice that a repeat loop could in principle run forever. I usually include a counter that ensures the loop will eventually stop:
k <- 0
counter <- 0
repeat {
k <- k+1
counter <- counter+1
x <- sample(1:6, size=3, replace=TRUE)
if(length(table(x))==1 || counter>1000) break
}
k
## [1] 87
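The same safeguard can be written as a for loop with an upper limit on the number of tries, which makes the stopping rule explicit. This is only a sketch; the function name roll.until.triple and its argument max.tries are made up for this illustration. No output is shown because the result is random.

```r
roll.until.triple <- function(max.tries = 1000) {
  for(k in seq_len(max.tries)) {
    x <- sample(1:6, size=3, replace=TRUE)
    # all three dice show the same number: report the number of rolls
    if(length(unique(x)) == 1) return(k)
  }
  NA   # no triple within max.tries rolls
}
k <- roll.until.triple()
k
```

Returning NA when the limit is reached lets the caller tell a genuine result apart from a loop that was cut off.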