R has a number of routines useful for numerical analysis. The basic function for numerical integration is integrate:
f <- function(x, a=2) abs(sin(a*pi*x))
curve(f, 0, 1)
integrate(f, 0, 1)
## 0.6366 with absolute error < 7e-15
The routine returns a list. Usually we only want the value of the integral, so
integrate(f, 0, 1)$value
## [1] 0.6366
integrate allows us to pass additional arguments to f via the ... convention:
integrate(f, 0, 1, a=1.4)$value
## [1] 0.6118
Internally integrate uses adaptive quadrature; the subdivisions argument (default 100) is the maximum number of sub-intervals the routine may use. Usually this is enough, but if the function has some sharp peaks it does not work very well. One thing to try is to allow more subdivisions:
integrate(f, 0, 1, subdivisions=1e4, a=1.4)$value
## [1] 0.6118
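Note, though, that the help page for integrate warns that if the function is essentially zero over nearly all of its range, the routine may never sample near a narrow spike and can return a badly wrong answer together with a small error estimate. Here is a sketch of the problem (the function g is mine), with the usual remedy of splitting the range at the peak:
g <- function(x) dnorm(x, 1/3, 1e-4)   # a spike at 1/3; true integral is essentially 1
integrate(g, 0, 1)$value               # may be badly wrong if the spike is missed
integrate(g, 0, 1/3)$value + integrate(g, 1/3, 1)$value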
Increasing the number of subdivisions can itself be trouble if evaluating the function takes time. In that case you might want to write your own numerical integration function:
\[ \int_a^b f(x) dx \approx \sum_{i=1}^n f(x_i^*)(x_i-x_{i-1}) \] where \(x_{i-1} \le x_i^* \le x_i\).
x <- seq(0, 1, length=500)
y <- f(x)
mid <- (y[-1]+y[-500])/2
sum(mid)*(x[2]-x[1])
## [1] 0.6366
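The same idea can be wrapped into a small reusable function; this is just a sketch, with the name riemann and the number of grid points n chosen by me:
riemann <- function(f, a, b, n=500, ...) {
  # trapezoid-style Riemann sum on n equally spaced points
  x <- seq(a, b, length=n)
  y <- f(x, ...)
  sum((y[-1]+y[-n])/2)*(x[2]-x[1])
}
riemann(f, 0, 1)
A slightly more accurate alternative is Simpson's rule: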
\[ \int_a^b f(x) dx \approx \frac{x_2-x_1}{6} \sum_{i=2}^{n-1} \left( f(x_{i-1})+4f(x_i)+f(x_{i+1}) \right) \]
h <- x[2]-x[1]
# f(x[i-1]) + 4 f(x[i]) + f(x[i+1]) for the interior points i = 2, ..., 499
sum(y[-c(499, 500)] + 4*y[-c(1, 500)] + y[-c(1, 2)])*h/6
## [1] 0.6366
or any other standard numerical integration formula; see for example Numerical Integration Formulas.
R doesn’t have a dedicated routine for double integrals, but it is easy to use the integrate function for this as well, using the fact that
\[ \int \int f(x, y)d(x,y) = \int \left\{ \int f(x, y)dx \right\} dy \]
double.integral <- function(f, low=c(0, 0), high=c(Inf, Inf)) {
  # inner integral over x for each fixed y, then integrate the result over y
  integrate(function(y) {
    sapply(y, function(y) {
      integrate(function(x) f(x, y), low[1], high[1])$value
    })
  }, low[2], high[2])$value
}
f <- function(x, y) exp(2*(-x^2+x*y-y^2)/3)
double.integral(f, low=c(-Inf, -Inf))
## [1] 5.441
Is this right? Let’s check:
\[ \exp\left(2(-x^2+xy-y^2)/3\right) = 2\pi\sqrt{1-(1/2)^2} \left[ \frac{1}{2\pi\sqrt{1-(1/2)^2}} \exp\left\{ -\frac{1}{2(1-(1/2)^2)}\left(x^2-2(1/2)xy +y^2 \right) \right\} \right] \] but the function inside the brackets is the density of a bivariate normal random variable with means (0, 0), standard deviations (1, 1) and correlation coefficient 1/2, so it integrates out to 1. Therefore the integral is
2*pi*sqrt(1-(1/2)^2)
## [1] 5.441
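As one more sanity check (a crude sketch, not meant to be efficient), a brute-force Riemann sum over a grid should land close to the same value; f is essentially zero outside \((-10, 10)^2\), so the truncation is harmless:
g <- seq(-10, 10, length=1000)
h <- g[2]-g[1]
sum(outer(g, g, f))*h^2   # two-dimensional Riemann sum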
We saw earlier the D function for finding derivatives:
f.prime <- D(expression(x^3), "x")
f.prime
## 3 * x^2
x <- 3.4; eval(f.prime)
## [1] 34.68
f.prime <- D(expression(x^2*sin(2*pi*x)), "x")
f.prime
## 2 * x * sin(2 * pi * x) + x^2 * (cos(2 * pi * x) * (2 * pi))
x <- 0.4; eval(f.prime)
## [1] -0.3431
This of course works for higher order derivatives as well:
f.double.prime <- D(D(expression(x^3), "x"), "x")
f.double.prime
## 3 * (2 * x)
x <- 3.4; eval(f.double.prime)
## [1] 20.4
f.double.prime <- D(D(expression(x^2*sin(2*pi*x)), "x"), "x")
f.double.prime
## 2 * sin(2 * pi * x) + 2 * x * (cos(2 * pi * x) * (2 * pi)) +
## (2 * x * (cos(2 * pi * x) * (2 * pi)) - x^2 * (sin(2 * pi *
## x) * (2 * pi) * (2 * pi)))
x <- 0.4; eval(f.double.prime)
## [1] -10.67
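Closely related is deriv, which turns a formula into a function that returns the value of the expression with the gradient attached as an attribute; a small sketch (the name g is mine):
g <- deriv(~ x^2*sin(2*pi*x), "x", function.arg=TRUE)
g(0.4)   # value, with attr(., "gradient") holding the derivative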
Example
The Taylor polynomial of order k of a function f at a point x=a is defined by
\[ T(x;a,k)=\sum_{i=0}^k \frac{1}{i!}\frac{d^i f}{dx^i}(a)(x-a)^i \]
Let’s write a routine that draws the function and its Taylor polynomial of order k:
taylor <- function(f, a, k=1, from, to) {
  expr <- parse(text=f)
  # x and y coordinates of the function
  x <- seq(from, to, length=250)
  y <- eval(expr)
  t <- x
  # evaluate at the point of interest a
  x <- a
  z <- eval(expr)
  # build a text version of the Taylor polynomial for the title
  x.text <- ifelse(a==0, "x", paste0("(x-", a, ")"))
  ttl <- paste0(f, " == ")
  if(z==0) nosgn <- TRUE
  else {
    nosgn <- FALSE
    ttl <- paste0(ttl, round(z, 2))
  }
  for(i in 1:k) {
    expr <- D(expr, "x")
    # i-th Taylor coefficient f^(i)(a)/i!
    tmp <- eval(expr)/factorial(i)
    z <- z + tmp*(t-a)^i
    tmp <- round(tmp, 2)
    if(tmp!=0) {
      ttl <- paste0(ttl, " ",
        ifelse(tmp>0, ifelse(nosgn, "", "+"), "-"),
        " ", ifelse(abs(tmp)==1, "", paste0(abs(tmp), "*")),
        x.text,
        ifelse(i==1, "", paste0("^", i)))
      nosgn <- FALSE
    }
  }
  plot(t, y, type="l", lwd=2,
       main=parse(text=ttl))
  lines(t, z, col="blue")
}
taylor("log(x+1)", a=0, k=1, -0.5, 0.5)
taylor("log(x+1)", a=0, k=3, -0.5, 0.5)
Here is another example:
taylor("x*sin(x^2)", a=0.86, k=2, from=0, to=2)
taylor("x*sin(x^2)", a=1.4, k=2, from=0, to=2)
The D function only works on functions that R knows how to differentiate. For other cases you might have to write your own numerical version:
f <- function(x) log(gamma(x))
x <- seq(0.1, 2, length=250)
h <- x[2]-x[1]
y <- f(x)
# first derivative: central differences, evaluated at the midpoints
y.prime <- (y[-1]-y[-250])/h
mid <- (x[-1]+x[-250])/2
plot(mid, y.prime, type="l",
     xlab="x", ylab="")
# second derivative: the standard three-point formula
y.2.prime <- rep(0, 248)
for(i in 2:249)
  y.2.prime[i-1] <- (y[i-1]-2*y[i]+y[i+1])/h^2
plot(x[-c(1, 250)], y.2.prime, type="l",
     xlab="x", ylab="")
A very common problem is having to solve an equation of the form \(f(x)=a\). As a specific example we will consider \(x\sin(x^2)=1\). Here are some ideas:
f <- function(x) x*sin(x^2)
x <- seq(0, 1.2, length=500)
y <- f(x)
curve(f, 0, 1.2, lwd=2)
abline(h=1)
# grid search: the x value where f(x) is closest to 1
x0 <- x[which.min(abs(y-1))]
abline(v=x0)
x0
## [1] 1.085
Of course we can combine that with D to find extrema of a function:
f <- function(x) x*sin(x^2)
x <- seq(0.5, 2, length=500)
y <- f(x)
curve(f, 0.5, 2, lwd=2)
y.prime <- eval(D(expression(x*sin(x^2)), "x"))
x0 <- x[which.min(abs(y.prime))]
abline(v=x0)
c(x0, f(x0))
## [1] 1.354 1.308
An alternative is to use optimize. By default it finds minima, so for a maximum we set maximum = TRUE:
optimize(f, c(0.5, 2), maximum = TRUE)
## $maximum
## [1] 1.355
##
## $objective
## [1] 1.308
R has the function uniroot to find the roots of a univariate function:
f <- function(x) x*sin(x^2)-1
uniroot(f, c(0, 1.2))$root
## [1] 1.084
In the case of a polynomial one can also use polyroot. Say we want to find the roots of
\[ p(x)=1+x-x^2+x^4 \]
polyroot(c(1, 1, -1, 0, 1))
## [1] 0.8774+0.7449i -0.7549-0.0000i -1.0000+0.0000i 0.8774-0.7449i
This also gives the complex solutions. Often we only want the real ones:
z <- polyroot(c(1, 1, -1, 0, 1))
z <- Re(z[round(Im(z), 10)==0])
z
## [1] -0.7549 -1.0000
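A quick sanity check that these really are roots of p:
p <- function(x) 1 + x - x^2 + x^4
p(z)   # should be (numerically) 0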
If we have a function of more than one variable we can use nlm. Again it finds minima, but there is no maximum=TRUE argument, so if we want a maximum we have to use -f.
Say we want to maximize
\[ f(x,y)= e^{-(x-1)^2-(y-2)^2} \] (of course the answer is obviously (1, 2)). Now
f <- function(x) -exp(-(x[1]-1)^2-(x[2]-2)^2)
nlm(f, c(0, 0))
## $minimum
## [1] -1
##
## $estimate
## [1] 1 2
##
## $gradient
## [1] 1.521e-07 -7.605e-08
##
## $code
## [1] 1
##
## $iterations
## [1] 16
Quite useful is the ability to calculate the Hessian matrix, that is \(H = (h_{ij})\) with
\[ h_{ij}= \frac{\partial^2 f}{\partial x_i \partial x_j} \]
nlm(f, c(0, 0), hessian = TRUE)
## $minimum
## [1] -1
##
## $estimate
## [1] 1 2
##
## $gradient
## [1] 1.521e-07 -7.605e-08
##
## $hessian
## [,1] [,2]
## [1,] 2.000e+00 -1.665e-08
## [2,] -1.665e-08 2.000e+00
##
## $code
## [1] 1
##
## $iterations
## [1] 16
This works well as long as we can play around a bit with the starting point, but that can be quite hard to do inside a simulation.
An alternative is optim, which lets us define boundaries within which the optimum is to be found. It also has a choice of methods, and sometimes one will work where the others do not.
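For instance, here is the bivariate example from above done with optim, once with the default Nelder-Mead method and once with BFGS; the two should agree up to numerical error:
optim(c(0, 0), f)$par                  # Nelder-Mead (the default)
optim(c(0, 0), f, method="BFGS")$par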
Example
A very important object in Statistics is the log-likelihood function. Say we have observations \(x_1,..,x_n\) from some density \(f(x;\theta)\), where \(\theta\) is some parameter (vector). Then the log-likelihood function is defined by
\[l(\theta) = \sum_{i=1}^n \log f(x_i;\theta)\]
A standard way to estimate \(\theta\) is using the method of maximum likelihood, which finds the maximum of the likelihood function.
Say \(X \sim \text{ Bernoulli}(\theta)\), then
\[ \begin{aligned} &f(x;\theta) = \theta^x(1-\theta)^{1-x}\\ &\log f(x;\theta) = x\log\theta+(1-x)\log(1-\theta) \\ &l(\theta) = \sum_{i=1}^n \left[ x_i\log\theta+(1-x_i)\log(1-\theta) \right] \\ \end{aligned} \]
Here is an example of what this might look like:
x <- sample(c(0, 1), size=50,
            replace=TRUE, prob=c(2, 1))
ll <- function(theta, x) {
  # loop so the routine works for a whole vector of theta values
  y <- rep(0, length(theta))
  for(i in seq_along(theta))
    y[i] <- sum(x*log(theta[i]) +
                (1-x)*log(1-theta[i]))
  y
}
theta <- seq(0, 1, 0.01)
plot(theta, ll(theta, x=x),
type="l",
xlab=expression(theta),
ylab="log-likelihood")
fn <- function(z) -ll(z, x=x)
# optim needs a function to minimize; with bounds, use method L-BFGS-B
tmp <- optim(0.5, fn, method="L-BFGS-B",
             lower=0.1, upper=0.9)
unlist(tmp)
## par
## "0.240000953358406"
## value
## "27.5539964044732"
## counts.function
## "8"
## counts.gradient
## "8"
## convergence
## "0"
## message
## "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
abline(v=tmp$par)
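For the Bernoulli model the maximum likelihood estimator is actually known in closed form, \(\hat{\theta}=\bar{x}\), so we can compare:
mean(x)   # closed-form MLE: the sample proportion of 1's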
Say \(X \sim \text{ Normal}(\mu, \sigma)\). Here we can let dnorm compute the density for us:
x <- rnorm(50, 5, 2)
ll <- function(theta, x) {
  # negative log-likelihood with theta = (mu, sigma)
  -sum(log(dnorm(x, theta[1], theta[2])))
}
tmp <- optim(c(0, 1), ll, method="L-BFGS-B",
             lower=c(-Inf, 0), upper=c(Inf, Inf), x=x)
tmp$par
## [1] 4.681 1.937
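Here too there is a closed form to compare against: \(\hat\mu=\bar{x}\) and \(\hat\sigma=\sqrt{\tfrac1n\sum(x_i-\bar{x})^2}\) (note the \(1/n\) rather than the \(1/(n-1)\) used by sd):
c(mean(x), sqrt(mean((x-mean(x))^2)))   # closed-form MLEs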