Here we show a simple example of the gains in efficiency that can be achieved by taking advantage of matrix operations rather than multiple calls to a function. For more on the topic, see this RCS module.
In this example, we compare three approaches to computing the row
sums of a large matrix. In the first (inefficient) approach, we make use
of a double for
loop. In the second (still inefficient)
approach, we make use of the function apply
. In the third
(most efficient) approach, we make use of the cartesian product operator
%*%
.
for
loop
set.seed(123) # for reproducible results
N <- 10000000 # 10^7
X <- matrix(rnorm(N),nrow=N/10,ncol=10) # a 10^6-by-10 matrix
## use of for loop
forSum <- function(X)
{
rowS <- rep(0,nrow(X))
for ( i in 1:nrow(X) )
for ( j in 1:ncol(X) )
rowS[i] <- rowS[i] + X[i,j]
return( rowS )
}
Tfor <- system.time(tmp0 <- forSum(X))
print(Tfor)
## user system elapsed
## 1.427 0.010 1.437
apply
## use of apply function
Tsum <- system.time(tmp1 <- apply(X,1,sum))
print(Tsum)
## user system elapsed
## 1.641 0.041 1.683
%*%
)
## use of matrix multiplication
In <- rep(1,10) # define a unit vector
Tprd <- system.time(tmp2 <- X %*% In)
print(Tprd)
## user system elapsed
## 0.019 0.001 0.019
## let's measure the speed-up
Tsum["sys.self"]/Tprd["sys.self"] # CPU
## sys.self
## 41
Tsum["user.self"]/Tprd["user.self"] # CPU + R/W
## user.self
## 86.36842
Tfor["user.self"]/Tprd["user.self"] # CPU + R/W
## user.self
## 75.10526
As you can see, the matrix-based sum achieves a 86-fold speed-up
relative to the apply
-based implementation, and a 75-fold
speed-up relative to the for loop
-based
implementation.]
It should be noted that there are actually native R functions for the
efficient sum of a matrix rows and columns (rowSums
and
colSums
). However, the point of this basic demonstration
holds.
The gain in efficiency is due to the fact that in the
for
loop formulation, all the operations are performed in R
(which is an interpreted language, hence inefficient), and in the
apply
formulation there are 1M function calls (1M calls of
the sum
function), which is also inefficient. In the
%*%
formulation, there is a single R function call, and all
the number crunching is performed in whatever (compiled) language the
%*%
is implemented, most likely Fortran.
In fact, notice that the gain in efficiency of the %*%
over apply
is largely lost when computing the column sum,
since there are only 10 columns, hence only ten calls of the
sum
function. The gain in efficiency with respect to the
for
loop is still there.
## use of apply function
T3 <- system.time(tmp3 <- forSum(t(X)))
print(T3)
## user system elapsed
## 0.813 0.020 0.833
## use of apply function
T4 <- system.time(tmp4 <- apply(X,2,sum))
print(T4)
## user system elapsed
## 0.108 0.034 0.144
## use of matrix multiplication
In <- rep(1,1000000) # define a unit vector
T5 <- system.time(tmp5 <- t(X) %*% In)
print(T5)
## user system elapsed
## 0.055 0.019 0.074
As noted, cognizant of this issue, R provides efficient functions to perform row and column sums.
T6 <- system.time(rowSums(X))
print(T6)
## user system elapsed
## 0.027 0.006 0.032
T7 <- system.time(colSums(X))
print(T7)
## user system elapsed
## 0.01 0.00 0.01