Here we show a simple example of the gains in efficiency that can be achieved by taking advantage of matrix operations rather than multiple calls to a function. For more on the topic, see this RCS module.

In this example, we compare three approaches to computing the row sums of a large matrix. In the first (inefficient) approach, we make use of a double for loop. In the second (still inefficient) approach, we make use of the function apply. In the third (most efficient) approach, we make use of the cartesian product operator %*%.

Use of for loop

set.seed(123) # for reproducible results
N <- 10000000 # 10^7
X <- matrix(rnorm(N),nrow=N/10,ncol=10) # a 10^6-by-10 matrix

## use of for loop
forSum <- function(X)
{
    rowS <- rep(0,nrow(X))
    for ( i in 1:nrow(X) )
        for ( j in 1:ncol(X) )
            rowS[i] <- rowS[i] + X[i,j]
    return( rowS )
}
Tfor <- system.time(tmp0 <- forSum(X))
print(Tfor)
##    user  system elapsed 
##   1.427   0.010   1.437

Use of apply

## use of apply function
Tsum <- system.time(tmp1 <- apply(X,1,sum))
print(Tsum)
##    user  system elapsed 
##   1.641   0.041   1.683

Use of inner product (%*%)

## use of matrix multiplication
In <- rep(1,10) # define a unit vector
Tprd <- system.time(tmp2 <- X %*% In)
print(Tprd)
##    user  system elapsed 
##   0.019   0.001   0.019
## let's measure the speed-up
Tsum["sys.self"]/Tprd["sys.self"] # CPU
## sys.self 
##       41
Tsum["user.self"]/Tprd["user.self"] # CPU + R/W
## user.self 
##  86.36842
Tfor["user.self"]/Tprd["user.self"] # CPU + R/W
## user.self 
##  75.10526

As you can see, the matrix-based sum achieves a 86-fold speed-up relative to the apply-based implementation, and a 75-fold speed-up relative to the for loop-based implementation.]

It should be noted that there are actually native R functions for the efficient sum of a matrix rows and columns (rowSums and colSums). However, the point of this basic demonstration holds.

The gain in efficiency is due to the fact that in the for loop formulation, all the operations are performed in R (which is an interpreted language, hence inefficient), and in the apply formulation there are 1M function calls (1M calls of the sum function), which is also inefficient. In the %*% formulation, there is a single R function call, and all the number crunching is performed in whatever (compiled) language the %*% is implemented, most likely Fortran.

In fact, notice that the gain in efficiency of the %*% over apply is largely lost when computing the column sum, since there are only 10 columns, hence only ten calls of the sum function. The gain in efficiency with respect to the for loop is still there.

## use of apply function
T3 <- system.time(tmp3 <- forSum(t(X)))
print(T3)
##    user  system elapsed 
##   0.813   0.020   0.833
## use of apply function
T4 <- system.time(tmp4 <- apply(X,2,sum))
print(T4)
##    user  system elapsed 
##   0.108   0.034   0.144
## use of matrix multiplication
In <- rep(1,1000000) # define a unit vector
T5 <- system.time(tmp5 <- t(X) %*% In)
print(T5)
##    user  system elapsed 
##   0.055   0.019   0.074

Native row- (and col-)wide R functions

As noted, cognizant of this issue, R provides efficient functions to perform row and column sums.

T6 <- system.time(rowSums(X))
print(T6)
##    user  system elapsed 
##   0.027   0.006   0.032
T7 <- system.time(colSums(X))
print(T7)
##    user  system elapsed 
##    0.01    0.00    0.01