Sweep vs Matrix multiplication
Sweeping along an axis can be represented by matrix multiplication. Given the matrix \(A\) and diagonal matrix \(D\),
- \(DA\) is equivalent to multiplying each row \(i\) of \(A\) by \(d_{ii}\), and
- \(AD\) is equivalent to multiplying each column \(j\) of \(A\) by \(d_{jj}\)
A = matrix(runif(50000),ncol=100)
w = apply(A, 1, norm, '2')
all(abs(sweep(A,1, w, '/') - (diag(1/w) %*% A) ) < .Machine$double.eps)
## [1] TRUE
It is reasonably expected that the sweeping operation on invidual row/column vector will be more efficient than the equivalent matrix operation, because no additional memory will be required to store the non-diagonal entries of \(D\). The mini experiments below show how these two approaches scale with the number of rows/columns.
Sweeping along the rows
n = 100
row_sweeping = lapply(1:20*50, function(m) {
A = matrix(runif(m*n), nrow = m)
w = apply(A, 1, norm, '2')
bm = microbenchmark(sweep(A, 1, w, '/'),
diag(1/w) %*% A, times = 10)
bm$m = m
return(bm)
}) %>%
do.call(rbind, .)
ggplot(row_sweeping) +
facet_grid(. ~ expr) +
geom_boxplot(aes(x=m,y=time,group=m,color=expr)) +
ylab('time (nanoseconds)')
Sweeping along the columns
m = 50
col_sweeping = lapply(1:20*50, function(n) {
A = matrix(runif(m*n), nrow = m)
w = apply(A, 2, norm, '2')
bm = microbenchmark(sweep(A, 2, w, '/'),
A %*% diag(1/w), times = 10)
bm$n = n
return(bm)
}) %>%
do.call(rbind, .)
ggplot(col_sweeping) +
facet_grid(. ~ expr) +
geom_boxplot(aes(x=n,y=time,group=n,color=expr)) +
ylab('time (nanoseconds)')