Singular value decomposition is an expensive operation. For rectangular matrices with significant different dimensions, i.e. very “fat” or “thin” matrices, there is a trick to make the computation cheaper. This trick is implemented in fast.svd() of the R package corpcor.

Calculate SVD

The singular value decomposition of a matrix \(M\) of size \(m \times n\).

\[ M = UDV^T \]

\[ \begin{align} MM^T &= (UDV^T)(UDV^T)^T \\ &= (UDV^T)V(UD)^T \\ &= UD (V^TV) (UD)^T \\ &= UD(UD)^T \quad (V\text{ is orthogonal}) \\ &= UDD^TU^T \\ \end{align} \] Thus the decomposition of \(MM^T\) gives \(U\) and \(D^2\). The positive singular values of \(M\) can be inferred from \(D^2\). For this reason, fast SVD can only return positive singular values.

In the case of fat matrix \(M\) (\(n > m\)), \(MM^T\) is of dimension \(m\times m\), which is smaller than the original \(M\).

V is then calculated by

\[ V = M^T U D^{-1} \]

How fast can it be?

dat = data.frame('m' = rep(50, 6),
                 'factor' = c(1,2,5,10,20,50),
                 'svd' = rep(0, 6),
                 'fast.svd' = rep(0, 6))
dat[,'n'] = dat[,'factor'] * dat[,'m']
dat <- lapply(1:nrow(dat), function(i) {
    x = rnorm(dat[i, 'm'] * dat[i, 'n']) %>%
        matrix(nrow = dat[i, 'm'], ncol = dat[i, 'n'])
    slow <- function() { return(svd(x)) }
    fast <- function() { return(fast.svd(x))}
    t0 = proc.time()
    tmp <- svd(x)
    t1 = proc.time()
    tmp <- fast.svd(x)
    t2 = proc.time()
    dat[i,'svd'] <- (t1 - t0)['elapsed']
    dat[i,'fast.svd'] <- (t2 - t1)['elapsed']
    return(dat[i,])
}) %>%
    do.call(rbind,.)
dat
##    m factor   svd fast.svd    n
## 1 50      1 0.011    0.001   50
## 2 50      2 0.006    0.001  100
## 3 50      5 0.003    0.009  250
## 4 50     10 0.007    0.004  500
## 5 50     20 0.008    0.008 1000
## 6 50     50 0.024    0.018 2500
dat %>%
    reshape2::melt(id.vars = c('m', 'n', 'factor')) %>%
    ggplot() +
    geom_point(aes_string(x = 'factor', y = 'value', color = 'variable'))