Problem Given a random variable \(X \sim \mathcal{N}(\mu,\sigma^2)\), find a transformation \(f\) such that \(Y = f(X) \sim \mathrm{Uniform}(a,b)\). Solution Let \(\Phi_Z(\cdot)\) be the cumulative distribution function of the standard normal variable \(Z\). \[ \begin{eqnarray} Z \equiv \frac{X - \mu}{\sigma};\quad Z &\sim& \mathcal{N}(0,1) \\ \Phi_{Z}\left(\frac{X-\mu}{\sigma}\right) &\sim& \mathrm{Uniform}(0,1) \\ (b-a) \Phi_{Z}\left(\frac{X-\mu}{\sigma}\right) &\sim& \mathrm{Uniform}(0,b-a) \\ a + (b-a) \Phi_{Z}\left(\frac{X-\mu}{\sigma}\right) &\sim& \mathrm{Uniform}(a,b) \end{eqnarray} \] The second line follows from the probability integral transform. In conclusion, \(Y \equiv a + (b-a) \Phi_Z\left(\frac{X-\mu}{\sigma}\right)\). Computational demonstration norm2unif = function(x, mu = 0, sigma = 1, min = 0, max = 1, use.
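The transformation derived in this excerpt can be sketched in a few lines of R. This is only an illustration under stated assumptions: the helper name `normal_to_uniform` and its arguments are hypothetical and are not the post's own `norm2unif`, whose full signature is cut off above.

```r
# Hypothetical helper (not the post's norm2unif): maps N(mu, sigma^2) draws
# to Uniform(a, b) by standardising, applying the standard normal CDF,
# and then rescaling from (0, 1) to (a, b).
normal_to_uniform = function(x, mu = 0, sigma = 1, a = 0, b = 1) {
  a + (b - a) * pnorm((x - mu) / sigma)
}

set.seed(123)
x = rnorm(10000, mean = 3, sd = 2)
y = normal_to_uniform(x, mu = 3, sigma = 2, a = -1, b = 4)

y_range = range(y)   # should lie inside [-1, 4]
y_mean  = mean(y)    # should be close to (a + b) / 2 = 1.5
```

A histogram of `y`, for example `hist(y)`, should look approximately flat between `a` and `b`.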


Problem statement Given a set \(S = \{s_1, s_2, \dots, s_n\}\), one would like to sample a subset \(X \subset S\) of size \(m\). If this operation needs to be repeated a very large number of times \(k\), what is the most efficient way? set_S = c(1:100) microbenchmark::microbenchmark(sample(set_S, size = 50), times = 10) ## Unit: microseconds ## expr min lq mean median uq max neval ## sample(set_S, size = 50) 5.


Sweeping along an axis can be represented by matrix multiplication. Given a matrix \(A\) and a diagonal matrix \(D\), \(DA\) is equivalent to multiplying each row \(i\) of \(A\) by \(d_{ii}\), and \(AD\) is equivalent to multiplying each column \(j\) of \(A\) by \(d_{jj}\). A = matrix(runif(50000),ncol=100) w = apply(A, 1, norm, '2') all(abs(sweep(A,1, w, '/') - (diag(1/w) %*% A) ) < .Machine$double.eps) ## [1] TRUE It is reasonable to expect that sweeping over individual row/column vectors will be more efficient than the equivalent matrix operation, because no additional memory is required to store the off-diagonal entries of \(D\).
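As a rough illustration of both the row and the column case, the sketch below compares `sweep()` with the equivalent diagonal-matrix products and measures how much memory the dense diagonal matrix occupies. The variable names are illustrative, and `sqrt(rowSums(A^2))` is used here as an assumed stand-in for the row-wise 2-norm computed in the excerpt.

```r
set.seed(1)
A = matrix(runif(50000), ncol = 100)   # 500 x 100 matrix
w_row = sqrt(rowSums(A^2))             # Euclidean norm of each row
w_col = sqrt(colSums(A^2))             # Euclidean norm of each column

# Row scaling: sweep along margin 1 vs. left-multiplication by a diagonal matrix
rows_sweep = sweep(A, 1, w_row, '/')
rows_diag  = diag(1 / w_row) %*% A

# Column scaling: sweep along margin 2 vs. right-multiplication by a diagonal matrix
cols_sweep = sweep(A, 2, w_col, '/')
cols_diag  = A %*% diag(1 / w_col)

rows_equal = isTRUE(all.equal(rows_sweep, rows_diag))
cols_equal = isTRUE(all.equal(cols_sweep, cols_diag))

# Memory footprint: the weight vector holds n values,
# while the dense diagonal matrix holds n^2 values
size_vector = object.size(1 / w_row)
size_diag   = object.size(diag(1 / w_row))
```

Comparing `size_vector` with `size_diag` shows the storage overhead of the diagonal matrix; timing the two approaches, for instance with `microbenchmark`, is left out here because the results depend on the machine.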


Defining utility functions burd = colorRampPalette(colors = c("blue", "white", "red"))(n = 499) blues = colorRampPalette(colors = c('#deebf7', '#08306b'))(n = 256) plot.matrix = function(m, col = burd, asp=1) { m %>% apply(MARGIN = 2, rev) %>% t() %>% image(useRaster = TRUE, axes = FALSE, col = col, asp = asp) } parse_timing_output = function(output_raw) { sapply(output_raw, function(x) { str = stringr::str_split(x,":\\s+")[[1]] return(as.numeric(str[2])) }) } An arbitrary matrix sin2d = function(a, b) { sin((a/ 500 - b / 15) * pi) } start = proc.


This article walks you through a step-by-step implementation of affinity propagation, a message-passing clustering algorithm by Frey and Dueck [@Frey:2007:Clustering]. Step-by-step Input data Given a similarity matrix S = rbind(c(1.0, 0.8, 0.7, 0.2, 0.5), c(0.8, 1.0, 0.75, 0.3, 0.3), c(0.7, 0.75, 1., -0.1, 0.4), c(0.2, 0.3, -0.1, 1.0, 0.8), c(0.5, 0.3, 0.4, 0.8, 1.0)) %>% set_colnames(c('A', 'B', 'C', 'D', 'E')) %>% set_rownames(c('A', 'B', 'C', 'D', 'E')) image(S, col = cm.


A guide already exists on how to set up Nginx as a reverse proxy for RStudio Server. This guide goes further by making sure that RStudio Server is accessible via HTTPS. It was tested on Ubuntu 16.04 LTS and Ubuntu 20.04, so adapt the commands to your system as needed. It assumes that your machine already has Nginx and RStudio Server up and running. After any change in the configuration, you may restart the servers using these commands.


Fast SVD

Singular value decomposition is an expensive operation. For rectangular matrices with significantly different dimensions, i.e. very “fat” or “thin” matrices, there is a trick to make the computation cheaper. This trick is implemented in fast.svd() of the R package corpcor. Calculate SVD The singular value decomposition of a matrix \(M\) of size \(m \times n\) is \[ M = UDV^T \] \[ \begin{align} MM^T &= (UDV^T)(UDV^T)^T \\ &= (UDV^T)V(UD)^T \\ &= UD (V^TV) (UD)^T \\ &= UD(UD)^T \quad (V\text{ is orthogonal}) \\ &= UDD^TU^T \end{align} \] Thus the eigendecomposition of \(MM^T\) gives \(U\) and \(D^2\).
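The identity above can be checked numerically with a minimal sketch in base R. This is not the code of `fast.svd()` itself. The step that recovers \(V\) as \(M^T U D^{-1}\) is an addition made here for illustration. It follows from \(M = UDV^T\) and assumes that \(M\) has full row rank, so that every singular value is nonzero.

```r
set.seed(42)
M = matrix(rnorm(5 * 2000), nrow = 5)   # a "fat" matrix: 5 x 2000

# Eigendecomposition of the small 5 x 5 matrix M M^T
e = eigen(tcrossprod(M), symmetric = TRUE)
U = e$vectors          # left singular vectors of M
d = sqrt(e$values)     # singular values are the square roots of the eigenvalues

# Recover V = M^T U D^{-1} (assumes all singular values are nonzero)
V = crossprod(M, U) %*% diag(1 / d)

# Rebuild M from the three factors and compare with base R's svd()
M_rebuilt = U %*% diag(d) %*% t(V)
d_ref = svd(M)$d
```

Only a \(5 \times 5\) eigenproblem is solved here, which is where the saving comes from when one dimension is much smaller than the other. For a very “thin” matrix the same idea would presumably be applied to \(M^TM\) instead.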



Trang Tran


Student

USA