Problem Given a random variable \(X \sim \mathcal{N}(\mu,\sigma^2)\), find a transformation \(f: X \rightarrow Y\), such that \(Y \sim Uniform(a,b)\). Solution Let \(\Phi_X(\cdot)\) the cumulative distribution function of \(X\). \[ \begin{eqnarray} Z \equiv \frac{X - \mu}{\sigma};\quad Z &\sim& \mathcal{N}(0;1) \\ \Phi_{Z}\left(\frac{X-\mu}{\sigma}\right) &\sim& Uniform(0;1) \\ (b-a) \Phi_{Z}\left(\frac{X-\mu}{\sigma}\right) &\sim& Uniform(0,b-a) \\ a + (b-a) \Phi_{Z}\left(\frac{X-\mu}{\sigma}\right) &\sim& Uniform(a,b) \end{eqnarray} \] In conclusion, \(Y \equiv a + (b-a) \Phi_X\left(\frac{X-\mu}{\sigma}\right)\). Computational demonstration norm2unif = function(x, mu = 0, sigma = 1, min = 0, max = 1, use.

Continue reading

Problem statement Given a set \(S = {s_1, s_2, \dots, s_n}\), one would like to sample a subset of \(X \subset S\) of size \(m\). If this operation needs to be repeated for a very large number of times \(k\), what is the most efficient way? set_S = c(1:100) microbenchmark::microbenchmark(sample(set_S, size = 50), times = 10) ## Unit: microseconds ## expr min lq mean median uq max neval ## sample(set_S, size = 50) 5.

Continue reading

Principal component analysis (PCA) is a popular method for dimensional reduction, and has been invented independently many times in different fields, resulting in various definitions. This note attempts to unify 2 of the most popular definitions of principal components and illustrate how PCA can be done correspondingly. Alternative definitions PCA from the SVD of the centered matrix The singular value decomposition (SVD) of an \(n\times d\) matrix \(X\) has the form

Continue reading

Sweeping along an axis can be represented by matrix multiplication. Given the matrix \(A\) and diagonal matrix \(D\), \(DA\) is equivalent to multiplying each row \(i\) of \(A\) by \(d_{ii}\), and \(AD\) is equivalent to multiplying each column \(j\) of \(A\) by \(d_{jj}\) A = matrix(runif(50000),ncol=100) w = apply(A, 1, norm, '2') all(abs(sweep(A,1, w, '/') - (diag(1/w) %*% A) ) < .Machine$double.eps) ## [1] TRUE It is reasonably expected that the sweeping operation on invidual row/column vector will be more efficient than the equivalent matrix operation, because no additional memory will be required to store the non-diagonal entries of \(D\).

Continue reading

Defining utility functions burd = colorRampPalette(colors = c("blue", "white", "red"))(n = 499) blues = colorRampPalette(colors = c('#deebf7', '#08306b'))(n = 256) plot.matrix = function(m, col = burd, asp=1) { m %>% apply(MARGIN = 2, rev) %>% t() %>% image(useRaster = TRUE, axes = FALSE, col = col, asp = asp) } parse_timing_output = function(output_raw) { sapply(output_raw, function(x) { str = stringr::str_split(x,":\\s+")[[1]] return(as.numeric(str[2])) }) } An arbitrary matrix sin2d = function(a, b) { sin((a/ 500 - b / 15) * pi) } start = proc.

Continue reading

As of this writing, Julia supports three types of concurrency: Coroutines Multi-Threading Multi-Core or Distributed Processing This post will explore multicore parallelization in Julia Using multiple cores in julia If more than one cores are to be used in julia, it must be specified, either when starting julia, using -p <n_cpus, for example julia -p 8 # to use 8 cores or by adding processors in an interactive session

Continue reading

Author's picture

Trang Tran


Student

USA