Problem Given a random variable \(X \sim \mathcal{N}(\mu,\sigma^2)\), find a transformation \(f: X \rightarrow Y\), such that \(Y \sim Uniform(a,b)\).
Solution Let \(\Phi_X(\cdot)\) the cumulative distribution function of \(X\).
\[ \begin{eqnarray} Z \equiv \frac{X - \mu}{\sigma};\quad Z &\sim& \mathcal{N}(0;1) \\ \Phi_{Z}\left(\frac{X-\mu}{\sigma}\right) &\sim& Uniform(0;1) \\ (b-a) \Phi_{Z}\left(\frac{X-\mu}{\sigma}\right) &\sim& Uniform(0,b-a) \\ a + (b-a) \Phi_{Z}\left(\frac{X-\mu}{\sigma}\right) &\sim& Uniform(a,b) \end{eqnarray} \]
In conclusion, \(Y \equiv a + (b-a) \Phi_X\left(\frac{X-\mu}{\sigma}\right)\).
Computational demonstration norm2unif = function(x, mu = 0, sigma = 1, min = 0, max = 1, use.
Sources https://www.linuxjournal.com/content/mark-text-vs-typora-best-markdown-editor-linux
Problem statement Given a set \(S = {s_1, s_2, \dots, s_n}\), one would like to sample a subset of \(X \subset S\) of size \(m\). If this operation needs to be repeated for a very large number of times \(k\), what is the most efficient way?
set_S = c(1:100) microbenchmark::microbenchmark(sample(set_S, size = 50), times = 10) ## Unit: microseconds ## expr min lq mean median uq max neval ## sample(set_S, size = 50) 5.
Principal component analysis (PCA) is a popular method for dimensional reduction, and has been invented independently many times in different fields, resulting in various definitions. This note attempts to unify 2 of the most popular definitions of principal components and illustrate how PCA can be done correspondingly.
Alternative definitions PCA from the SVD of the centered matrix The singular value decomposition (SVD) of an \(n\times d\) matrix \(X\) has the form
Sweeping along an axis can be represented by matrix multiplication. Given the matrix \(A\) and diagonal matrix \(D\),
\(DA\) is equivalent to multiplying each row \(i\) of \(A\) by \(d_{ii}\), and \(AD\) is equivalent to multiplying each column \(j\) of \(A\) by \(d_{jj}\) A = matrix(runif(50000),ncol=100) w = apply(A, 1, norm, '2') all(abs(sweep(A,1, w, '/') - (diag(1/w) %*% A) ) < .Machine$double.eps) ## [1] TRUE It is reasonably expected that the sweeping operation on invidual row/column vector will be more efficient than the equivalent matrix operation, because no additional memory will be required to store the non-diagonal entries of \(D\).
Defining utility functions burd = colorRampPalette(colors = c("blue", "white", "red"))(n = 499) blues = colorRampPalette(colors = c('#deebf7', '#08306b'))(n = 256) plot.matrix = function(m, col = burd, asp=1) { m %>% apply(MARGIN = 2, rev) %>% t() %>% image(useRaster = TRUE, axes = FALSE, col = col, asp = asp) } parse_timing_output = function(output_raw) { sapply(output_raw, function(x) { str = stringr::str_split(x,":\\s+")[[1]] return(as.numeric(str[2])) }) } An arbitrary matrix sin2d = function(a, b) { sin((a/ 500 - b / 15) * pi) } start = proc.
As of this writing, Julia supports three types of concurrency:
Coroutines Multi-Threading Multi-Core or Distributed Processing This post will explore multicore parallelization in Julia
Using multiple cores in julia If more than one cores are to be used in julia, it must be specified, either when starting julia, using -p <n_cpus, for example
julia -p 8 # to use 8 cores or by adding processors in an interactive session