Bayesian Statistics

The frequentist perspective is that the true parameter value θ is fixed but unknown, while the point estimate $\hat{θ}$ is a random variable because it is a function of the dataset (which is seen as random). The Bayesian perspective on statistics is quite different. The Bayesian uses probability to reflect degrees of certainty in states of knowledge. The dataset is directly observed and so is not random. On the other hand, the true parameter θ is unknown or uncertain and thus is represented as a random variable.

Prior Distribution vs Posterior Distribution

$$ p(θ \mid x^{(1)}, \dots, x^{(m)}) = \frac{p(x^{(1)}, \dots, x^{(m)} \mid θ)\, p(θ)}{p(x^{(1)}, \dots, x^{(m)})} $$

$$ p(x^{(m+1)} \mid x^{(1)}, \dots, x^{(m)}) = \int p(x^{(m+1)} \mid θ)\, p(θ \mid x^{(1)}, \dots, x^{(m)})\, dθ $$
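
The two formulas above can be sketched numerically. The example below is only an illustration under assumptions that are not part of these notes: the data are coin flips (a Bernoulli likelihood), the prior p(θ) is uniform on [0, 1], and both the normalizing constant and the predictive integral are approximated by a Riemann sum over a grid of θ values. The function names `posterior_on_grid` and `predictive_prob_one` are made up for this sketch.

```python
def posterior_on_grid(data, prior, grid):
    """Bayes' rule on a grid: posterior density of theta given 0/1 data."""
    ones = sum(data)
    zeros = len(data) - ones
    # Numerator of Bayes' rule: likelihood p(data | theta) times prior p(theta).
    unnorm = [(t ** ones) * ((1 - t) ** zeros) * prior(t) for t in grid]
    # Denominator p(data), approximated by a Riemann sum over the grid.
    width = grid[1] - grid[0]
    evidence = sum(unnorm) * width
    return [u / evidence for u in unnorm]


def predictive_prob_one(posterior, grid):
    """p(x_next = 1 | data): integrate p(x_next = 1 | theta) = theta against the posterior."""
    width = grid[1] - grid[0]
    return sum(t * p for t, p in zip(grid, posterior)) * width


# Assumed setup: midpoint grid on (0, 1), uniform prior, 6 ones and 2 zeros observed.
n = 10_000
grid = [(i + 0.5) / n for i in range(n)]
data = [1, 1, 0, 1, 0, 1, 1, 1]
uniform_prior = lambda t: 1.0

post = posterior_on_grid(data, uniform_prior, grid)
p_next = predictive_prob_one(post, grid)
print(round(p_next, 4))  # close to (6 + 1) / (8 + 2) = 0.7
```

With a uniform prior the exact answer is known in closed form (the posterior is a Beta(7, 3) distribution with mean 0.7), which makes it a convenient check on the grid approximation. Note that the prediction averages over every value of θ weighted by its posterior density, rather than plugging in a single point estimate.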

The Curse of Dimensionality

Many machine learning problems become exceedingly difficult when the number of dimensions in the data is high. This phenomenon is known as the curse of dimensionality.
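
One way to see why high dimension hurts is to count regions. The sketch below assumes, purely for illustration, that each input axis is split into a fixed number of bins, so the input space is divided into a grid of cells. The number of cells grows exponentially with the dimension, so a fixed number of training examples can occupy only a vanishing fraction of them. The function names and the choice of 10 bins and 1000 examples are hypothetical.

```python
def num_cells(bins_per_axis, dims):
    """Number of grid cells when every one of `dims` axes is cut into `bins_per_axis` bins."""
    return bins_per_axis ** dims


def max_fraction_covered(num_examples, bins_per_axis, dims):
    """Upper bound on the fraction of cells that can contain at least one example."""
    return min(1.0, num_examples / num_cells(bins_per_axis, dims))


# Assumed setup: 10 bins per axis and 1000 training examples.
for d in (1, 2, 3, 10):
    print(d, num_cells(10, d), max_fraction_covered(1000, 10, d))
```

In 3 dimensions, 1000 examples could in principle touch every one of the 1000 cells; in 10 dimensions there are 10^10 cells, so at most one cell in ten million contains any data.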