4.1 Definition and attributes

To define a prior that is invariant under reparameterization, Harold Jeffreys proposed taking the prior on the parameter space to be proportional to the square root of the determinant of the Fisher information (Liu and Wasserman, 2014). For a scalar parameter \(\theta\) this reduces to \[\pi_J (\theta) \propto (I(\theta))^\frac{1}{2}\]

where \(I(\theta)\) is the Fisher information, as explained above:

\[I(\theta) = -E_{\theta} \left(\frac{d^2 \log f(X \mid \theta)}{d\theta^2}\right)\]
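
As a worked illustration (the Bernoulli model is chosen here as an example and is not taken from the cited sources), let \(X \sim \text{Bernoulli}(\theta)\), so that \(\log f(x \mid \theta) = x \log\theta + (1-x)\log(1-\theta)\). Differentiating twice with respect to \(\theta\) and taking the expectation with \(E_\theta(X) = \theta\) gives

\[I(\theta) = E_{\theta}\left(\frac{X}{\theta^2} + \frac{1-X}{(1-\theta)^2}\right) = \frac{1}{\theta} + \frac{1}{1-\theta} = \frac{1}{\theta(1-\theta)}\]

and therefore

\[\pi_J(\theta) \propto \theta^{-\frac{1}{2}} (1-\theta)^{-\frac{1}{2}}\]

which is the kernel of a Beta(1/2, 1/2) density. In this particular model the Jeffreys prior happens to be both proper and conjugate, though neither property holds in general.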

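A few attributes are important to note here. First, a Jeffreys prior can be improper, meaning that it need not integrate to a finite value. Second, Jeffreys priors are not always conjugate priors. However, they are the limits of conjugate prior densities (Jordan lecture 7). For example, a Gaussian prior density with scale \(\sigma_0\) approaches a flat prior as \(\sigma_0 \rightarrow \infty\), and the flat prior is the Jeffreys prior for the mean of a Gaussian with known variance.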
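
The Gaussian limit can be checked numerically. The following is a minimal sketch, assuming a Gaussian likelihood with known standard deviation `sigma` and a conjugate \(N(\mu_0, \sigma_0^2)\) prior on the mean. The function names and the example numbers are hypothetical and chosen only for illustration. It uses the standard conjugate update for a Gaussian mean and compares it with the posterior obtained under a flat prior.

```python
def gaussian_posterior(xbar, n, sigma, mu0, sigma0):
    """Posterior mean and variance of a Gaussian mean with known sigma,
    under a conjugate N(mu0, sigma0^2) prior (illustrative helper)."""
    prior_precision = 1.0 / sigma0**2   # shrinks to 0 as sigma0 grows
    data_precision = n / sigma**2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * mu0 + data_precision * xbar)
    return post_mean, post_var


def flat_prior_posterior(xbar, n, sigma):
    """Posterior mean and variance under a flat (improper) prior on the mean."""
    return xbar, sigma**2 / n


# Hypothetical summary statistics: sample mean 2.0 from n = 10 observations.
xbar, n, sigma, mu0 = 2.0, 10, 1.0, 0.0

for sigma0 in (1.0, 10.0, 1000.0):
    mean, var = gaussian_posterior(xbar, n, sigma, mu0, sigma0)
    print(f"sigma0={sigma0:>7}: posterior mean={mean:.6f}, variance={var:.6f}")

print("flat prior:", flat_prior_posterior(xbar, n, sigma))
```

As `sigma0` grows, the prior precision vanishes and the conjugate posterior moves toward the flat-prior posterior, with mean `xbar` and variance `sigma**2 / n`.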