Suppose we have several cosmological probes, whose likelihood functions $L_i(\theta)$ are assumed to be multi-dimensional Gaussians in the parameters $\theta$, i.e.,

$$ L_i(\theta) \propto \exp\left[-\frac{1}{2}\,(\theta - \bar\theta_i)^t\, F_i\, (\theta - \bar\theta_i)\right], $$

where $\bar\theta_i$ is the location of the likelihood peak of probe $i$ and $F_i$ its Fisher (inverse covariance) matrix.
The posterior distribution for the parameters from each probe, $p_i(\theta)$, is obtained by Bayes' theorem as

$$ p_i(\theta) = \frac{L_i(\theta)\, p(\theta)}{\int L_i(\theta)\, p(\theta)\, \mathrm{d}\theta}, \qquad (5.2.2) $$

where $p(\theta)$ is the prior and the denominator is the Bayesian evidence.
For future reference, it is also useful to write down the general expression for the Bayesian evidence,

$$ p(d) = \int L(\theta)\, p(\theta)\, \mathrm{d}\theta . $$

For a normal prior with mean $\theta_\pi$ and covariance $\Sigma_\pi$ and a likelihood of the Gaussian form above, the evidence integral can be performed analytically, giving

$$ p(d) = L_0\, \sqrt{\frac{\det \Sigma_\pi^{-1}}{\det\!\left(F + \Sigma_\pi^{-1}\right)}}\; \exp\!\left[-\frac{1}{2}\,(\bar\theta - \theta_\pi)^t \left(F^{-1} + \Sigma_\pi\right)^{-1} (\bar\theta - \theta_\pi)\right], $$

where $L_0$ is the likelihood normalization and $\bar\theta$, $F$ are the likelihood peak and Fisher matrix.
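To make this concrete, the closed-form Gaussian evidence above can be coded up directly. The following numpy sketch (all numbers invented for illustration) implements that expression and is not tied to any particular probe:

```python
import numpy as np

def log_evidence_gaussian(mu_L, F, mu_pi, Sigma_pi, logL0=0.0):
    """Log-evidence for a Gaussian likelihood L0*exp(-(t-mu_L)' F (t-mu_L)/2)
    combined with a normal prior N(mu_pi, Sigma_pi), via the analytic result."""
    Sigma_pi_inv = np.linalg.inv(Sigma_pi)
    A = F + Sigma_pi_inv                      # posterior precision matrix
    d = mu_L - mu_pi
    # Occam factor: log of sqrt(det(prior precision)/det(posterior precision))
    log_occam = 0.5 * (np.linalg.slogdet(Sigma_pi_inv)[1] - np.linalg.slogdet(A)[1])
    # penalty for mismatch between likelihood peak and prior mean
    chi2 = d @ np.linalg.solve(np.linalg.inv(F) + Sigma_pi, d)
    return logL0 + log_occam - 0.5 * chi2

# toy example: 2 parameters, broad prior
F = np.array([[40.0, 8.0], [8.0, 10.0]])      # likelihood Fisher matrix
mu_L = np.array([0.7, -1.0])                  # maximum-likelihood point
mu_pi = np.zeros(2)                           # prior mean
Sigma_pi = np.diag([1.0, 4.0])                # prior covariance
print(log_evidence_gaussian(mu_L, F, mu_pi, Sigma_pi))
```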
A general likelihood function for a future experiment (subscript $f$) can be Taylor-expanded around its maximum-likelihood value, $\theta_{\rm ML}$. By definition, at the maximum the first derivatives vanish, and the shape of the log-likelihood in parameter space is approximated by the Hessian matrix $H$,

$$ \ln L_f(\theta) \approx \ln L_f(\theta_{\rm ML}) + \frac{1}{2} \sum_{\alpha\beta} (\theta_\alpha - \theta^{\rm ML}_\alpha)\, H_{\alpha\beta}\, (\theta_\beta - \theta^{\rm ML}_\beta), \qquad H_{\alpha\beta} \equiv \left.\frac{\partial^2 \ln L_f}{\partial\theta_\alpha\, \partial\theta_\beta}\right|_{\theta_{\rm ML}}. $$

Averaging the (negative) Hessian over many data realizations yields the Fisher matrix, $F_{\alpha\beta} \equiv -\langle H_{\alpha\beta} \rangle$.
Once we have the Fisher matrix, we can give estimates for the accuracy on the parameters from a future measurement, by computing the posterior as in Eq. (5.2.2). If we are only interested in a subset of the parameters, we can marginalize easily over the others: computing the Gaussian integral over the unwanted parameters is equivalent to inverting the Fisher matrix, dropping from the inverse the rows and columns corresponding to those parameters (keeping only the rows and columns containing the parameters of interest), and inverting the smaller matrix back. The result is the marginalized Fisher matrix $\bar F$. For example, the $1\sigma$ error for parameter $\theta_\alpha$, marginalized over all other parameters, is simply given by $\sigma_\alpha = \sqrt{(F^{-1})_{\alpha\alpha}}$.
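In practice, this invert-select-reinvert recipe amounts to a few numpy calls. The sketch below uses an invented 3x3 Fisher matrix purely for illustration:

```python
import numpy as np

def marginalized_errors(F, keep):
    """Marginalize a Fisher matrix over all parameters not in `keep`:
    invert F, select the rows/columns of interest, and re-invert."""
    cov = np.linalg.inv(F)                    # full parameter covariance
    sub_cov = cov[np.ix_(keep, keep)]         # covariance of parameters of interest
    F_marg = np.linalg.inv(sub_cov)           # marginalized Fisher matrix
    sigmas = np.sqrt(np.diag(sub_cov))        # 1-sigma marginalized errors
    return F_marg, sigmas

# toy 3-parameter Fisher matrix; keep parameters 0 and 2
F = np.array([[120.0, 15.0, 30.0],
              [ 15.0, 80.0, 10.0],
              [ 30.0, 10.0, 50.0]])
F_marg, sigmas = marginalized_errors(F, keep=[0, 2])
print(sigmas)   # sigma_alpha = sqrt((F^-1)_alphaalpha)
```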
It remains to compute the Fisher matrix for the future experiment. This can be done analytically for the case where the likelihood function is approximately Gaussian in the data, which is a good approximation for many applications of interest. We can write for the log-likelihood (in the following, we drop the subscript denoting the experiment under consideration for simplicity of notation)

$$ -2\ln L(\theta) = \ln\det C + (d - \mu)^t\, C^{-1}\, (d - \mu) + \text{const}, $$

where $d$ is the data vector and the mean $\mu(\theta)$ and covariance $C(\theta)$ of the data may both depend on the parameters. Taking the expectation of the second derivatives of $-\ln L$ then yields the Fisher matrix,

$$ F_{\alpha\beta} = \frac{\partial \mu^t}{\partial\theta_\alpha}\, C^{-1}\, \frac{\partial \mu}{\partial\theta_\beta} + \frac{1}{2}\,\mathrm{Tr}\!\left[ C^{-1}\,\frac{\partial C}{\partial\theta_\alpha}\, C^{-1}\,\frac{\partial C}{\partial\theta_\beta} \right]. $$

The fact that the Fisher matrix depends only on expectation values and not on the particular data realization means that it can be computed from knowledge of the noise properties of the experiment, without having to go through the step of actually generating any simulated data. The specific form of the Fisher matrix then becomes a function of the type of observable being considered and of the experimental parameters.
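A minimal numerical sketch of this result is given below: it evaluates the mean and covariance terms of the Fisher matrix with finite-difference derivatives around the fiducial point. The toy model and all numbers are invented for illustration, not taken from any survey pipeline:

```python
import numpy as np

def fisher_gaussian_data(theta0, mean_fn, cov_fn, eps=1e-4):
    """Fisher matrix for Gaussian data with parameter-dependent mean and
    covariance: F_ab = mu,a' C^-1 mu,b + Tr[C^-1 C,a C^-1 C,b]/2,
    with central finite-difference derivatives at the fiducial theta0."""
    n = len(theta0)
    Cinv = np.linalg.inv(cov_fn(theta0))
    dmu, dC = [], []
    for a in range(n):
        step = np.zeros(n)
        step[a] = eps
        dmu.append((mean_fn(theta0 + step) - mean_fn(theta0 - step)) / (2 * eps))
        dC.append((cov_fn(theta0 + step) - cov_fn(theta0 - step)) / (2 * eps))
    F = np.empty((n, n))
    for a in range(n):
        for b in range(n):
            F[a, b] = (dmu[a] @ Cinv @ dmu[b]
                       + 0.5 * np.trace(Cinv @ dC[a] @ Cinv @ dC[b]))
    return F

# toy model: data mean depends linearly on 2 parameters, fixed noise covariance
mean_fn = lambda th: np.array([th[0] + th[1], th[0] - 2 * th[1], 3 * th[0]])
cov_fn = lambda th: 0.1 * np.eye(3)
print(fisher_gaussian_data(np.array([0.5, -1.0]), mean_fn, cov_fn))
```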
Explicit expressions for the Fisher matrix for cosmological observables can be found in  for cosmic microwave background data, in  for the matter power spectrum from galaxy redshift surveys (applied to baryonic acoustic oscillations in ), and in  for weak lensing. These approaches have been discussed in Section 1.7. A useful summary of Fisher matrix technology is given in the Dark Energy Task Force report  and in . A useful numerical package which includes several of the above calculations is the publicly available Matlab code Fisher4Cast [99, 98]. Attempts to include the modelling of systematic errors in this framework can be found in [508, 878, 505].
It has become customary to describe the statistical power of a future dark-energy probe by the inverse area enclosed by the 68% covariance ellipse marginalized down to the $(w_0, w_a)$ dark-energy parameter space. This measure of statistical performance for a given probe (widely known as the DETF FoM [21, 470]) is usually defined (up to multiplicative constants) as

$$ \mathrm{FoM} = \frac{1}{\sqrt{\det \mathrm{Cov}(w_0, w_a)}}, $$

where $\mathrm{Cov}(w_0, w_a)$ is the $2\times 2$ covariance matrix of $(w_0, w_a)$ after marginalization over all other parameters.
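In code, the DETF FoM follows directly from the marginalization procedure above. In this sketch the Fisher matrix is invented and the indices of $(w_0, w_a)$ are assumed known:

```python
import numpy as np

def detf_fom(F, i_w0, i_wa):
    """DETF figure of merit, up to multiplicative constants:
    1/sqrt(det Cov(w0, wa)), with the covariance marginalized over
    all other parameters by inverting the full Fisher matrix."""
    cov = np.linalg.inv(F)
    sub = cov[np.ix_([i_w0, i_wa], [i_w0, i_wa])]
    return 1.0 / np.sqrt(np.linalg.det(sub))

# toy 3-parameter Fisher matrix with (w0, wa) in slots 0 and 1
F = np.array([[ 900.0, -250.0,  40.0],
              [-250.0,  120.0, -15.0],
              [  40.0,  -15.0,  30.0]])
print(detf_fom(F, 0, 1))
```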
A closely related but more statistically motivated measure of the information gain is the Kullback–Leibler divergence (KL) between the posterior and the prior, representing the information gain obtained when upgrading the prior to the posterior via Bayes' theorem:

$$ D_{\mathrm{KL}} = \int p(\theta \mid d)\, \ln \frac{p(\theta \mid d)}{p(\theta)}\; \mathrm{d}\theta . $$
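For Gaussian prior and posterior (e.g., as obtained from a Fisher analysis), the KL divergence has a standard closed form, implemented in the sketch below with invented toy moments:

```python
import numpy as np

def kl_gaussian(m_post, S_post, m_prior, S_prior):
    """KL divergence D(posterior || prior) for multivariate Gaussians:
    the information gained (in nats) in going from prior to posterior."""
    n = len(m_post)
    S_prior_inv = np.linalg.inv(S_prior)
    d = m_prior - m_post
    return 0.5 * (np.trace(S_prior_inv @ S_post)
                  + d @ S_prior_inv @ d - n
                  + np.linalg.slogdet(S_prior)[1]
                  - np.linalg.slogdet(S_post)[1])

# toy 1D example: prior N(0, 1), posterior N(0.7, 0.05^2)
m_post, S_post = np.array([0.7]), np.array([[0.05**2]])
m_prior, S_prior = np.array([0.0]), np.array([[1.0]])
print(kl_gaussian(m_post, S_post, m_prior, S_prior))  # nats of information gained
```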
A discussion of other, alternative FoMs (D-optimality, A-optimality) can be found in . In  a different FoM for dark energy is suggested. For a set of DE parameters $\{f_i\}$, the FoM is defined as $1/\sqrt{\det C_f}$, where $C_f$ is the covariance matrix of the $f_i$. This definition is more flexible, since one can use it for any DE parametrization $\{f_i\}$.
Given that Euclid can constrain both the expansion history and the growth of structure, it is also useful to introduce a new FoM for the growth of perturbations. Similarly to the DETF FoM, one can define this new FoM as the inverse area of the 95% error ellipse in the plane spanned by the equation-of-state parameter $w$ and the growth index $\gamma$, where $\gamma$ is defined starting from the growth rate $f_g(z) \simeq \Omega_m(z)^{\gamma}$, or as $1/\sqrt{\det \mathrm{Cov}(w, \gamma)}$, or similar variants [614, 308]. Instead of $\gamma$, other parameters describing the growth can also be employed.
A FoM targeted at evaluating the robustness of a future probe to potential systematic errors has been introduced in . The robustness of a future probe is defined via the degree of overlap between the posterior distribution from that probe and the posterior from other, existing probes. The fundamental notion is that maximizing statistical power (e.g., by designing a future probe to deliver constraints orthogonal to those of current probes) will in general reduce its robustness (by increasing the probability of an incompatible result, for example because of systematic bias). Thus, in evaluating the strength of a probe, both its statistical power and its resilience to plausible systematics ought to be considered.
When considering the capabilities of future experiments, it is common practice to predict their performance in terms of constraints on relevant parameters, assuming a fiducial point in parameter space as the true model (often, the current best-fit model), as explained above. While this is a useful indicator for parameter-inference tasks, many questions in cosmology fall rather in the model-comparison category. Dark energy is a case in point, where the science driver for many future probes (including Euclid) is to detect possible departures from a cosmological constant, hence to gather evidence in favor of an evolving dark-energy model. Therefore, it is preferable to assess the capabilities of future experiments by their ability to answer model-selection questions.
The procedure is as follows (see  for details and the application to dark-energy scenarios). At every point in parameter space, mock data from the future observation are generated and the Bayes factor between the competing models is computed, for example between an evolving dark energy and a cosmological constant. Then one delimits in parameter space the region where the future data would not be able to deliver a clear model comparison verdict, for example $|\ln B_{01}| < 5$ (evidence falling short of the "strong" threshold). Here, $B_{01}$ is the Bayes factor, which is formed from the ratio of the Bayesian evidences of the two models being considered:

$$ B_{01} = \frac{p(d \mid \mathcal{M}_0)}{p(d \mid \mathcal{M}_1)} . $$
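The text does not prescribe how the evidences are to be computed. One convenient shortcut for nested models (such as a cosmological constant nested within an evolving dark-energy model at $w_0 = -1$, $w_a = 0$) is the Savage–Dickey density ratio, sketched here under the simplifying assumption of Gaussian marginal prior and posterior; all numbers are invented for illustration:

```python
import numpy as np

def log_gauss(x, m, S):
    """Log-density of a multivariate Gaussian N(m, S) at x."""
    d = x - m
    n = len(x)
    return -0.5 * (n * np.log(2 * np.pi) + np.linalg.slogdet(S)[1]
                   + d @ np.linalg.solve(S, d))

def ln_bayes_factor_sd(theta_star, m_post, S_post, m_prior, S_prior):
    """Savage-Dickey estimate of ln B_01 (simpler vs. extended model):
    log of posterior over prior density at the nested point.
    ln B > 0 favors the simpler (nested) model."""
    return (log_gauss(theta_star, m_post, S_post)
            - log_gauss(theta_star, m_prior, S_prior))

# nested point for LambdaCDM inside (w0, wa): w0 = -1, wa = 0
theta_star = np.array([-1.0, 0.0])
lnB = ln_bayes_factor_sd(theta_star,
                         m_post=np.array([-0.95, 0.1]),
                         S_post=np.diag([0.02, 0.1]) ** 2,
                         m_prior=np.array([-1.0, 0.0]),
                         S_prior=np.diag([0.5, 1.0]) ** 2)
print(lnB, "strong" if abs(lnB) >= 5 else "inconclusive region")
```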
Alternatively, we can investigate the full probability distribution for the Bayes factor from a future observation. This allows one to make probabilistic statements regarding the outcome of a future model comparison, and in particular to quantify the probability that a new observation will be able to achieve a certain level of evidence for one of the models, given current knowledge. This technique is based on the predictive distribution for a future observation, which gives the expected posterior for an observation with a certain set of experimental capabilities (further details are given in ). This method is called PPOD, for predictive posterior odds distribution, and can be useful in the context of experiment design and optimization.
Hybrid approaches have also been attempted, i.e., defining model-selection-oriented FoMs while working in the Fisher-matrix framework, such as the expected Bayesian evidence ratio [429, 31].
The most general approach to performance forecasting involves the use of a suitably defined utility function, and it has recently been presented in . Consider the different levels of uncertainty that are relevant when predicting the probability of a certain model selection outcome from a future probe, which can be summarized as follows:

- Level 1: the uncertainty about which theoretical model is the correct one (e.g., a cosmological constant versus an evolving dark energy);
- Level 2: within each model, the present-day uncertainty about the values of its parameters;
- Level 3: realization noise, i.e., the scatter of the future data around their expectation value, even for fixed model and parameter values.
The commonly-used Fisher matrix forecast ignores the uncertainty arising from Levels 1 and 2, as it assumes a fiducial model (Level 1) and fiducial parameter values (Level 2). It averages over realization noise (Level 3) in the limit of an infinite number of realizations. Clearly, the Fisher matrix procedure provides a very limited assessment of what we can expect for the scientific return of a future probe, as it ignores the uncertainty associated with the choice of model and parameter values.
The Bayesian framework allows improvement on the usual Fisher matrix error forecast thanks to a general procedure which fully accounts for all three levels of uncertainty given above. Following , we think of future data $D_f$ as outcomes, which arise as a consequence of our choice of experimental parameters $e$ (actions). For each action and each outcome, we define a utility function $U(D_f, e)$. Formally, the utility only depends on the future data realization $D_f$. However, as will become clear below, the data are realized from a fiducial model and model parameter values. Therefore, the utility function implicitly depends on the assumed model and parameters from which the data are generated. The best action is the one that maximizes the expected utility, i.e., the utility averaged over possible outcomes:

$$ \mathcal{E}[U \mid e] = \int \mathrm{d} D_f\; p(D_f \mid e, d)\; U(D_f, e), $$

where $d$ denotes the current data. The predictive distribution for the future data can be expanded over the model parameters $\theta$ as

$$ p(D_f \mid e, d) = \int \mathrm{d}\theta\; p(D_f \mid \theta, e, d)\, p(\theta \mid e, d) = \int \mathrm{d}\theta\; p(D_f \mid \theta, e)\, p(\theta \mid d), $$
where the last line follows because $p(D_f \mid \theta, e, d) = p(D_f \mid \theta, e)$ (conditioning on current data is irrelevant once the parameters are given) and $p(\theta \mid e, d) = p(\theta \mid d)$ (conditioning on future experimental parameters is irrelevant for the present-day posterior). So we can predict the probability distribution for future data by averaging the likelihood function for the future measurement (Level 3 uncertainty) over the current posterior on the parameters (Level 2 uncertainty). The expected utility then becomes

$$ \mathcal{E}[U \mid e] = \int \mathrm{d}\theta\; p(\theta \mid d) \int \mathrm{d} D_f\; p(D_f \mid \theta, e)\; U(D_f, e). \qquad (5.2.21) $$
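A minimal Monte Carlo sketch of Eq. (5.2.21) is given below: fiducial parameters are drawn from the current posterior (Level 2), future data realizations from the likelihood of the future probe (Level 3), and the utility is averaged over both. The one-parameter setup and the detection-style utility are invented for illustration:

```python
import numpy as np
rng = np.random.default_rng(1)

def expected_utility(e, post_samples, simulate, utility, n_real=100):
    """Monte Carlo estimate of the expected utility E[U|e]: average the
    utility over fiducial parameters drawn from the current posterior
    (Level 2) and future data realizations given those parameters (Level 3)."""
    return np.mean([utility(simulate(theta, e), e)
                    for theta in post_samples
                    for _ in range(n_real)])

# Toy setup (all numbers invented): current posterior on w is N(-0.95, 0.05^2);
# the future probe measures w with Gaussian noise sigma = e; the utility is 1
# if the measurement excludes w = -1 at 3 sigma, so E[U|e] is the probability
# of a 3-sigma detection of w != -1.
post_samples = rng.normal(-0.95, 0.05, size=500)
simulate = lambda theta, e: theta + rng.normal(0.0, e)
utility = lambda D_f, e: float(abs(D_f + 1.0) > 3 * e)
print(expected_utility(0.01, post_samples, simulate, utility))
```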
So far, we have tacitly assumed that only one model was being considered for the data. In practice, there will be several models that one is interested in testing (Level 1 uncertainty), and typically there is uncertainty over which one is best. This is in fact one of the main motivations for designing a new dark-energy probe. If $N$ models $\mathcal{M}_i$ are being considered, each one with parameter vector $\theta_i$ ($i = 1, \dots, N$), the current posterior can be further extended in terms of model averaging (Level 1), weighting each model by its current model posterior probability, $p(\mathcal{M}_i \mid d)$, obtaining from Eq. (5.2.21) the model-averaged expected utility

$$ \mathcal{E}[U \mid e] = \sum_{i=1}^{N} p(\mathcal{M}_i \mid d) \int \mathrm{d}\theta_i\; p(\theta_i \mid d, \mathcal{M}_i) \int \mathrm{d} D_f\; p(D_f \mid \theta_i, e, \mathcal{M}_i)\; U(D_f, e). \qquad (5.2.22) $$
This expected utility is the most general definition of a FoM for a future experiment characterized by experimental parameters $e$. The usual Fisher matrix forecast is recovered as a special case of Eq. (5.2.22), as are other ad hoc FoMs that have been defined in the literature. Therefore, Eq. (5.2.22) gives us a formalism to define in all generality the scientific return of a future experiment. This result clearly accounts for all three levels of uncertainty in making our predictions: the utility function (to be specified below) depends on the future data realization, $D_f$ (Level 3), which in turn is a function of the fiducial parameter values, $\theta_i$ (Level 2), and is averaged over present-day model probabilities (Level 1).
This approach is used in  to define two model-selection-oriented Figures of Merit: the decisiveness, which quantifies the probability that a probe will deliver a decisive result in favor of or against the cosmological constant, and the expected strength of evidence, which returns a measure of the expected power of a probe for model selection.