Suppose we have $M$ cosmological probes, whose likelihood function is assumed to be a multi-dimensional Gaussian, $\mathcal{L}_i(\Theta) \equiv p(D_i|\Theta)$, i.e.,

$$\mathcal{L}_i(\Theta) = \mathcal{L}_i^{0}\, \exp\!\left[-\frac{1}{2}(\Theta - \theta_i)^t\, L_i\, (\Theta - \theta_i)\right], \qquad (5.2.1)$$

where $\Theta$ are the parameters one is interested in constraining, $D_i$ are the available data from probe $i$ and $\theta_i$ is the location of the maximum likelihood value in parameter space. The matrix $L_i$ is the inverse of the covariance matrix of the parameters. The posterior distribution for the parameters from each probe, $p(\Theta|D_i)$, is obtained by Bayes' theorem as

$$p(\Theta|D_i) = \frac{p(\Theta)\,\mathcal{L}_i(\Theta)}{p(D_i)}, \qquad (5.2.2)$$

where $p(\Theta)$ is the prior and $p(D_i)$ is a normalizing constant (the Bayesian evidence). If we assume a Gaussian prior centered on the origin with inverse covariance matrix $\Sigma$, the posterior from each probe is also a Gaussian, with inverse covariance matrix

$$F_i = L_i + \Sigma \qquad (i = 1, \dots, M) \qquad (5.2.3)$$

and posterior mean

$$\bar{\theta}_i = F_i^{-1} L_i \theta_i. \qquad (5.2.4)$$

Tighter constraints on the parameters can usually be obtained by combining all available probes together (provided there are no systematics, see below). If we combine all probes together, we obtain a Gaussian posterior with inverse covariance matrix

$$F = \sum_i L_i + \Sigma \qquad (5.2.5)$$

and mean

$$\bar{\theta} = F^{-1} \sum_i L_i \theta_i. \qquad (5.2.6)$$

Notice that the precision of the posterior (i.e., the inverse covariance matrix) does not depend on the degree of overlap of the likelihoods from the individual probes. This is a property of the Gaussian linear model.

For future reference, it is also useful to write down the general expression for the Bayesian evidence. For a normal prior centered on the origin with inverse covariance matrix $\Sigma$ and a likelihood

$$\mathcal{L}(\Theta) = \mathcal{L}_0\, \exp\!\left[-\frac{1}{2}(\Theta - \theta_0)^t\, L\, (\Theta - \theta_0)\right], \qquad (5.2.7)$$

the evidence for data $D$ is given by

$$p(D) = \int \mathcal{L}(\Theta)\, p(\Theta)\, d\Theta = \mathcal{L}_0\, \frac{|\Sigma|^{1/2}}{|F|^{1/2}}\, \exp\!\left[-\frac{1}{2}\left(\theta_0^t L\, \theta_0 - \bar{\theta}^t F\, \bar{\theta}\right)\right], \qquad (5.2.8)$$

where $F$ is given by Eq. (5.2.5) with $M = 1$ and $\bar{\theta}$ by Eq. (5.2.6).
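As a toy numerical sketch of the Gaussian combination rule just described (all matrices and peak locations are invented for illustration): the combined posterior precision is the sum of the individual probe precisions plus the prior precision, and the combined mean is the precision-weighted average of the probe peaks.

```python
import numpy as np

# Toy illustration (numbers invented): combining two Gaussian "probes" in a
# 2-parameter space. F = L1 + L2 + Sigma is the combined posterior precision,
# and theta_bar solves F theta_bar = L1 theta1 + L2 theta2.

Sigma = np.diag([0.01, 0.01])                  # weak prior precision
L1 = np.array([[40.0, 10.0], [10.0, 20.0]])    # probe 1 precision (inverse covariance)
L2 = np.array([[15.0, -5.0], [-5.0, 30.0]])    # probe 2 precision
theta1 = np.array([0.30, -1.00])               # probe 1 maximum-likelihood point
theta2 = np.array([0.32, -0.90])               # probe 2 maximum-likelihood point

F = L1 + L2 + Sigma                            # combined posterior precision
theta_bar = np.linalg.solve(F, L1 @ theta1 + L2 @ theta2)  # combined posterior mean

print(theta_bar)
```

Note that the combined precision does not depend on how much the two likelihoods overlap, only on their individual precisions, as stated above.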

A general likelihood function for a future experiment (subscript $i$) can be Taylor-expanded around its maximum-likelihood value, $\theta_{\rm ML}$. By definition, at the maximum the first derivatives vanish, and the shape of the log-likelihood in parameter space is approximated by the Hessian matrix $H$,

$$\ln \mathcal{L}(\Theta) \approx \ln \mathcal{L}(\theta_{\rm ML}) + \frac{1}{2} \sum_{\alpha\beta} (\theta_\alpha - \theta_{{\rm ML},\alpha})\, H_{\alpha\beta}\, (\theta_\beta - \theta_{{\rm ML},\beta}), \qquad (5.2.9)$$

where $H_{\alpha\beta}$ is given by

$$H_{\alpha\beta} = \frac{\partial^2 \ln \mathcal{L}}{\partial \theta_\alpha \partial \theta_\beta}$$

and the derivatives are evaluated at the maximum-likelihood point. By taking the expectation of Eq. (5.2.9) with respect to many data realizations, we can replace the maximum-likelihood value with the true value, $\theta_*$, as the maximum-likelihood estimate is unbiased (in the absence of systematics), i.e., $\langle \theta_{\rm ML} \rangle = \theta_*$. We then define the Fisher information matrix as the expectation value of the Hessian,

$$F_{\alpha\beta} \equiv -\langle H_{\alpha\beta} \rangle. \qquad (5.2.10)$$

The inverse of the Fisher matrix, $F^{-1}$, is an estimate of the covariance matrix for the parameters, and it describes how fast the log-likelihood falls (on average) around the maximum-likelihood value. We thus recover the Gaussian expression for the likelihood, Eq. (5.2.1), with the maximum-likelihood value replaced by the true value of the parameters and the inverse covariance matrix given by the Fisher matrix [496]. In general, the derivatives depend on where in parameter space they are taken (except for the simple case of linear models), hence it is clear that $F$ is a function of the fiducial parameters.

Once we have the Fisher matrix, we can give estimates for the accuracy on the parameters from a future measurement, by computing the posterior as in Eq. (5.2.2). If we are only interested in a subset of the parameters, then we can marginalize easily over the others: computing the Gaussian integral over the unwanted parameters is equivalent to inverting the Fisher matrix, dropping from $F^{-1}$ the rows and columns corresponding to the unwanted parameters (keeping only the rows and columns containing the parameters of interest), and inverting the smaller matrix back. The result is the marginalized Fisher matrix. For example, the $1\sigma$ error for parameter $\alpha$ from experiment $i$, marginalized over all other parameters, is simply given by $\sigma_\alpha = \sqrt{(F_i^{-1})_{\alpha\alpha}}$.
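The marginalization recipe just described (invert, keep the sub-block of interest, invert back) can be sketched with toy numbers:

```python
import numpy as np

# Toy sketch (numbers invented): marginalizing a Fisher matrix over unwanted
# parameters. Invert F, keep the rows/columns of the parameters of interest,
# and invert back if the marginalized Fisher matrix itself is needed. The
# 1-sigma marginalized error on parameter alpha is sqrt((F^{-1})_aa).

F = np.array([[100.0, 20.0,  5.0],
              [ 20.0, 50.0, 10.0],
              [  5.0, 10.0, 30.0]])   # toy 3-parameter Fisher matrix

C = np.linalg.inv(F)                  # estimated parameter covariance
keep = [0, 1]                         # parameters of interest
C_marg = C[np.ix_(keep, keep)]        # marginalized covariance (sub-block)
F_marg = np.linalg.inv(C_marg)        # marginalized Fisher matrix
sigma_0 = np.sqrt(C[0, 0])            # marginalized error on parameter 0

print(F_marg, sigma_0)
```

Note that the marginalized error is never smaller than the conditional error $1/\sqrt{F_{\alpha\alpha}}$ obtained by fixing the other parameters.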

It remains to compute the Fisher matrix for the future experiment. This can be done analytically for the case where the likelihood function is approximately Gaussian in the data, which is a good approximation for many applications of interest. We can write for the log-likelihood (in the following, we drop the subscript denoting the experiment under consideration for simplicity of notation)

$$-2\ln\mathcal{L} = \ln\det C + (D - \mu)^t\, C^{-1}\, (D - \mu),$$

where $D$ are the (simulated) data that would be observed by the experiment and in general both the mean $\mu$ and covariance matrix $C$ may depend on the parameters $\Theta$ we are trying to estimate. The expectation value of the data corresponds to the true mean, $\langle D \rangle = \mu$, and similarly the expectation value of the data matrix $\mathcal{D} \equiv (D - \mu)(D - \mu)^t$ is equal to the true covariance, $\langle \mathcal{D} \rangle = C$. Then it can be shown (see, e.g., [884]) that the Fisher matrix is given by

$$F_{\alpha\beta} = \frac{1}{2}\,\mathrm{tr}\!\left[A_\alpha A_\beta\right] + \mu_{,\alpha}^t\, C^{-1}\, \mu_{,\beta}, \qquad (5.2.11)$$

where $A_\alpha \equiv C^{-1} C_{,\alpha}$ and the comma denotes a derivative with respect to the parameters, for example $\mu_{,\alpha} \equiv \partial\mu/\partial\theta_\alpha$. The fact that this expression depends only on expectation values and not on the particular data realization means that the Fisher matrix can be computed from knowledge of the noise properties of the experiment without having to go through the step of actually generating any simulated data. The specific form of the Fisher matrix then becomes a function of the type of observable being considered and of the experimental parameters. Explicit expressions for the Fisher matrix for cosmological observables can be found in [884]
for cosmic microwave background data, in [880] for the matter power spectrum from galaxy redshift surveys (applied to baryonic acoustic oscillations in [815]), and in [454] for weak lensing. These approaches have been discussed in Section 1.7. A useful summary of Fisher matrix technology is given in the Dark Energy Task Force report [21] and in [919]. A useful numerical package which includes several of the above calculations is the publicly available Matlab code Fisher4Cast [99, 98]. Attempts to include the modelling of systematic errors in this framework can be found in [508, 878, 505].
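As a minimal sketch of the statement that the Fisher matrix follows from the experiment's noise properties alone, consider a toy setup (not from the text): a linear model with a parameter-independent data covariance, for which only the mean-derivative term $\mu_{,\alpha}^t C^{-1} \mu_{,\beta}$ contributes and no simulated data realization is ever needed.

```python
import numpy as np

# Toy sketch: Fisher matrix for a Gaussian likelihood with fixed data
# covariance C, so F_ab = dmu_a^T C^{-1} dmu_b. The "experiment" measures
# y = theta0 + theta1 * x at 11 points with independent noise sigma; the
# derivatives of the mean are analytic for this linear model.

x = np.linspace(0.0, 1.0, 11)
sigma = 0.1
Cinv = np.eye(len(x)) / sigma**2       # inverse data covariance (white noise)

dmu = np.vstack([np.ones_like(x), x])  # dmu/dtheta0 and dmu/dtheta1
F = dmu @ Cinv @ dmu.T                 # 2x2 Fisher matrix

print(F)
```

No data vector appears anywhere: the forecast depends only on the sampling points and the noise level, illustrating the remark above.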

It has become customary to describe the statistical power of a future dark-energy probe by the inverse area enclosed by the 68% covariance ellipse marginalized down to the dark-energy parameter space. This measure of statistical performance for probe $i$ (widely known as the DETF FoM [21, 470]) is usually defined (up to multiplicative constants) as

$$\mathrm{FoM}_i = \frac{1}{\sqrt{\det F_i^{-1}}},$$

where the Fisher matrix $F_i$ is given in Eq. (5.2.11), marginalized down to the dark-energy parameters. [21] suggested using the inverse area of the 95% error ellipse of $(w_0, w_a)$ (where $w_0$ and $w_a$ are defined in [584, 229]). This definition was inspired by [470]. In [22] it is suggested to model $w(z)$ as piecewise-constant values $w_i$ defined in many small redshift bins. The suggestion is then to apply a principal-component approach [468] in order to understand the redshifts at which each experiment has the power to constrain $w(z)$.

A closely related but more statistically motivated measure of the information gain is the Kullback–Leibler divergence (KL) between the posterior and the prior, representing the information gain obtained when upgrading the prior to the posterior via Bayes' theorem:

$$D_{\mathrm{KL}} \equiv \int p(\Theta|D)\, \ln\frac{p(\Theta|D)}{p(\Theta)}\, d\Theta.$$

The KL divergence measures the relative entropy between the two distributions: it is a dimensionless quantity which expresses the information gain obtained via the likelihood. For the Gaussian likelihood and prior introduced above, the information gain (w.r.t. the prior $\Sigma$) from the combination of all probes is given by [900]

$$D_{\mathrm{KL}} = \frac{1}{2}\left[\ln\frac{|F|}{|\Sigma|} - n + \mathrm{tr}\!\left(\Sigma F^{-1}\right) + \bar{\theta}^t\, \Sigma\, \bar{\theta}\right],$$

where $n$ is the number of parameters.

A discussion of other, alternative FoMs (D-optimality, A-optimality) can be found in [96]. In [939] a different FoM for dark energy is suggested: for a set of DE parameters $\theta$, the FoM is defined as $1/\sqrt{\det C}$, where $C$ is the covariance matrix of $\theta$. This definition is more flexible, since one can use it for any DE parametrization [945].
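A numerical sketch of the Gaussian information gain, using the standard closed form for the KL divergence between two Gaussians (toy numbers; the notation assumes, as above, a prior of precision $\Sigma$ centered on the origin and a posterior of precision $F$ and mean $\bar\theta$):

```python
import numpy as np

# Sketch (toy numbers): information gain as the KL divergence between a
# Gaussian posterior N(theta_bar, F^{-1}) and a Gaussian prior N(0, Sigma^{-1}),
# via the standard closed form:
# D_KL = 0.5 * [ln(|F|/|Sigma|) - n + tr(Sigma F^{-1}) + theta_bar^T Sigma theta_bar]

def gaussian_kl(F, Sigma, theta_bar):
    n = F.shape[0]
    Finv = np.linalg.inv(F)
    _, logdetF = np.linalg.slogdet(F)
    _, logdetS = np.linalg.slogdet(Sigma)
    return 0.5 * (logdetF - logdetS - n + np.trace(Sigma @ Finv)
                  + theta_bar @ Sigma @ theta_bar)

Sigma = np.diag([0.5, 0.5])                  # toy prior precision
F = np.array([[60.0, 5.0], [5.0, 40.0]])     # toy posterior precision
print(gaussian_kl(F, Sigma, np.array([0.3, -0.9])))
```

When posterior and prior coincide the information gain vanishes, as it should; any actual constraint yields a positive gain.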

Given that Euclid can constrain both the expansion history and the growth of structure, it is also useful to introduce a new FoM for the growth of perturbations. Similarly to the DETF FoM, one can define this new FoM as the inverse area of the 95% error ellipse spanned by the growth index $\gamma$ and a dark-energy parameter, where $\gamma$ is defined starting from the growth rate $f_g \simeq \Omega_m(z)^{\gamma}$, or via similar variants [614, 308]. Instead of $\gamma$, other parameters describing the growth can also be employed.
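Inverse-area FoMs of the kind discussed above (DETF or growth) can be sketched numerically; the covariance values below are invented purely for illustration:

```python
import numpy as np

# Toy sketch of an inverse-area figure of merit: for a 2x2 marginalized
# covariance C of a parameter pair (e.g., (w0, wa) for the DETF FoM, or a
# growth-parameter pair), the error-ellipse area is proportional to
# sqrt(det C), so FoM ~ 1/sqrt(det C) up to multiplicative constants.

C_w = np.array([[ 0.04, -0.12],
                [-0.12,  0.49]])   # toy marginalized covariance
fom = 1.0 / np.sqrt(np.linalg.det(C_w))
print(fom)
```

Shrinking all errors by a factor of 2 (covariance by 4) quadruples this FoM, matching the inverse-area scaling.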

A FoM targeted at evaluating the robustness of a future probe to potential systematic errors has been introduced in [625]. The robustness of a future probe is defined via the degree of overlap between the posterior distribution from that probe and the posterior from other, existing probes. The fundamental notion is that maximising statistical power (e.g., by designing a future probe to deliver orthogonal constraints w.r.t. current probes) will in general reduce its robustness, by increasing the probability of an incompatible result, for example because of systematic bias. Thus, in evaluating the strength of a probe, both its statistical power and its resilience to plausible systematics ought to be considered.

When considering the capabilities of future experiments, it is common practice to predict their performance in terms of constraints on relevant parameters, assuming a fiducial point in parameter space as the true model (often, the current best-fit model), as explained above. While this is a useful indicator for parameter-inference tasks, many questions in cosmology fall rather into the model-comparison category. Dark energy is a case in point, where the science driver for many future probes (including Euclid) is to detect possible departures from a cosmological constant, hence to gather evidence in favor of an evolving dark-energy model. Therefore, it is preferable to assess the capabilities of future experiments by their ability to answer model-selection questions.

The procedure is as follows (see [677] for details and the application to dark-energy scenarios). At every point in parameter space, mock data from the future observation are generated and the Bayes factor between the competing models is computed, for example between an evolving dark energy and a cosmological constant. Then one delimits in parameter space the region where the future data would not be able to deliver a clear model-comparison verdict, for example $|\ln B| < 5$ (evidence falling short of the "strong" threshold). Here, $B$ is the Bayes factor, which is formed from the ratio of the Bayesian evidences of the two models being considered:

$$B = \frac{p(d|M_0)}{p(d|M_1)},$$

where the Bayesian evidence is the average of the likelihood under the prior in each model (denoted by a subscript $m$):

$$p(d|M_m) = \int d\Theta_m\, p(d|\Theta_m, M_m)\, p(\Theta_m|M_m).$$

The Bayes factor updates the prior probability ratio of the models to the posterior one, indicating the extent to which the data have modified one's original view on the relative probabilities of the two models. The experiment with the smallest "model-confusion" volume in parameter space is to be preferred, since it achieves the highest discriminative power between models. An application of a related technique to the spectral index from the Planck satellite is presented in [704, 703].

Alternatively, we can investigate the full probability distribution for the Bayes factor from a future observation. This allows one to make probabilistic statements regarding the outcome of a future model comparison, and in particular to quantify the probability that a new observation will be able to achieve a certain level of evidence for one of the models, given current knowledge. This technique is based on the predictive distribution for a future observation, which gives the expected posterior for an observation with a certain set of experimental capabilities (further details are given in [895]). This method is called PPOD, for predictive posterior odds distribution, and can be useful in the context of experiment design and optimization.
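A one-dimensional toy sketch of a Bayes factor (the nested-model setup and all numbers are illustrative assumptions, not the text's actual computation): model $M_0$ fixes the parameter to zero, while $M_1$ lets it vary under a Gaussian prior, so $M_1$'s evidence follows the Gaussian-evidence expression quoted earlier in this section.

```python
import numpy as np

# 1D toy sketch: ln Bayes factor between M0 (parameter fixed to 0) and M1
# (parameter free with Gaussian prior of precision Sigma). The likelihood is
# Gaussian with precision L and peak theta0. M1's evidence uses
# p(d) = L0 * sqrt(Sigma/F) * exp(-(L*theta0^2 - F*theta_bar^2)/2),
# with F = L + Sigma and theta_bar = L*theta0/F; the common factor L0 cancels.

L = 100.0        # likelihood precision (1/sigma^2)
theta0 = 0.05    # maximum-likelihood value
Sigma = 1.0      # prior precision under M1

log_ev_0 = -0.5 * L * theta0**2            # M0: likelihood evaluated at 0
F = L + Sigma
theta_bar = L * theta0 / F
log_ev_1 = 0.5 * (np.log(Sigma) - np.log(F)) - 0.5 * (L * theta0**2 - F * theta_bar**2)

ln_B01 = log_ev_0 - log_ev_1
print(ln_B01)    # positive here: the data mildly favor the simpler model (Occam effect)
```

For this nested setup the result agrees with the Savage–Dickey expression $\ln B_{01} = \tfrac12\ln(F/\Sigma) - \tfrac12 F\bar\theta^2$, a useful cross-check.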

Hybrid approaches have also been attempted, i.e., defining model-selection oriented FoMs while working in the Fisher-matrix framework, such as the expected Bayesian evidence ratio [429, 31].

The most general approach to performance forecasting involves the use of a suitably defined utility function, and it has recently been presented in [899]. Consider the different levels of uncertainty that are relevant when predicting the probability of a certain model selection outcome from a future probe, which can be summarized as follows:

- Level 1: current uncertainty about the correct model (e.g., is it a cosmological constant or a dark-energy model?).
- Level 2: present-day uncertainty in the value of the cosmological parameters for a given model (e.g., present error on the dark-energy equation of state parameters assuming an evolving dark-energy model).
- Level 3: realization noise, which will be present in future data even when assuming a model and a fiducial choice for its parameters.

The commonly-used Fisher matrix forecast ignores the uncertainty arising from Levels 1 and 2, as it assumes a fiducial model (Level 1) and fiducial parameter values (Level 2). It averages over realization noise (Level 3) in the limit of an infinite number of realizations. Clearly, the Fisher matrix procedure provides a very limited assessment of what we can expect for the scientific return of a future probe, as it ignores the uncertainty associated with the choice of model and parameter values.

The Bayesian framework allows improvement on the usual Fisher matrix error forecast thanks to a general procedure which fully accounts for all three levels of uncertainty given above. Following [590], we think of future data $D_e$ as outcomes, which arise as consequence of our choice of experimental parameters $e$ (actions). For each action and each outcome, we define a utility function $U(D_e, e)$. Formally, the utility only depends on the future data realization $D_e$. However, as will become clear below, the data are realized from a fiducial model and model parameter values. Therefore, the utility function implicitly depends on the assumed model and parameters from which the data are generated. The best action is the one that maximizes the expected utility, i.e., the utility averaged over possible outcomes:

$$\mathcal{E}[U|e, d] \equiv \int dD_e\, U(D_e, e)\, p(D_e|e, d).$$

Here, $p(D_e|e, d)$ is the predictive distribution for the future data, conditional on the experimental setup ($e$) and on current data ($d$). For a single fixed model the predictive distribution is given by

$$p(D_e|e, d) = \int d\Theta\, p(D_e, \Theta|e, d) = \int d\Theta\, p(D_e|\Theta, e, d)\, p(\Theta|e, d) = \int d\Theta\, p(D_e|\Theta, e)\, p(\Theta|d),$$

where the last line follows because $p(D_e|\Theta, e, d) = p(D_e|\Theta, e)$ (conditioning on current data is irrelevant once the parameters are given) and $p(\Theta|e, d) = p(\Theta|d)$ (conditioning on future experimental parameters is irrelevant for the present-day posterior). So we can predict the probability distribution for future data by averaging the likelihood function for the future measurement (Level 3 uncertainty) over the current posterior on the parameters (Level 2 uncertainty). The expected utility then becomes

$$\mathcal{E}[U|e, d] = \int d\Theta\, p(\Theta|d) \int dD_e\, p(D_e|\Theta, e)\, U(D_e, e). \qquad (5.2.21)$$
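The two nested averages in Eq. (5.2.21) can be sketched as a Monte Carlo over a toy one-dimensional setup (the numbers and the quadratic "utility" below are invented for illustration): parameter values are drawn from the current posterior (Level 2), future data are simulated around them (Level 3), and the utility is averaged over both.

```python
import numpy as np

# Monte Carlo sketch (toy 1D setup, not the text's actual pipeline) of the
# expected utility: draw theta from the current posterior (Level 2), simulate
# future data realizations around it (Level 3), and average a utility over
# both. Here the "utility" is minus the squared measurement error, purely
# for illustration.

rng = np.random.default_rng(0)

post_mean, post_std = -1.0, 0.1    # toy current posterior on a parameter
noise_future = 0.05                # future experiment's noise level (plays the role of "e")

theta = rng.normal(post_mean, post_std, size=100_000)  # Level 2 draws
D_future = rng.normal(theta, noise_future)             # Level 3 draws
utility = -(D_future - theta)**2
expected_utility = utility.mean()

print(expected_utility)
```

For this choice the expected utility converges to minus the future noise variance, so experiments with smaller noise score higher, as one would demand of a sensible FoM.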
So far, we have tacitly assumed that only one model was being considered for the data. In practice, there will be several models that one is interested in testing (Level 1 uncertainty), and typically there is uncertainty over which one is best. This is in fact one of the main motivations for designing a new dark-energy probe. If several models $M_m$ are being considered, each one with parameter vector $\Theta_m$, the current posterior can be further extended in terms of model averaging (Level 1), weighting each model by its current model posterior probability, $p(M_m|d)$, obtaining from Eq. (5.2.21) the model-averaged expected utility

$$\mathcal{E}[U|e, d] = \sum_m p(M_m|d) \int d\Theta_m\, p(\Theta_m|d, M_m) \int dD_e\, p(D_e|\Theta_m, e, M_m)\, U(D_e, e). \qquad (5.2.22)$$

This expected utility is the most general definition of a FoM for a future experiment characterized by experimental parameters $e$. The usual Fisher matrix forecast is recovered as a special case of Eq. (5.2.22), as are other ad hoc FoMs that have been defined in the literature. Therefore, Eq. (5.2.22) gives us a formalism to define in all generality the scientific return of a future experiment. This result clearly accounts for all three levels of uncertainty in making our predictions: the utility function (to be specified below) depends on the future data realization, $D_e$ (Level 3), which in turn is a function of the fiducial parameter values, $\Theta_m$ (Level 2), and is averaged over present-day model probabilities (Level 1).

This approach is used in [899] to define two model-selection oriented Figures of Merit: the decisiveness, which quantifies the probability that a probe will deliver a decisive result in favor of or against the cosmological constant, and the expected strength of evidence, which returns a measure of the expected power of a probe for model selection.

Living Rev. Relativity 16, (2013), 6
http://www.livingreviews.org/lrr-2013-6
This work is licensed under a Creative Commons License.