"Massive Gravity"
Claudia de Rham 
1 Introduction
2 Massive and Interacting Fields
2.1 Proca field
2.2 Spin-2 field
2.3 From linearized diffeomorphism to full diffeomorphism invariance
2.4 Non-linear Stückelberg decomposition
2.5 Boulware–Deser ghost
I Massive Gravity from Extra Dimensions
3 Higher-Dimensional Scenarios
4 The Dvali–Gabadadze–Porrati Model
4.1 Gravity induced on a brane
4.2 Brane-bending mode
4.3 Phenomenology of DGP
4.4 Self-acceleration branch
4.5 Degravitation
5 Deconstruction
5.1 Formalism
5.2 Ghost-free massive gravity
5.3 Multi-gravity
5.4 Bi-gravity
5.5 Coupling to matter
5.6 No new kinetic interactions
II Ghost-free Massive Gravity
6 Massive, Bi- and Multi-Gravity Formulation: A Summary
7 Evading the BD Ghost in Massive Gravity
7.1 ADM formulation
7.2 Absence of ghost in the Stückelberg language
7.3 Absence of ghost in the vielbein formulation
7.4 Absence of ghosts in multi-gravity
8 Decoupling Limits
8.1 Scaling versus decoupling
8.2 Massive gravity as a decoupling limit of bi-gravity
8.3 Decoupling limit of massive gravity
8.4 Λ3-decoupling limit of bi-gravity
9 Extensions of Ghost-free Massive Gravity
9.1 Mass-varying
9.2 Quasi-dilaton
9.3 Partially massless
10 Massive Gravity Field Theory
10.1 Vainshtein mechanism
10.2 Validity of the EFT
10.3 Non-renormalization
10.4 Quantum corrections beyond the decoupling limit
10.5 Strong coupling scale vs cutoff
10.6 Superluminalities and (a)causality
10.7 Galileon duality
III Phenomenological Aspects of Ghost-free Massive Gravity
11 Phenomenology
11.1 Gravitational waves
11.2 Solar system
11.3 Lensing
11.4 Pulsars
11.5 Black holes
12 Cosmology
12.1 Cosmology in the decoupling limit
12.2 FLRW solutions in the full theory
12.3 Inhomogeneous/anisotropic cosmological solutions
12.4 Massive gravity on FLRW and bi-gravity
12.5 Other proposals for cosmological solutions
IV Other Theories of Massive Gravity
13 New Massive Gravity
13.1 Formulation
13.2 Absence of Boulware–Deser ghost
13.3 Decoupling limit of new massive gravity
13.4 Connection with bi-gravity
13.5 3D massive gravity extensions
13.6 Other 3D theories
13.7 Black holes and other exact solutions
13.8 New massive gravity holography
13.9 Zwei-dreibein gravity
14 Lorentz-Violating Massive Gravity
14.1 SO(3)-invariant mass terms
14.2 Phase m1 = 0
14.3 General massive gravity (m0 = 0)
15 Non-local massive gravity
16 Outlook

2 Massive and Interacting Fields

2.1 Proca field

2.1.1 Maxwell kinetic term

Before jumping into the subtleties of massive spin-2 fields and gravity in general, we start this review with massless and massive spin-1 fields as a warm-up. Consider a Lorentz vector field $A_\mu$ living on a four-dimensional Minkowski manifold. We focus this discussion on four dimensions; the extension to d dimensions is straightforward. Restricting ourselves to Lorentz-invariant and local actions for now, the kinetic term can be decomposed into three possible contributions:

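\[ \mathcal{L}^{\rm spin-1}_{\rm kin} = a_1 \mathcal{L}_1 + a_2 \mathcal{L}_2 + a_3 \mathcal{L}_3, \tag{2.1} \]
where $a_{1,2,3}$ are so far arbitrary dimensionless coefficients and the possible kinetic terms are given by
\begin{align}
\mathcal{L}_1 &= \partial_\mu A^\nu \partial^\mu A_\nu \tag{2.2} \\
\mathcal{L}_2 &= \partial_\mu A^\mu \partial_\nu A^\nu \tag{2.3} \\
\mathcal{L}_3 &= \partial_\mu A^\nu \partial_\nu A^\mu, \tag{2.4}
\end{align}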
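where in this section, indices are raised and lowered with respect to the flat Minkowski metric. The second and third contributions are equivalent up to a boundary term (integrating by parts twice turns $\partial_\mu A^\nu \partial_\nu A^\mu$ into $\partial_\mu A^\mu \partial_\nu A^\nu$), so we set $a_3 = 0$ without loss of generality.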

We now proceed to establish the behavior of the different degrees of freedom (dofs) present in this theory. A priori, a Lorentz vector field $A_\mu$ in four dimensions could have up to four dofs, which we can split as a transverse contribution $A^\perp_\mu$ satisfying $\partial^\mu A^\perp_\mu = 0$, bearing a priori three dofs, and a longitudinal mode $\chi$, with $A_\mu = A^\perp_\mu + \partial_\mu \chi$.

Helicity-0 mode

Focusing on the longitudinal (or helicity-0) mode χ, the kinetic term takes the form

\[ \mathcal{L}^{\chi}_{\rm kin} = (a_1 + a_2)\, \partial_\mu \partial_\nu \chi\, \partial^\mu \partial^\nu \chi = (a_1 + a_2) (\Box \chi)^2, \tag{2.5} \]
where $\Box = \eta^{\mu\nu} \partial_\mu \partial_\nu$ represents the d’Alembertian in flat Minkowski space and the second equality holds after integrations by parts. We directly see that unless $a_1 = -a_2$, the kinetic term for the field $\chi$ bears higher time (and space) derivatives. As a well-known consequence of Ostrogradsky’s theorem [421], two dofs are actually hidden in $\chi$ with an opposite sign kinetic term. This can be seen by expressing the propagator $\Box^{-2}$ as the sum of two propagators with opposite signs:
\[ \frac{1}{\Box^2} = \lim_{m \to 0} \frac{1}{2m^2} \left( \frac{1}{\Box - m^2} - \frac{1}{\Box + m^2} \right), \tag{2.6} \]
signaling that one of the modes always couples the wrong way to external sources. The mass m of this mode is arbitrarily low, which implies that the theory (2.1) with $a_3 = 0$ and $a_1 + a_2 \neq 0$ is always sick. Alternatively, one can see the appearance of the Ostrogradsky instability by introducing a Lagrange multiplier $\tilde\chi(x)$, so that the kinetic action (2.5) for $\chi$ is equivalent to
\[ \mathcal{L}^{\chi}_{\rm kin} = (a_1 + a_2) \left( \tilde\chi \Box \chi - \frac{1}{4} \tilde\chi^2 \right), \tag{2.7} \]
after integrating out the Lagrange multiplier $\tilde\chi \equiv 2 \Box \chi$. We can now perform the change of variables $\chi = \phi_1 + \phi_2$ and $\tilde\chi = \phi_1 - \phi_2$, giving the resulting Lagrangian for the two scalar fields $\phi_{1,2}$
\[ \mathcal{L}^{\chi}_{\rm kin} = (a_1 + a_2) \left( \phi_1 \Box \phi_1 - \phi_2 \Box \phi_2 - \frac{1}{4} (\phi_1 - \phi_2)^2 \right). \tag{2.8} \]
As a result, the two scalar fields $\phi_{1,2}$ always enter with opposite kinetic terms, signaling that one of them is always a ghost. The only way to prevent this generic pathology is to make the specific choice $a_1 + a_2 = 0$, which corresponds to the well-known Maxwell kinetic term.
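The partial-fraction identity behind (2.6) can be checked with exact rational arithmetic. The sketch below is only an illustration: it treats the d’Alembertian as a number `box` (one of its eigenvalues), and the function names and sample values are assumptions made for the example, not part of the original text.

```python
from fractions import Fraction

def double_pole(box, m2):
    """Bracket of (2.6) at finite m: two simple poles with opposite-sign residues."""
    return (1 / (box - m2) - 1 / (box + m2)) / (2 * m2)

def regulated_target(box, m2):
    """1 / (box^2 - m^4), which reduces to 1 / box^2 when m -> 0."""
    return 1 / (box * box - m2 * m2)

# `box` stands for an eigenvalue of the d'Alembertian and `m2` for m^2 (sample values).
for box in (Fraction(7), Fraction(-5, 3)):
    for m2 in (Fraction(1, 10), Fraction(1, 1000)):
        assert double_pole(box, m2) == regulated_target(box, m2)
```

The two poles carry residues of opposite sign for every value of m, which is the algebraic counterpart of the statement that one of the two hidden modes is a ghost.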

Helicity-1 mode and gauge symmetry

Now that the form of the local and covariant kinetic term has been uniquely established by the requirement that no ghost rides on top of the helicity-0 mode, we focus on the remaining transverse mode $A^\perp_\mu$,

\[ \mathcal{L}^{\rm helicity-1}_{\rm kin} = a_1 \left( \partial_\mu A^\perp_\nu \right)^2, \tag{2.9} \]
which has the correct normalization if $a_1 = -1/2$. As a result, the only possible local kinetic term for a spin-1 field is the Maxwell one:
\[ \mathcal{L}^{\rm spin-1}_{\rm kin} = -\frac{1}{4} F_{\mu\nu}^2 \tag{2.10} \]
with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. Restricting ourselves to a massless spin-1 field (with no potential and other interactions), the resulting Maxwell theory satisfies the following U(1) gauge symmetry:
\[ A_\mu \to A_\mu + \partial_\mu \xi. \tag{2.11} \]
This gauge symmetry projects out two of the naive four degrees of freedom. This can be seen at the level of the Lagrangian directly, where the gauge symmetry (2.11) allows us to fix the gauge of our choice. For convenience, we perform a (3 + 1)-split and choose Coulomb gauge $\partial_i A^i = 0$, so that only two dofs are present in $A_i$, i.e., $A_i$ contains no longitudinal mode: $A_i = A^t_i + \partial_i A^l$, with $\partial^i A^t_i = 0$, and the Coulomb gauge sets the longitudinal mode $A^l = 0$. The time-component $A^0$ does not exhibit a kinetic term,
\[ \mathcal{L}^{\rm spin-1}_{\rm kin} = \frac{1}{2} (\partial_t A_i)^2 - \frac{1}{2} (\partial_i A^0)^2 - \frac{1}{4} (\partial_i A_j)^2, \tag{2.12} \]
and appears instead as a Lagrange multiplier imposing the constraint
\[ \partial_i \partial^i A^0 \equiv 0. \tag{2.13} \]
The Maxwell action has therefore only two propagating dofs in $A^t_i$,
\[ \mathcal{L}^{\rm spin-1}_{\rm kin} = -\frac{1}{2} (\partial_\mu A^t_i)^2. \tag{2.14} \]
To summarize, the Maxwell kinetic term for a vector field, and the fact that a massless vector field in four dimensions only propagates 2 dofs, are not a choice but are imposed upon us by the requirement that no ghost rides along with the helicity-0 mode. The resulting theory is enriched by a U(1) gauge symmetry, which in turn freezes the helicity-0 mode when no mass term is present. We now ‘promote’ the theory to a massive vector field.

2.1.2 Proca mass term

Starting with the Maxwell action, we consider a covariant mass term $A_\mu A^\mu$ corresponding to the Proca action

\[ \mathcal{L}_{\rm Proca} = -\frac{1}{4} F_{\mu\nu}^2 - \frac{1}{2} m^2 A_\mu A^\mu, \tag{2.15} \]
and emphasize that the presence of a mass term does not change the fact that the kinetic term has been uniquely fixed by the requirement of the absence of ghost. An immediate consequence of the Proca mass term is the breaking of the U(1) gauge symmetry (2.11), so that the Coulomb gauge can no longer be chosen and the longitudinal mode is now dynamical. To see this, let us use the previous decomposition $A_\mu = A^\perp_\mu + \partial_\mu \hat\chi$ and notice that the mass term now introduces a kinetic term for the helicity-0 mode $\chi = m \hat\chi$,
\[ \mathcal{L}_{\rm Proca} = -\frac{1}{2} (\partial_\mu A^\perp_\nu)^2 - \frac{1}{2} m^2 (A^\perp_\mu)^2 - \frac{1}{2} (\partial_\mu \chi)^2. \tag{2.16} \]
A massive vector field thus propagates three dofs, namely two in the transverse modes $A^\perp_\mu$ and one in the longitudinal mode $\chi$. Physically, this can be understood by the fact that a massive vector field does not propagate along the light-cone, and the fluctuations along the line of propagation correspond to an additional physical dof.

Before moving to the Abelian Higgs mechanism, which provides a dynamical way to give a mass to bosons, we first comment on the discontinuity in number of dofs between the massive and massless case. When considering the Proca action (2.16) with the properly normalized fields $A^\perp_\mu$ and $\chi$, one does not recover the massless Maxwell action (2.9) or (2.10) when sending the boson mass m → 0. A priori, this seems to signal the presence of a discontinuity which would allow us to distinguish between, for instance, a massless photon and a massive one no matter how tiny the mass. In practice, however, the difference is physically indistinguishable so long as the photon couples to external sources in a way which respects the U(1) symmetry. Note however that quantum anomalies remain sensitive to the mass of the field so the discontinuity is still present at this level, see Refs. [197, 204].

To physically tell the difference between a massless vector field and a massive one with tiny mass, one has to probe the system, or in other words include interactions with external sources

\[ \mathcal{L}_{\rm sources} = -A_\mu J^\mu. \tag{2.17} \]
The U(1) symmetry present in the massless case is preserved only if the external sources are conserved, $\partial_\mu J^\mu = 0$. Such a source produces a vector field which satisfies
\[ \Box A^\perp_\mu = J_\mu \tag{2.18} \]
in the massless case. The exchange amplitude between two conserved sources $J_\mu$ and $J'_\mu$ mediated by a massless vector field is given by
\[ \mathcal{A}^{\rm massless}_{JJ'} = \int d^4x\, A^\perp_\mu J'^\mu = \int d^4x\, J'^\mu \frac{1}{\Box} J_\mu. \tag{2.19} \]
On the other hand, if the vector field is massive, its response to the source $J_\mu$ is instead
\[ (\Box - m^2) A^\perp_\mu = J_\mu \quad \text{and} \quad \Box \chi = 0. \tag{2.20} \]
In that case, one needs to consider both the transverse and the longitudinal modes of the vector field in the exchange amplitude between the two sources $J_\mu$ and $J'_\mu$. Fortunately, a conserved source does not excite the longitudinal mode and the exchange amplitude is uniquely given by the transverse mode,
\[ \mathcal{A}^{\rm massive}_{JJ'} = \int d^4x \left( A^\perp_\mu + \partial_\mu \chi \right) J'^\mu = \int d^4x\, J'^\mu \frac{1}{\Box - m^2} J_\mu. \tag{2.21} \]
As a result, the exchange amplitude between two conserved sources is the same in the limit m → 0 no matter whether the vector field is intrinsically massive and propagates 3 dofs or if it is massless and only propagates 2 modes. It is, therefore, impossible to probe the difference between an exactly massless vector field and a massive one with arbitrarily small mass.

Notice that in the massive case no U(1) symmetry is present and the source need not be conserved. However, the previous argument remains unchanged so long as $\partial_\mu J^\mu$ goes to zero in the massless limit at least as quickly as the mass itself. If this condition is violated, then the helicity-0 mode ought to be included in the exchange amplitude (2.21). In parallel, in the massless case the non-conserved source provides a new kinetic term for the longitudinal mode which then becomes dynamical.
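The role of source conservation can be illustrated in momentum space. The sketch below assumes the standard Proca propagator numerator $\eta_{\mu\nu} + p_\mu p_\nu / m^2$ in mostly-plus signature; this form is not written out in the text above and is used here only as an assumption, and the sample vectors and function names are chosen for the example.

```python
from fractions import Fraction

ETA = (-1, 1, 1, 1)  # diagonal of the mostly-plus Minkowski metric

def dot(a, b):
    """Lorentz-invariant product a.b of two four-vectors with upper indices."""
    return sum(e * x * y for e, x, y in zip(ETA, a, b))

def proca_numerator(j_prime, j, p, m2):
    """J'^mu (eta_{mu nu} + p_mu p_nu / m^2) J^nu, the tensor structure of the
    massive-vector exchange (assumed standard Proca numerator)."""
    return dot(j_prime, j) + Fraction(dot(j_prime, p) * dot(p, j)) / m2

p = (1, 2, 0, 0)        # sample momentum
j = (2, 1, 0, 0)        # conserved source: p.j = 0
j_prime = (4, 2, 3, 0)  # conserved source: p.j' = 0
k = (1, 0, 0, 0)        # non-conserved source: p.k = -1

# Conserved sources: the 1/m^2 term drops out for every mass.
for m2 in (Fraction(1), Fraction(1, 10**6)):
    assert proca_numerator(j_prime, j, p, m2) == dot(j_prime, j)

# Non-conserved source: the longitudinal piece grows like 1/m^2.
print(proca_numerator(k, k, p, Fraction(1, 100)))  # 99
```

For the conserved pair the result is independent of m, so only the overall factor $1/(\Box - m^2)$ distinguishes the massive from the massless exchange; for the non-conserved source the longitudinal contribution diverges as m → 0.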

2.1.3 Abelian Higgs mechanism for electromagnetism

Associated with the absence of an intrinsic discontinuity in the massless limit is the existence of a Higgs mechanism for the vector field whereby the vector field acquires a mass dynamically. As we shall see later, the situation is different for gravity where no equivalent dynamical Higgs mechanism has been discovered to date. Nevertheless, the tools used to describe the Abelian Higgs mechanism and in particular the introduction of a Stückelberg field will prove useful in the gravitational case as well.

To describe the Abelian Higgs mechanism, we start with a vector field $A_\mu$ with associated Maxwell tensor $F_{\mu\nu}$ and a complex scalar field $\phi$ with quartic potential

\[ \mathcal{L}_{\rm AH} = -\frac{1}{4} F_{\mu\nu}^2 - \frac{1}{2} (\mathcal{D}_\mu \phi)(\mathcal{D}^\mu \phi)^* - \lambda \left( \phi \phi^* - \Phi_0^2 \right)^2. \tag{2.22} \]
The covariant derivative, $\mathcal{D}_\mu = \partial_\mu - i q A_\mu$, ensures the existence of the U(1) symmetry, which in addition to (2.11) shifts the scalar field as
\[ \phi \to \phi\, e^{i q \xi}. \tag{2.23} \]
Splitting the complex scalar field $\phi$ into its norm and phase, $\phi = \varphi e^{i\chi}$, we see that the covariant derivative plays the role of the mass term for the vector field when the scalar field acquires a non-vanishing vacuum expectation value (vev),
\[ \mathcal{L}_{\rm AH} = -\frac{1}{4} F_{\mu\nu}^2 - \frac{1}{2} \varphi^2 (q A_\mu - \partial_\mu \chi)^2 - \frac{1}{2} (\partial_\mu \varphi)^2 - \lambda \left( \varphi^2 - \Phi_0^2 \right)^2. \tag{2.24} \]
The Higgs field $\varphi$ can be made arbitrarily massive by setting λ ≫ 1 in such a way that its dynamics may be neglected and the field can be treated as frozen at $\varphi \equiv \Phi_0 = {\rm const}$. The resulting theory is that of a massive vector field,
\[ \mathcal{L}_{\rm AH} = -\frac{1}{4} F_{\mu\nu}^2 - \frac{1}{2} \Phi_0^2 (q A_\mu - \partial_\mu \chi)^2, \tag{2.25} \]
where the phase $\chi$ of the complex scalar field plays the role of a Stückelberg field which restores the U(1) gauge symmetry in the massive case,
\begin{align}
A_\mu &\to A_\mu + \partial_\mu \xi(x) \tag{2.26} \\
\chi &\to \chi + q\, \xi(x). \tag{2.27}
\end{align}
In this formalism, the U(1) gauge symmetry is restored at the price of explicitly introducing a Stückelberg field which transforms in such a way as to make the mass term invariant. The symmetry ensures that the vector field $A_\mu$ propagates only 2 dofs, while the Stückelberg field $\chi$ propagates the third dof. While no equivalent to the Higgs mechanism exists for gravity, the same Stückelberg trick to restore the symmetry can be used in that case. Since in that context the broken symmetry is coordinate transformation invariance (full diffeomorphism invariance or covariance), four Stückelberg fields should in principle be included in the context of massive gravity, as we shall see below.

2.1.4 Interacting spin-1 fields

Now that we have introduced the notion of a massless and a massive spin-1 field, let us look at N interacting spin-1 fields. We start with N free and massless gauge fields, $A^{(a)}_\mu$, with $a = 1, \cdots, N$, and respective Maxwell tensors $F^{(a)}_{\mu\nu} = \partial_\mu A^{(a)}_\nu - \partial_\nu A^{(a)}_\mu$,

\[ \mathcal{L}^{N\,{\rm spin-1}}_{\rm kin} = -\frac{1}{4} \sum_{a=1}^{N} \left( F^{(a)}_{\mu\nu} \right)^2. \tag{2.28} \]
The theory is then manifestly Abelian and invariant under N copies of U(1) (i.e., the symmetry group is $U(1)^N$, which is Abelian, as opposed to U(N), which would correspond to a Yang–Mills theory and would not be Abelian).

However, in addition to these N gauge invariances, the kinetic term is invariant under global rotations in field space,

\[ A^{(a)}_\mu \longrightarrow \tilde A^{(a)}_\mu = O^a{}_b A^{(b)}_\mu, \tag{2.29} \]
where $O^a{}_b$ is a (global) rotation matrix. Now let us consider some interactions between these different fields. At the linear level (quadratic level in the action), the most general set of interactions is
\[ \mathcal{L}_{\rm int} = -\frac{1}{2} \sum_{a,b} \mathcal{I}_{ab} A^{(a)}_\mu A^{(b)}_\nu \eta^{\mu\nu}, \tag{2.30} \]
where $\mathcal{I}_{ab}$ is an arbitrary symmetric matrix with constant coefficients. For an arbitrary rank-N matrix, all N copies of U(1) are broken, and the theory then propagates N additional helicity-0 modes, for a total of 3N independent polarizations in four spacetime dimensions. However, if the rank r of $\mathcal{I}$ is r < N, i.e., if some of the eigenvalues of $\mathcal{I}$ vanish, then there are N − r special directions in field space which receive no interactions, and the theory thus keeps N − r independent copies of U(1). The theory then propagates r massive spin-1 fields and N − r massless spin-1 fields, for a total of 3r + 2(N − r) = 2N + r independent polarizations in four dimensions.

We can see this statement more explicitly in the case of N spin-1 fields by diagonalizing the mass matrix $\mathcal{I}$. As mentioned previously, the kinetic term is invariant under field space rotations, (2.29), so one can use this freedom to work in a field representation where the mass matrix $\mathcal{I}$ is diagonal,

\[ \mathcal{I}_{ab} = {\rm diag}\left( m_1^2, \cdots, m_N^2 \right). \tag{2.31} \]
In this representation the gauge fields are the mass eigenstates and the mass spectrum is simply given by the eigenvalues of ℐab.
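The counting of polarizations from the rank of the mass matrix can be sketched as follows, with three polarizations for each massive spin-1 field and two for each massless one in four dimensions. The helper names `rank` and `count_polarizations` and the sample matrices are illustrative assumptions; the rank is computed by plain Gaussian elimination over the rationals.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def count_polarizations(mass_matrix):
    """Massive/massless field content and total polarizations in four dimensions."""
    n = len(mass_matrix)
    massive = rank(mass_matrix)  # non-vanishing eigenvalues of a symmetric matrix
    massless = n - massive       # unbroken copies of U(1)
    return {"massive": massive, "massless": massless,
            "total": 3 * massive + 2 * massless}

# Three fields, only one combination of which receives a mass.
print(count_polarizations([[1, 1, 0], [1, 1, 0], [0, 0, 0]]))
```

For a symmetric matrix the rank equals the number of non-zero eigenvalues, so the function needs no explicit diagonalization; the sample matrix has rank 1 and therefore describes one massive and two massless spin-1 fields.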

2.2 Spin-2 field

As we have seen in the case of a vector field, as long as it is local and Lorentz-invariant, the kinetic term is uniquely fixed by the requirement that no ghost be present. Moving now to a spin-2 field, the same argument applies exactly and the Einstein–Hilbert term appears naturally as the unique kinetic term free of any ghost-like instability. This is possible thanks to a symmetry which projects out all unwanted dofs, namely diffeomorphism invariance (linear diffs at the linearized level, and non-linear diffs/general covariance at the non-linear level).

2.2.1 Einstein–Hilbert kinetic term

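We consider a symmetric Lorentz tensor field $h_{\mu\nu}$. The kinetic term can be decomposed into four possible local contributions (assuming Lorentz invariance and ignoring terms which are equivalent upon integration by parts):

\[ \mathcal{L}^{\rm spin-2}_{\rm kin} = \frac{1}{2} \partial^\alpha h^{\mu\nu} \left( b_1 \partial_\alpha h_{\mu\nu} + 2 b_2 \partial_{(\mu} h_{\nu)\alpha} + b_3 \partial_\alpha h\, \eta_{\mu\nu} + 2 b_4 \partial_{(\mu} h\, \eta_{\nu)\alpha} \right), \tag{2.32} \]
where $b_{1,2,3,4}$ are dimensionless coefficients which are to be determined in the same way as for the vector field. We split the 10 components of the symmetric tensor field $h_{\mu\nu}$ into a transverse tensor $h^T_{\mu\nu}$ (which carries 6 components) and a vector field $\chi_\mu$ (which carries 4 components),
\[ h_{\mu\nu} = h^T_{\mu\nu} + 2 \partial_{(\mu} \chi_{\nu)}. \tag{2.33} \]
Just as in the case of the spin-1 field, an arbitrary kinetic term of the form (2.32) with untuned coefficients $b_i$ would contain higher derivatives for $\chi_\mu$ which in turn would imply a ghost. As we shall see below, avoiding a ghost within the kinetic term automatically leads to gauge-invariance. After substitution of $h_{\mu\nu}$ in terms of $h^T_{\mu\nu}$ and $\chi_\mu$, the potentially dangerous parts are
\begin{align}
\mathcal{L}^{\rm spin-2}_{\rm kin} \supset {} & (b_1 + b_2)\, \chi^\mu \Box^2 \chi_\mu + (b_1 + 3 b_2 + 2 b_3 + 4 b_4)\, \chi^\mu \Box \partial_\mu \partial_\nu \chi^\nu \tag{2.34} \\
& - 2 h^{T\,\mu\nu} \left( (b_2 + b_4) \partial_\mu \partial_\nu \partial_\alpha \chi^\alpha + (b_1 + b_2) \partial_\mu \Box \chi_\nu + (b_3 + b_4) \Box \partial_\alpha \chi^\alpha \eta_{\mu\nu} \right). \notag
\end{align}
Preventing these higher derivative terms from arising sets
\[ b_4 = -b_3 = -b_2 = b_1, \tag{2.35} \]
or in other words, the unique (local and Lorentz-invariant) kinetic term one can write for a spin-2 field is the Einstein–Hilbert term
\[ \mathcal{L}^{\rm spin-2}_{\rm kin} = -\frac{1}{4} h^{\mu\nu} \hat{\mathcal{E}}^{\alpha\beta}_{\mu\nu} h_{\alpha\beta} = -\frac{1}{4} h^{T\,\mu\nu} \hat{\mathcal{E}}^{\alpha\beta}_{\mu\nu} h^T_{\alpha\beta}, \tag{2.36} \]
where $\hat{\mathcal{E}}$ is the Lichnerowicz operator
\[ \hat{\mathcal{E}}^{\alpha\beta}_{\mu\nu} h_{\alpha\beta} = -\frac{1}{2} \left( \Box h_{\mu\nu} - 2 \partial_{(\mu} \partial_\alpha h^\alpha_{\nu)} + \partial_\mu \partial_\nu h - \eta_{\mu\nu} \left( \Box h - \partial_\alpha \partial_\beta h^{\alpha\beta} \right) \right), \tag{2.37} \]
and we have set $b_1 = -1/4$ to follow standard conventions. As a result, the kinetic term for the tensor field $h_{\mu\nu}$ is invariant under the following gauge transformation,
\[ h_{\mu\nu} \to h_{\mu\nu} + \partial_{(\mu} \xi_{\nu)}. \tag{2.38} \]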
We emphasize that the form of the kinetic term and its gauge invariance are independent of whether or not the tensor field has a mass (as long as we restrict ourselves to a local and Lorentz-invariant kinetic term). However, just as in the case of a massive vector field, this gauge invariance cannot be maintained by a mass term or any other self-interacting potential, so this symmetry remains exact only in the massless case. Out of the 10 components of a tensor field, the gauge symmetry removes 2 × 4 = 8 of them, leaving a massless tensor field with only two propagating dofs, as is well known from the propagation of gravitational waves in four dimensions.
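The gauge invariance of the Lichnerowicz operator (2.37) under (2.38) can be checked in Fourier space, where $\partial_\mu \to i p_\mu$. The following is a minimal sketch assuming mostly-plus signature; the function names and sample vectors are illustrative, and overall factors of i are dropped since they do not affect whether the result vanishes.

```python
from fractions import Fraction

ETA = (-1, 1, 1, 1)  # diagonal of the mostly-plus Minkowski metric
IDX = range(4)

def lower(v):
    """Lower the index of a four-vector with the flat metric."""
    return [e * x for e, x in zip(ETA, v)]

def lichnerowicz(h, p):
    """Fourier image of (2.37) acting on h_{mu nu} (lower indices), d_mu -> i p_mu."""
    p_low = lower(p)
    p2 = sum(p[a] * p_low[a] for a in IDX)
    trace = sum(ETA[a] * h[a][a] for a in IDX)            # h = eta^{mu nu} h_{mu nu}
    ph = [sum(p[a] * h[a][n] for a in IDX) for n in IDX]  # p^alpha h_{alpha nu}
    pph = sum(p[n] * ph[n] for n in IDX)                  # p^alpha p^beta h_{alpha beta}
    out = []
    for m in IDX:
        row = []
        for n in IDX:
            eta_mn = ETA[m] if m == n else 0
            row.append(Fraction(1, 2) * (
                p2 * h[m][n] - p_low[m] * ph[n] - p_low[n] * ph[m]
                + p_low[m] * p_low[n] * trace - eta_mn * (p2 * trace - pph)))
        out.append(row)
    return out

def pure_gauge(p, xi):
    """h_{mu nu} = p_mu xi_nu + p_nu xi_mu, the Fourier image of a gauge shift."""
    p_low, xi_low = lower(p), lower(xi)
    return [[p_low[m] * xi_low[n] + p_low[n] * xi_low[m] for n in IDX] for m in IDX]

p, xi = (1, 2, 0, 0), (3, -1, 4, 2)  # sample momentum and gauge parameter
result = lichnerowicz(pure_gauge(p, xi), p)
assert all(entry == 0 for row in result for entry in row)
```

A generic (non-gauge) perturbation is not annihilated: for instance, a configuration with only $h_{00} = 1$ and the same momentum gives a non-zero (2,2) component.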

In d ≥ 3 spacetime dimensions, gravitational waves have d(d + 1)∕2 − 2d = d(d − 3)∕2 independent polarizations. This means that in three dimensions there are no gravitational waves and in five dimensions they have five independent polarizations.
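The counting formula can be tabulated directly. In the sketch below the massless count is the one quoted above; the massive count $(d + 1)(d − 2)/2$ is the standard number of components of a symmetric traceless tensor of the little group SO(d − 1), added here as an assumption for comparison since it is not stated at this point in the text. Function names are illustrative.

```python
def massless_graviton_polarizations(d):
    """d(d+1)/2 components minus 2d removed by gauge symmetry: d(d-3)/2."""
    if d < 3:
        raise ValueError("the counting is quoted for d >= 3")
    return d * (d - 3) // 2

def massive_graviton_polarizations(d):
    """(d+1)(d-2)/2, assumed standard counting for a massive spin-2 field."""
    if d < 3:
        raise ValueError("the counting is quoted for d >= 3")
    return (d + 1) * (d - 2) // 2

for d in (3, 4, 5):
    print(d, massless_graviton_polarizations(d), massive_graviton_polarizations(d))
```

In four dimensions this gives 2 massless and 5 massive polarizations, and in three dimensions 0 and 2.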

2.2.2 Fierz–Pauli mass term

As seen in Section 2.2.1, for a local and Lorentz-invariant theory, the linearized kinetic term is uniquely fixed by the requirement that longitudinal modes propagate no ghost, which in turn prevents that operator from exciting these modes altogether. Just as in the case of a massive spin-1 field, we shall see in what follows that the longitudinal modes can nevertheless be excited when including a mass term. In what follows we restrict ourselves to linear considerations and spare any non-linearity discussions for Parts I and II. See also [327] for an analysis of the linearized Fierz–Pauli theory using Bardeen variables.

In the case of a spin-2 field $h_{\mu\nu}$, we are a priori free to choose between two possible mass terms $h_{\mu\nu}^2$ and $h^2$, so that the generic mass term can be written as a combination of both,

\[ \mathcal{L}_{\rm mass} = -\frac{1}{8} m^2 \left( h_{\mu\nu}^2 - A h^2 \right), \tag{2.39} \]
where A is a dimensionless parameter. Just as in the case of the kinetic term, the stability of the theory constrains very strongly the phase space and we shall see that only for A = 1 is the theory stable at that order. The presence of this mass term breaks diffeomorphism invariance. Restoring it requires the introduction of four Stückelberg fields $\chi_\mu$ which transform under linear diffeomorphisms in such a way as to make the mass term invariant, just as in the Abelian-Higgs mechanism for electromagnetism [174]. Including the four linearized Stückelberg fields, the resulting mass term
\[ \mathcal{L}_{\rm mass} = -\frac{1}{8} m^2 \left( (h_{\mu\nu} + 2 \partial_{(\mu} \chi_{\nu)})^2 - A (h + 2 \partial_\alpha \chi^\alpha)^2 \right), \tag{2.40} \]
is invariant under the simultaneous transformations:
\begin{align}
h_{\mu\nu} &\to h_{\mu\nu} + \partial_{(\mu} \xi_{\nu)}, \tag{2.41} \\
\chi_\mu &\to \chi_\mu - \frac{1}{2} \xi_\mu. \tag{2.42}
\end{align}
This mass term then provides a kinetic term for the Stückelberg fields
\[ \mathcal{L}^{\chi}_{\rm kin} = -\frac{1}{2} m^2 \left( (\partial_\mu \chi_\nu)^2 - A (\partial_\alpha \chi^\alpha)^2 \right), \tag{2.43} \]
which is precisely of the same form as the kinetic term considered for a spin-1 field (2.1) in Section 2.1.1 with $a_3 = 0$ and $a_2 = -A a_1$. Now the same logic as in Section 2.1.1 applies and, singling out the longitudinal component of these Stückelberg fields, it follows that the only combination which does not involve higher derivatives is $a_2 = -a_1$, or in other words A = 1. As a result, the only possible mass term one can consider which is free from an Ostrogradsky instability is the Fierz–Pauli mass term
\[ \mathcal{L}_{\rm FP\,mass} = -\frac{1}{8} m^2 \left( (h_{\mu\nu} + 2 \partial_{(\mu} \chi_{\nu)})^2 - (h + 2 \partial_\alpha \chi^\alpha)^2 \right). \tag{2.44} \]
In unitary gauge, i.e., in the gauge where the Stückelberg fields $\chi^a$ are set to zero, the Fierz–Pauli mass term simply reduces to
\[ \mathcal{L}_{\rm FP\,mass} = -\frac{1}{8} m^2 \left( h_{\mu\nu}^2 - h^2 \right), \tag{2.45} \]
where once again the indices are raised and lowered with respect to the Minkowski metric.
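The Fierz–Pauli tuning can be made concrete by evaluating the bracket of (2.43) on a purely longitudinal Stückelberg field $\chi_\mu = \partial_\mu \pi$ in Fourier space. The sketch below assumes mostly-plus signature and drops overall factors of i and of the field π; the function name and sample momentum are illustrative.

```python
ETA = (-1, 1, 1, 1)  # diagonal of the mostly-plus Minkowski metric
IDX = range(4)

def helicity0_coefficient(p, A):
    """(d_mu chi_nu)^2 - A (d_alpha chi^alpha)^2 for chi_mu = d_mu pi in Fourier space.

    A non-zero value multiplies a four-derivative term for pi, i.e. an
    Ostrogradsky-type higher-derivative kinetic term.
    """
    p_low = [e * x for e, x in zip(ETA, p)]
    grad = [[p_low[m] * p_low[n] for n in IDX] for m in IDX]  # d_mu chi_nu -> p_mu p_nu
    squared = sum(ETA[m] * ETA[n] * grad[m][n] ** 2 for m in IDX for n in IDX)
    divergence = sum(ETA[m] * grad[m][m] for m in IDX)        # d_alpha chi^alpha -> p^2
    return squared - A * divergence ** 2

p = (1, 2, 0, 0)  # sample momentum with p^2 = 3
for A in (0, 1, 2):
    print(A, helicity0_coefficient(p, A))
```

The result is $(1 − A)(p^2)^2$, which vanishes only for A = 1, in line with the Fierz–Pauli choice.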

Propagating degrees of freedom

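To identify the propagating degrees of freedom we may split $\chi^a$ further into a transverse and a longitudinal mode,

\[ \chi^a = \frac{1}{m} A^a + \frac{1}{m^2} \eta^{ab} \partial_b \pi, \tag{2.46} \]
(where the normalization with negative factors of m has been introduced for further convenience).

In terms of $h_{\mu\nu}$ and the Stückelberg fields $A_\mu$ and $\pi$ the linearized Fierz–Pauli action is

\begin{align}
\mathcal{L}_{\rm FP} = {} & -\frac{1}{4} h^{\mu\nu} \hat{\mathcal{E}}^{\alpha\beta}_{\mu\nu} h_{\alpha\beta} - \frac{1}{2} h^{\mu\nu} \left( \Pi_{\mu\nu} - [\Pi] \eta_{\mu\nu} \right) - \frac{1}{8} F_{\mu\nu}^2 \tag{2.47} \\
& - \frac{1}{8} m^2 \left( h_{\mu\nu}^2 - h^2 \right) - \frac{1}{2} m \left( h^{\mu\nu} - h \eta^{\mu\nu} \right) \partial_{(\mu} A_{\nu)}, \notag
\end{align}
with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ and $\Pi_{\mu\nu} = \partial_\mu \partial_\nu \pi$, and all the indices are raised and lowered with respect to the Minkowski metric.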

Terms on the first line represent the kinetic terms for the different fields while the second line represents the mass terms and mixing.

We see that the kinetic term for the field $\pi$ is hidden in the mixing with $h_{\mu\nu}$. To make the field content explicit, we may diagonalize this mixing by shifting $h_{\mu\nu} = \tilde h_{\mu\nu} + \pi \eta_{\mu\nu}$ and the linearized Fierz–Pauli action is

\begin{align}
\mathcal{L}_{\rm FP} = {} & -\frac{1}{4} \tilde h^{\mu\nu} \hat{\mathcal{E}}^{\alpha\beta}_{\mu\nu} \tilde h_{\alpha\beta} - \frac{3}{4} (\partial \pi)^2 - \frac{1}{8} F_{\mu\nu}^2 \tag{2.48} \\
& - \frac{1}{8} m^2 \left( \tilde h_{\mu\nu}^2 - \tilde h^2 \right) + \frac{3}{2} m^2 \pi^2 + \frac{3}{2} m^2 \pi \tilde h \notag \\
& - \frac{1}{2} m \left( \tilde h^{\mu\nu} - \tilde h \eta^{\mu\nu} \right) \partial_{(\mu} A_{\nu)} + 3 m \pi\, \partial_\alpha A^\alpha. \notag
\end{align}
This decomposition allows us to identify the different degrees of freedom present in massive gravity (at least at the linear level): $h_{\mu\nu}$ represents the helicity-2 mode as already present in GR and propagates 2 dofs, $A_\mu$ represents the helicity-1 mode and propagates 2 dofs, and finally $\pi$ represents the helicity-0 mode and propagates 1 dof, leading to a total of five dofs as is to be expected for a massive spin-2 field in four dimensions.

The degrees of freedom have not yet been split into their mass eigenstates, but on doing so one can easily check that all the degrees of freedom have the same positive mass square $m^2$.

Most of the phenomenology and theoretical consistency of massive gravity is related to the dynamics of the helicity-0 mode. The coupling to matter occurs via the coupling $h_{\mu\nu} T^{\mu\nu} = \tilde h_{\mu\nu} T^{\mu\nu} + \pi T$, where T is the trace of the external stress-energy tensor. We see that the helicity-0 mode couples directly to conserved sources (unlike in the case of the Proca field) but the helicity-1 mode does not. In most of what follows we will thus be able to ignore the helicity-1 mode.

Higgs mechanism for gravity

As we shall see in Section 9.1, the graviton mass can also be promoted to a scalar function of one or many other fields (for instance of a different scalar field), $m = m(\psi)$. We can thus wonder whether a dynamical Higgs mechanism for gravity can be considered where the field(s) $\psi$ start in a phase for which the graviton mass vanishes, $m(\psi) = 0$, and dynamically evolve to acquire a non-vanishing vev for which $m(\psi) \neq 0$. Following the same logic as the Abelian Higgs mechanism for electromagnetism, this strategy can only work if the number of dofs in the massless phase m = 0 is the same as that in the massive case m ≠ 0. Simply promoting the mass to a function of an external field is thus not sufficient since the graviton helicity-0 and -1 modes would otherwise be infinitely strongly coupled as m → 0.

To date no candidate has been proposed for which the graviton mass could dynamically evolve from a vanishing value to a finite one without falling into such strong coupling issues. This does not imply that a Higgs mechanism for gravity does not exist, only that none has been found as yet. For instance on AdS, there could be a Higgs mechanism as proposed in [431], where the mass term comes from integrating out some conformal fields with slightly unusual (but not unphysical) ‘transparent’ boundary conditions. This mechanism is specific to AdS and to the existence of a time-like boundary and would not apply on Minkowski or dS.

2.2.3 Van Dam–Veltman–Zakharov discontinuity

As in the case of spin-1, the massive spin-2 field propagates more dofs than the massless one. Nevertheless, these new excitations bear no observational signatures for the spin-1 field when considering an arbitrarily small mass, as seen in Section 2.1.2. The main reason for that is that the helicity-0 polarization of the photon couples only to the divergence of external sources, which vanishes for conserved sources. As a result no external sources directly excite the helicity-0 mode of a massive spin-1 field. For the spin-2 field, on the other hand, the situation is different as the helicity-0 mode can now couple to the trace of the stress-energy tensor and so generic sources will excite not only the 2 helicity-2 polarizations of the graviton but also a third helicity-0 polarization, which could in principle have dramatic consequences. To see this more explicitly, let us compute the gravitational exchange amplitude between two sources $T^{\mu\nu}$ and $T'^{\mu\nu}$ in both the massive and massless gravitational cases.

In the massless case, the theory is diffeomorphism invariant. When considering coupling to external sources, of the form $h_{\mu\nu} T^{\mu\nu}$, we thus need to ensure that the symmetry be preserved, which implies that the stress-energy tensor $T^{\mu\nu}$ should be conserved, $\partial_\mu T^{\mu\nu} = 0$. When computing the gravitational exchange amplitude between two sources we thus restrict ourselves to conserved ones. In the massive case, there is a priori no reason to restrict ourselves to conserved sources, so long as their divergences vanish in the massless limit m → 0.

Massive spin-2 field

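Let us start with the massive case, and consider the response to a conserved external source $T^{\mu\nu}$,

\[ \mathcal{L} = -\frac{1}{4} h^{\mu\nu} \hat{\mathcal{E}}^{\alpha\beta}_{\mu\nu} h_{\alpha\beta} - \frac{m^2}{8} \left( h_{\mu\nu}^2 - h^2 \right) + \frac{1}{2 M_{\rm Pl}} h_{\mu\nu} T^{\mu\nu}. \tag{2.49} \]
The linearized Einstein equation is then
\[ \hat{\mathcal{E}}^{\alpha\beta}_{\mu\nu} h_{\alpha\beta} + \frac{1}{2} m^2 \left( h_{\mu\nu} - h \eta_{\mu\nu} \right) = \frac{1}{M_{\rm Pl}} T_{\mu\nu}. \tag{2.50} \]
To solve this modified linearized Einstein equation for $h_{\mu\nu}$ we consider the trace and the divergence separately,
\begin{align}
h &= -\frac{1}{3 m^2 M_{\rm Pl}} \left( T + \frac{2}{m^2} \partial_\alpha \partial_\beta T^{\alpha\beta} \right) \tag{2.51} \\
\partial_\mu h^\mu_\nu &= \frac{1}{m^2 M_{\rm Pl}} \left( \partial_\mu T^\mu_\nu + \frac{1}{3} \partial_\nu T + \frac{2}{3 m^2} \partial_\nu \partial_\alpha \partial_\beta T^{\alpha\beta} \right). \tag{2.52}
\end{align}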
As is already apparent at this level, the massless limit m → 0 is not smooth, which is at the origin of the vDVZ discontinuity (for instance, we see immediately that for a conserved source the linearized Ricci scalar vanishes, $\partial_\mu \partial_\nu h^{\mu\nu} - \Box h = 0$; see Refs. [465, 497]. This linearized vDVZ discontinuity was recently pointed out again in [193]). As has been known for many decades, this discontinuity (or the fact that the Ricci scalar vanishes) is an artefact of the linearized theory and is resolved by the Vainshtein mechanism [463], as we shall see later.

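Plugging these expressions back into the modified Einstein equation, we get

\begin{align}
\left( \Box - m^2 \right) h_{\mu\nu} &= -\frac{1}{M_{\rm Pl}} \Big[ T_{\mu\nu} - \frac{1}{3} T \eta_{\mu\nu} - \frac{2}{m^2} \partial_{(\mu} \partial_\alpha T^\alpha_{\nu)} + \frac{1}{3 m^2} \partial_\mu \partial_\nu T \tag{2.53} \\
&\qquad\qquad + \frac{1}{3 m^2} \partial_\alpha \partial_\beta T^{\alpha\beta} \eta_{\mu\nu} + \frac{2}{3 m^4} \partial_\mu \partial_\nu \partial_\alpha \partial_\beta T^{\alpha\beta} \Big] \notag \\
&= -\frac{1}{M_{\rm Pl}} \left[ \tilde\eta_{\mu(\alpha} \tilde\eta_{\nu\beta)} - \frac{1}{3} \tilde\eta_{\mu\nu} \tilde\eta_{\alpha\beta} \right] T^{\alpha\beta}, \tag{2.54}
\end{align}
where
\[ \tilde\eta_{\mu\nu} = \eta_{\mu\nu} - \frac{1}{m^2} \partial_\mu \partial_\nu. \tag{2.55} \]
The propagator for a massive spin-2 field is thus given by
\[ G^{\rm massive}_{\mu\nu\alpha\beta}(x, x') = \frac{f^{\rm massive}_{\mu\nu\alpha\beta}}{\Box - m^2}, \tag{2.56} \]
where $f^{\rm massive}_{\mu\nu\alpha\beta}$ is the polarization tensor,
\[ f^{\rm massive}_{\mu\nu\alpha\beta} = \tilde\eta_{\mu(\alpha} \tilde\eta_{\nu\beta)} - \frac{1}{3} \tilde\eta_{\mu\nu} \tilde\eta_{\alpha\beta}. \tag{2.57} \]
In Fourier space we have
\begin{align}
f^{\rm massive}_{\mu\nu\alpha\beta}(p_\mu, m) = {} & \frac{2}{3 m^4} p_\mu p_\nu p_\alpha p_\beta + \eta_{\mu(\alpha} \eta_{\nu\beta)} - \frac{1}{3} \eta_{\mu\nu} \eta_{\alpha\beta} \tag{2.58} \\
& + \frac{1}{m^2} \left( p_\alpha p_{(\mu} \eta_{\nu)\beta} + p_\beta p_{(\mu} \eta_{\nu)\alpha} - \frac{1}{3} p_\mu p_\nu \eta_{\alpha\beta} - \frac{1}{3} p_\alpha p_\beta \eta_{\mu\nu} \right). \notag
\end{align}
The amplitude exchanged between two sources $T_{\mu\nu}$ and $T'_{\mu\nu}$ via a massive spin-2 field is thus given by
\[ \mathcal{A}^{\rm massive}_{TT'} = \int d^4x\, h_{\mu\nu} T'^{\mu\nu} = \int d^4x\, T'^{\mu\nu} \frac{f^{\rm massive}_{\mu\nu\alpha\beta}}{\Box - m^2} T^{\alpha\beta}. \tag{2.59} \]
As mentioned previously, to compare this result with the massless case, the sources ought to be conserved in the massless limit, $\partial_\mu T^\mu_\nu, \partial_\mu T'^\mu_\nu \to 0$ as m → 0. The gravitational exchange amplitude in the massless limit is thus given by
\[ \mathcal{A}^{m \to 0}_{TT'} = \int d^4x\, T'^{\mu\nu} \frac{1}{\Box} \left( T_{\mu\nu} - \frac{1}{3} T \eta_{\mu\nu} \right). \tag{2.60} \]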
We now compare this result with the amplitude exchanged by a purely massless graviton.
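As a numerical cross-check, the polarization tensor (2.57) can be contracted with two conserved sources in Fourier space, where $\tilde\eta_{\mu\nu} = \eta_{\mu\nu} + p_\mu p_\nu / m^2$. The sketch below assumes mostly-plus signature; the sources are built as $T^{\mu\nu} = u^\mu u^\nu$ with $p \cdot u = 0$, which is merely a convenient illustrative choice, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ETA = (-1, 1, 1, 1)  # diagonal of the mostly-plus Minkowski metric
IDX = range(4)

def eta_tilde(p_low, m2):
    """Fourier image of (2.55): eta_{mu nu} + p_mu p_nu / m^2."""
    return [[(ETA[a] if a == b else 0) + Fraction(p_low[a] * p_low[b]) / m2
             for b in IDX] for a in IDX]

def massive_contraction(t_prime, t, p_low, m2):
    """T'^{mu nu} f_{mu nu alpha beta} T^{alpha beta} with f taken from (2.57)."""
    et = eta_tilde(p_low, m2)
    total = Fraction(0)
    for m, n, a, b in product(IDX, repeat=4):
        sym = Fraction(1, 2) * (et[m][a] * et[n][b] + et[m][b] * et[n][a])
        f = sym - Fraction(1, 3) * et[m][n] * et[a][b]
        total += t_prime[m][n] * f * t[a][b]
    return total

def flat_structure(t_prime, t, coeff):
    """T'^{mu nu} T_{mu nu} - coeff * T' T, with traces taken with the flat metric."""
    full = sum(ETA[m] * ETA[n] * t_prime[m][n] * t[m][n] for m, n in product(IDX, repeat=2))
    trace = lambda x: sum(ETA[m] * x[m][m] for m in IDX)
    return full - coeff * trace(t_prime) * trace(t)

def outer(u):
    """Source T^{mu nu} = u^mu u^nu, conserved whenever p.u = 0."""
    return [[u[m] * u[n] for n in IDX] for m in IDX]

p_low = (-1, 2, 0, 0)                                 # p_mu for p^mu = (1, 2, 0, 0)
T, T_prime = outer((2, 1, 0, 0)), outer((2, 1, 1, 0))  # both satisfy p_mu T^{mu nu} = 0

for m2 in (Fraction(1), Fraction(1, 10**6)):
    assert massive_contraction(T_prime, T, p_low, m2) == flat_structure(T_prime, T, Fraction(1, 3))
```

For conserved sources every $p_\mu p_\nu / m^2$ term drops out, so the contraction is independent of m and carries the factor 1/3 seen in (2.60), rather than the factor 1/2 of the massless polarization tensor.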

Massless spin-2 field

In the massless case, the equation of motion (2.50) reduces to the linearized Einstein equation

\[ \hat{\mathcal{E}}^{\alpha\beta}_{\mu\nu} h_{\alpha\beta} = \frac{1}{M_{\rm Pl}} T_{\mu\nu}, \tag{2.61} \]
where diffeomorphism invariance requires the stress-energy tensor to be conserved, $\partial_\mu T^\mu_\nu = 0$. In this case the transverse part of this equation is trivially satisfied (as a consequence of the Bianchi identity which follows from symmetry). Since the theory is invariant under diffeomorphism transformations (2.38), one can choose any convenient gauge, for instance de Donder (or harmonic) gauge
\[ \partial_\mu h^\mu_\nu = \frac{1}{2} \partial_\nu h. \tag{2.62} \]
In de Donder gauge, the Einstein equation then reduces to
\[ \Box h_{\mu\nu} = -\frac{2}{M_{\rm Pl}} \left( T_{\mu\nu} - \frac{1}{2} T \eta_{\mu\nu} \right). \tag{2.63} \]
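The propagator for a massless spin-2 field is thus given by
\[ G^{\rm massless}_{\mu\nu\alpha\beta} = \frac{f^{\rm massless}_{\mu\nu\alpha\beta}}{\Box}, \tag{2.64} \]
where $f^{\rm massless}_{\mu\nu\alpha\beta}$ is the polarization tensor,
\[ f^{\rm massless}_{\mu\nu\alpha\beta} = \eta_{\mu(\alpha} \eta_{\nu\beta)} - \frac{1}{2} \eta_{\mu\nu} \eta_{\alpha\beta}. \tag{2.65} \]
The amplitude exchanged between two sources $T_{\mu\nu}$ and $T'_{\mu\nu}$ via a genuinely massless spin-2 field is thus given by
\[ \mathcal{A}^{\rm massless}_{TT'} = -\frac{2}{M_{\rm Pl}} \int d^4x\, T'^{\mu\nu} \frac{1}{\Box} \left( T_{\mu\nu} - \frac{1}{2} T \eta_{\mu\nu} \right), \tag{2.66} \]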
and differs from the result (2.60) in the small mass limit. This difference between the massless limit of the massive propagator and the massless propagator (and gravitational exchange amplitude) is a well-known fact and was first pointed out by van Dam, Veltman and Zakharov in 1970 [465, 497]. The resolution to this ‘problem’ lies within the Vainshtein mechanism [463]. In 1972, Vainshtein showed that a theory of massive gravity becomes strongly coupled at a low energy scale when the graviton mass is small. As a result, the linear theory is no longer appropriate to describe the theory in the limit of small mass and one should keep track of the non-linear interactions (very much as we do when approaching the Schwarzschild radius in GR). We shall see in Section 10.1 how a special set of interactions dominates in the massless limit and is responsible for the screening of the extra degrees of freedom present in massive gravity.
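The size of the discontinuity can be illustrated by comparing only the tensor structures $T'^{\mu\nu} T_{\mu\nu} - c\, T' T$, with c = 1/3 in the massless limit of the massive theory and c = 1/2 in the massless theory. The sketch below ignores overall normalizations, assumes static non-relativistic sources (only $T^{00} = \rho$ non-zero) and a traceless source for light, and works in units where $\rho \rho' = 1$; these simplifications and the variable names are assumptions of the example.

```python
from fractions import Fraction

def tensor_structure(tt, trace_product, coeff):
    """T'^{mu nu} T_{mu nu} - coeff * T' T for given contractions."""
    return tt - coeff * trace_product

MASSIVE, MASSLESS = Fraction(1, 3), Fraction(1, 2)

# Two static sources: T'^{mu nu} T_{mu nu} = rho rho' and T' T = rho rho'.
static_ratio = tensor_structure(1, 1, MASSIVE) / tensor_structure(1, 1, MASSLESS)

# Static source and light: the stress-energy of light is traceless, so T' T = 0.
light_ratio = tensor_structure(1, 0, MASSIVE) / tensor_structure(1, 0, MASSLESS)

# Light bending relative to GR once the Newtonian force between static
# sources has been matched by rescaling the coupling.
bending_relative_to_gr = light_ratio / static_ratio

print(static_ratio, light_ratio, bending_relative_to_gr)  # 4/3 1 3/4
```

At the linear level the force between static sources is thus enhanced by 4/3 relative to the massless theory, while the coupling to light is unchanged; after matching Newton's constant, light bending comes out at 3/4 of the GR value, independently of how small the mass is.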

Another ‘non-GR’ effect was also recently pointed out in Ref. [280] where a linear analysis showed that massive gravity predicts different spin-orientations for spinning objects.

2.3 From linearized diffeomorphism to full diffeomorphism invariance

When considering the massless and non-interacting spin-2 field in Section 2.2.1, the linear gauge invariance (2.38) is exact. However, if this field is to be probed and communicates with the rest of the world, the gauge symmetry is forced to include non-linear terms which in turn forces the kinetic term to become fully non-linear. The result is the well-known fully covariant Einstein–Hilbert term $M_{\rm Pl}^2 \sqrt{-g}\, R$, where R is the scalar curvature associated with the metric $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} / M_{\rm Pl}$.

To see this explicitly, let us start with the linearized theory and couple it to an external source $T_0^{\mu\nu}$, via the coupling

$$ \mathcal{L}^{\rm linear}_{\rm matter} = \frac{1}{2M_{\rm Pl}}\,h_{\mu\nu}T_0^{\mu\nu}\,. \tag{2.67} $$
This coupling preserves diffeomorphism invariance if the source is conserved, $\partial_\mu T_0^{\mu\nu} = 0$. To be more explicit, let us consider a massless scalar field $\varphi$ which satisfies the Klein–Gordon equation $\Box\varphi = 0$. A natural choice for the stress-energy tensor $T^{\mu\nu}$ is then
$$ T_0^{\mu\nu} = \partial^\mu\varphi\,\partial^\nu\varphi - \frac{1}{2}(\partial\varphi)^2\eta^{\mu\nu}\,, \tag{2.68} $$
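so that the Klein–Gordon equation automatically guarantees the conservation of the stress-energy tensor on-shell at the linear level, and hence linearized diffeomorphism invariance. However, the very coupling between the scalar field and the spin-2 field affects the Klein–Gordon equation in such a way that beyond the linear order, the stress-energy tensor given in (2.68*) fails to be conserved. When considering the coupling (2.67*), the Klein–Gordon equation receives corrections of the order of $h_{\mu\nu}/M_{\rm Pl}$,
$$ \Box\varphi = \frac{1}{M_{\rm Pl}}\left(\partial^\alpha\big(h_{\alpha\beta}\,\partial^\beta\varphi\big) - \frac{1}{2}\,\partial_\alpha\big(h^\beta_{\ \beta}\,\partial^\alpha\varphi\big)\right), \tag{2.69} $$
implying a failure of conservation of $T_0^{\mu\nu}$ at the same order,
$$ \partial_\mu T_0^{\mu\nu} = \frac{\partial^\nu\varphi}{M_{\rm Pl}}\left(\partial^\alpha\big(h_{\alpha\beta}\,\partial^\beta\varphi\big) - \frac{1}{2}\,\partial_\alpha\big(h^\beta_{\ \beta}\,\partial^\alpha\varphi\big)\right). \tag{2.70} $$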
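The step from (2.69) to (2.70) can be spelled out as a short worked equation. It uses only the product rule and the constancy of $\eta^{\mu\nu}$; nothing beyond the definitions (2.68) and (2.69) is assumed.

```latex
% Divergence of the scalar-field stress-energy tensor (2.68):
\begin{align}
\partial_\mu T_0^{\mu\nu}
  &= \Box\varphi\,\partial^\nu\varphi
     + \partial^\mu\varphi\,\partial_\mu\partial^\nu\varphi
     - \partial_\alpha\varphi\,\partial^\nu\partial^\alpha\varphi \notag\\
  &= \Box\varphi\,\partial^\nu\varphi\,.
\end{align}
% The last two terms of the first line cancel identically, so conservation
% holds exactly when \Box\varphi = 0. Inserting the corrected field equation
% (2.69) for \Box\varphi instead gives (2.70):
\begin{equation}
\partial_\mu T_0^{\mu\nu}
  = \frac{\partial^\nu\varphi}{M_{\rm Pl}}
    \left(\partial^\alpha\big(h_{\alpha\beta}\,\partial^\beta\varphi\big)
    - \frac{1}{2}\,\partial_\alpha\big(h^\beta_{\ \beta}\,\partial^\alpha\varphi\big)\right).
\end{equation}
```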
The resolution is of course to include non-linear corrections in h∕MPl in the coupling with external matter,
$$ \mathcal{L}_{\rm matter} = \frac{1}{2M_{\rm Pl}}\,h_{\mu\nu}T_0^{\mu\nu} + \frac{1}{2M_{\rm Pl}^2}\,h_{\mu\nu}h_{\alpha\beta}T_1^{\mu\nu\alpha\beta} + \cdots, \tag{2.71} $$
and promote diffeomorphism invariance to a non-linearly realized gauge symmetry, symbolically,
$$ h \to h + \partial\xi + \frac{1}{M_{\rm Pl}}\,\partial(h\,\xi) + \cdots, \tag{2.72} $$
so this gauge invariance is automatically satisfied on-shell order by order in $h/M_{\rm Pl}$, i.e., the scalar field (or general matter field) equations of motion automatically imply the appropriate relation for the stress-energy tensor to all orders in $h/M_{\rm Pl}$. The resulting symmetry is the well-known fully non-linear coordinate transformation invariance (or full diffeomorphism invariance or covariance$^4$), which requires the stress-energy tensor to be covariantly conserved. To satisfy this symmetry, the kinetic term (2.36*) should then be promoted to a fully non-linear contribution,
$$ \mathcal{L}^{\rm spin-2}_{\rm kin\ linear} = -\frac{1}{4}\,h^{\mu\nu}\hat{\mathcal{E}}^{\alpha\beta}_{\mu\nu}h_{\alpha\beta} \;\longrightarrow\; \mathcal{L}^{\rm spin-2}_{\rm kin\ covariant} = \frac{M_{\rm Pl}^2}{2}\sqrt{-g}\,R[g]\,. \tag{2.73} $$
Just as the linearized version $h^{\mu\nu}\hat{\mathcal{E}}^{\alpha\beta}_{\mu\nu}h_{\alpha\beta}$ was unique, the non-linear realization $\sqrt{-g}\,R$ is also unique.$^5$ As a result, any theory of an interacting spin-2 field is necessarily fully non-linear and leads to the theory of gravity where non-linear diffeomorphism invariance (or covariance) plays the role of the local gauge symmetry that projects out four out of the potential six degrees of freedom of the graviton and prevents the excitation of any ghost by the kinetic term.

The situation is very different from that of a spin-1 field as seen earlier, where coupling with other fields can be implemented at the linear order without affecting the $U(1)$ gauge symmetry. The difference is that in the case of a $U(1)$ symmetry, there is a unique nonlinear completion of that symmetry, i.e., the unique nonlinear completion of a $U(1)$ is nothing else but a $U(1)$. Thus any nonlinear Lagrangian which preserves the full $U(1)$ symmetry will be a consistent interacting theory. On the other hand, for spin-2 fields, there are two, and only two, ways to nonlinearly complete linear diffs: one as linear diffs in the full theory and the other as full non-linear diffs. While it is possible to write self-interactions which preserve linear diffs, there are no interactions between matter and $h_{\mu\nu}$ which preserve linear diffs. Thus any theory of gravity must exhibit full nonlinear diffs, and it is in this sense that we are led to GR.

2.4 Non-linear Stückelberg decomposition

On the need for a reference metric

We have introduced the spin-2 field $h_{\mu\nu}$ as the perturbation about flat spacetime. When considering the theory of a field of given spin it is only natural to work with Minkowski as our spacetime metric, since the notion of spin follows from that of Poincaré invariance. Now when extending the theory non-linearly, we may also extend the theory about a different reference metric. When dealing with a reference metric different from Minkowski, one loses the interpretation of the field as massive spin-2, but one can still get a consistent theory. One could also wonder whether it is possible to write a theory of massive gravity without the use of a reference metric at all. This interesting question was investigated in [75*], where it was shown that the only consistent alternative is to consider a function of the metric determinant. However, as shown in [75*], the consistent function of the determinant is the cosmological constant and does not provide a mass for the graviton.

Non-linear Stückelberg

Full diffeomorphism invariance (or covariance) indicates that the theory should be built out of scalar objects constructed out of the metric $g_{\mu\nu}$ and other tensors. However, as explained previously, a theory of massive gravity requires the notion of a reference metric$^6$ $f_{\mu\nu}$ (which may be Minkowski, $f_{\mu\nu} = \eta_{\mu\nu}$), and at the linearized level the mass for gravity was not built out of the full metric $g_{\mu\nu}$, but rather out of the fluctuation $h_{\mu\nu}$ about this reference metric, which does not transform as a tensor under general coordinate transformations. As a result the mass term breaks covariance.

This result is already transparent at the linear level where the mass term (2.39*) breaks linearized diffeomorphism invariance. Nevertheless, that gauge symmetry can always be ‘formally’ restored using the Stückelberg trick, which amounts to replacing the reference metric (so far we have been working with the flat Minkowski metric as the reference) by

$$ \eta_{\mu\nu} \longrightarrow \left(\eta_{\mu\nu} - \frac{2}{M_{\rm Pl}}\,\partial_{(\mu}\chi_{\nu)}\right), \tag{2.74} $$
and transforming $\chi_\mu$ under linearized diffeomorphisms in such a way that the combination $h_{\mu\nu} + 2\,\partial_{(\mu}\chi_{\nu)}$ remains invariant. Now that the symmetry is non-linearly realized and replaced by general covariance, this Stückelberg trick should also be promoted to a fully covariant realization.

Following the same Stückelberg trick non-linearly, one can ‘formally restore’ covariance by including four Stückelberg fields $\phi^a$ ($a = 0,1,2,3$) and promoting the reference metric $f_{\mu\nu}$, which may or may not be Minkowski, to a tensor $\tilde f_{\mu\nu}$ [446*, 27*],

$$ f_{\mu\nu} \longrightarrow \tilde f_{\mu\nu} = \partial_\mu\phi^a\,\partial_\nu\phi^b f_{ab}\,. \tag{2.75} $$
As we can see from this last expression, $\tilde f_{\mu\nu}$ transforms as a tensor under coordinate transformations as long as each of the four fields $\phi^a$ transforms as a scalar. We may now construct the theory of massive gravity as a scalar Lagrangian of the tensors $\tilde f_{\mu\nu}$ and $g_{\mu\nu}$. In unitary gauge, where the Stückelberg fields are $\phi^a = x^a$, we simply recover $\tilde f_{\mu\nu} = f_{\mu\nu}$.

This Stückelberg trick for massive gravity dates back to Green and Thorn [267] and to Siegel [446], where it was introduced within the context of open string theory. In the same way as the massless graviton naturally emerges in the closed string sector, open strings also have spin-2 excitations, but their lowest energy state is massive at tree level (they only become massless once quantum corrections are considered). Thus at the classical level, open strings contain a description of massive excitations of a spin-2 field, where gauge invariance is restored thanks to the same Stückelberg fields as introduced in this section. In open string theory, these Stückelberg fields naturally arise from the ghost coordinates. When constructing the non-linear theory of massive gravity from extra dimensions, we shall see that in that context the Stückelberg fields naturally arise as the shift from the extra dimension.

For later convenience, it will be useful to construct the following tensor quantity,

$$ \mathbb{X}^\mu_{\ \nu} = g^{\mu\alpha}\tilde f_{\alpha\nu} = \partial^\mu\phi^a\,\partial_\nu\phi^b f_{ab}\,; \tag{2.76} $$
in unitary gauge, $\mathbb{X} = g^{-1}f$.

Alternative Stückelberg trick

An alternative way to Stückelberize the reference metric fμν is to express it as

$$ g^{ac}f_{cb} \longrightarrow \mathbb{Y}^a_{\ b} = g^{\mu\nu}\,\partial_\mu\phi^a\,\partial_\nu\phi^c f_{cb}\,. \tag{2.77} $$
As nicely explained in Ref. [14*], both matrices $\mathbb{X}^\mu_{\ \nu}$ and $\mathbb{Y}^a_{\ b}$ have the same eigenvalues, so one can choose either one of them in the definition of the massive gravity Lagrangian without any distinction. The formulation in terms of $\mathbb{Y}$ rather than $\mathbb{X}$ was originally used in Ref. [94], although unsuccessfully, as the potential proposed there exhibits the BD ghost instability (see for instance Ref. [60]).
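The equality of eigenvalues can be checked numerically. Writing $J^a_{\ \mu} = \partial_\mu\phi^a$, definitions (2.76) and (2.77) read $\mathbb{X} = g^{-1}J^{T}fJ$ and $\mathbb{Y} = J g^{-1}J^{T}f$, so the two matrices are the products $CJ$ and $JC$ of the same pair of factors and share the traces of all their powers. The sketch below is an illustration with random rational matrices standing in for $g^{\mu\nu}$, $f_{ab}$ and $\partial_\mu\phi^a$; the helper names are hypothetical and exact arithmetic is used to avoid rounding.

```python
from fractions import Fraction
import random

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def random_matrix(rng, n=4, symmetric=False):
    M = [[Fraction(rng.randint(-3, 3)) for _ in range(n)] for _ in range(n)]
    if symmetric:
        # Mirror the upper triangle so that M[i][j] == M[j][i].
        M = [[M[min(i, j)][max(i, j)] for j in range(n)] for i in range(n)]
    return M

def power_traces(A, nmax=4):
    """[tr A, tr A^2, ..., tr A^nmax]; for a 4x4 matrix these fix the eigenvalues."""
    out, P = [], A
    for _ in range(nmax):
        out.append(trace(P))
        P = matmul(P, A)
    return out

rng = random.Random(0)
g_inv = random_matrix(rng, symmetric=True)  # stands in for g^{mu nu}
f = random_matrix(rng, symmetric=True)      # stands in for f_{ab}
J = random_matrix(rng)                      # J[a][mu] = d_mu phi^a

C = matmul(matmul(g_inv, transpose(J)), f)  # C^mu_b = g^{mu alpha} d_alpha phi^a f_{ab}
X = matmul(C, J)                            # X^mu_nu, as in (2.76)
Y = matmul(J, C)                            # Y^a_b,  as in (2.77)

print(power_traces(X) == power_traces(Y))
```

Since the traces of the first four powers determine the characteristic polynomial of a 4×4 matrix, equal traces imply equal eigenvalues, which is why any potential built from the invariants of $\mathbb{X}$ can equally be written in terms of $\mathbb{Y}$.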

Helicity decomposition

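If we now focus on the flat reference metric, $f_{\mu\nu} = \eta_{\mu\nu}$, further split the Stückelberg fields as $\phi^a = x^a - \frac{1}{M_{\rm Pl}}\chi^a$ and identify the index $a$ with a Lorentz index,$^7$ we obtain the non-linear generalization of the Stückelberg trick used in Section 2.2.2,

\begin{align}
\eta_{\mu\nu} \longrightarrow \tilde f_{\mu\nu} &= \eta_{\mu\nu} - \frac{2}{M_{\rm Pl}}\,\partial_{(\mu}\chi_{\nu)} + \frac{1}{M_{\rm Pl}^2}\,\partial_\mu\chi^a\,\partial_\nu\chi^b\,\eta_{ab} \tag{2.78}\\
&= \eta_{\mu\nu} - \frac{2}{M_{\rm Pl}m}\,\partial_{(\mu}A_{\nu)} - \frac{2}{M_{\rm Pl}m^2}\,\Pi_{\mu\nu} \tag{2.79}\\
&\quad + \frac{1}{M_{\rm Pl}^2m^2}\,\partial_\mu A^\alpha\,\partial_\nu A_\alpha + \frac{2}{M_{\rm Pl}^2m^3}\,\partial_\mu A^\alpha\,\Pi_{\nu\alpha} + \frac{1}{M_{\rm Pl}^2m^4}\,\Pi^2_{\mu\nu}\,, \notag
\end{align}
where in the second equality we have used the split performed in (2.46*) of $\chi^a$ in terms of the helicity-0 and -1 modes, and all indices are raised and lowered with respect to $\eta_{\mu\nu}$.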

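In other words, the fluctuations about flat spacetime are promoted to the tensor $H_{\mu\nu}$,

$$ h_{\mu\nu} = M_{\rm Pl}\left(g_{\mu\nu} - \eta_{\mu\nu}\right) \longrightarrow H_{\mu\nu} = M_{\rm Pl}\left(g_{\mu\nu} - \tilde f_{\mu\nu}\right), \tag{2.80} $$
\begin{align}
H_{\mu\nu} &= h_{\mu\nu} + 2\,\partial_{(\mu}\chi_{\nu)} - \frac{1}{M_{\rm Pl}}\,\eta_{ab}\,\partial_\mu\chi^a\,\partial_\nu\chi^b \tag{2.81}\\
&= h_{\mu\nu} + \frac{2}{m}\,\partial_{(\mu}A_{\nu)} + \frac{2}{m^2}\,\Pi_{\mu\nu} \tag{2.82}\\
&\quad - \frac{1}{M_{\rm Pl}m^2}\,\partial_\mu A^\alpha\,\partial_\nu A_\alpha - \frac{2}{M_{\rm Pl}m^3}\,\partial_\mu A^\alpha\,\Pi_{\nu\alpha} - \frac{1}{M_{\rm Pl}m^4}\,\Pi^2_{\mu\nu}\,. \notag
\end{align}
The fields $\chi^a$ are introduced to restore the gauge invariance (full diffeomorphism invariance). We can now always choose a gauge where $h_{\mu\nu}$ is transverse and traceless at the linearized level and $A_\mu$ is transverse. In this gauge the quantities $h_{\mu\nu}$, $A_\mu$ and $\pi$ represent the helicity decomposition of the metric: $h_{\mu\nu}$ is the helicity-2 part of the graviton, $A_\mu$ the helicity-1 part and $\pi$ the helicity-0 part. The fact that these quantities continue to correctly identify the physical degrees of freedom non-linearly in the limit $M_{\rm Pl} \to \infty$ is non-trivial and has been derived in [143*].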
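As a consistency check of the algebra, one can verify with exact arithmetic that inserting $\phi^a = x^a - \chi^a/M_{\rm Pl}$ into the definition (2.75) reproduces (2.78), and that $H_{\mu\nu} = M_{\rm Pl}(g_{\mu\nu} - \tilde f_{\mu\nu})$ then agrees with (2.81). The sketch below assumes the mostly-plus Minkowski metric and the symmetrization convention $\partial_{(\mu}\chi_{\nu)} = \tfrac12(\partial_\mu\chi_\nu + \partial_\nu\chi_\mu)$; the random integer entries standing in for $\partial_\mu\chi^a$ and $h_{\mu\nu}$ at a point, and the value of $M_{\rm Pl}$, are arbitrary illustrative choices.

```python
from fractions import Fraction
import random

R = range(4)
# Minkowski metric, mostly-plus signature (assumed convention).
ETA = [[Fraction(-1 if i == 0 else 1) if i == j else Fraction(0) for j in R] for i in R]

rng = random.Random(1)
M_PL = Fraction(7)  # arbitrary sample value, not a physical number

# dchi[mu][a] = d_mu chi^a at one spacetime point (arbitrary sample values).
dchi = [[Fraction(rng.randint(-5, 5)) for _ in R] for _ in R]
upper = [[Fraction(rng.randint(-5, 5)) for _ in R] for _ in R]
h = [[upper[min(m, n)][max(m, n)] for n in R] for m in R]  # symmetric h_{mu nu}

# d_mu chi_nu with the Lorentz index lowered by eta, and its symmetrized part.
dchi_low = [[sum(dchi[m][a] * ETA[a][n] for a in R) for n in R] for m in R]
sym = [[(dchi_low[m][n] + dchi_low[n][m]) / 2 for n in R] for m in R]
# Quadratic piece d_mu chi^a d_nu chi^b eta_{ab}.
quad = [[sum(dchi[m][a] * ETA[a][b] * dchi[n][b] for a in R for b in R) for n in R] for m in R]

# Definition (2.75) with f_{ab} = eta_{ab} and phi^a = x^a - chi^a / M_Pl.
dphi = [[(1 if m == a else 0) - dchi[m][a] / M_PL for a in R] for m in R]
f_tilde = [[sum(dphi[m][a] * ETA[a][b] * dphi[n][b] for a in R for b in R) for n in R] for m in R]

# Right-hand side of (2.78).
f_278 = [[ETA[m][n] - 2 * sym[m][n] / M_PL + quad[m][n] / M_PL**2 for n in R] for m in R]

# H from its definition (2.80) versus the expanded form (2.81).
g = [[ETA[m][n] + h[m][n] / M_PL for n in R] for m in R]
H_def = [[M_PL * (g[m][n] - f_tilde[m][n]) for n in R] for m in R]
H_281 = [[h[m][n] + 2 * sym[m][n] - quad[m][n] / M_PL for n in R] for m in R]

print(f_tilde == f_278, H_def == H_281)
```

Both comparisons hold exactly, which also confirms the relative signs between the linear and quadratic Stückelberg terms in (2.78) and (2.81).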

Non-linear Fierz–Pauli

The most straightforward non-linear extension of the Fierz–Pauli mass term is as follows

$$ \mathcal{L}^{(\rm nl1)}_{\rm FP} = -m^2M_{\rm Pl}^2\sqrt{-g}\left([(\mathbb{I}-\mathbb{X})^2] - [\mathbb{I}-\mathbb{X}]^2\right). \tag{2.83} $$
This mass term is then invariant under non-linear coordinate transformations. This non-linear formulation was used for instance in [27]. Alternatively, one may also generalize the Fierz–Pauli mass non-linearly as follows [75*]
$$ \mathcal{L}^{(\rm nl2)}_{\rm FP} = -m^2M_{\rm Pl}^2\sqrt{-g}\,\sqrt{\det\mathbb{X}}\left([(\mathbb{I}-\mathbb{X}^{-1})^2] - [\mathbb{I}-\mathbb{X}^{-1}]^2\right). \tag{2.84} $$
A priori, the linear Fierz–Pauli action for massive gravity can be extended non-linearly in an arbitrary number of ways. However, as we shall see below, most of these generalizations generate a ghost non-linearly, known as the Boulware–Deser (BD) ghost. In Part II, we shall see that the extension of the Fierz–Pauli action to a non-linear theory free of the BD ghost is unique (up to two constant parameters).

2.5 Boulware–Deser ghost

The easiest way to see the appearance of a ghost at the non-linear level is to follow the Stückelberg trick non-linearly and observe the appearance of an Ostrogradsky instability [111*, 173*], although the original formulation was performed in unitary gauge in [75*] in the ADM language (Arnowitt, Deser and Misner, see Ref. [29]). In this section we shall focus on the flat reference metric, $f_{\mu\nu} = \eta_{\mu\nu}$.

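Focusing solely on the helicity-0 mode $\pi$ to start with, the tensor $\mathbb{X}^\mu_{\ \nu}$ defined in (2.76*) is expressed as

$$ \mathbb{X}^\mu_{\ \nu} = \delta^\mu_{\ \nu} - \frac{2}{M_{\rm Pl}m^2}\,\Pi^\mu_{\ \nu} + \frac{1}{M_{\rm Pl}^2m^4}\,\Pi^\mu_{\ \alpha}\Pi^\alpha_{\ \nu}\,, \tag{2.85} $$
where at this level all indices are raised and lowered with respect to the flat reference metric $\eta_{\mu\nu}$. Then the Fierz–Pauli mass term (2.83*) reads
$$ \mathcal{L}^{(\rm nl1)}_{{\rm FP},\pi} = -\frac{4}{m^2}\left([\Pi^2] - [\Pi]^2\right) + \frac{4}{M_{\rm Pl}m^4}\left([\Pi^3] - [\Pi][\Pi^2]\right) - \frac{1}{M_{\rm Pl}^2m^6}\left([\Pi^4] - [\Pi^2]^2\right). \tag{2.86} $$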
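The expansion (2.86) follows from inserting (2.85) into (2.83) with $\sqrt{-g} = 1$, and the algebra can be checked exactly. The sketch below is an illustration only: a random symmetric integer matrix stands in for $\partial_\mu\partial_\nu\pi$ at a point, the values of $M_{\rm Pl}$ and $m$ are arbitrary, and the mostly-plus signature is assumed. With the conventions of (2.83) and (2.85), the quartic term enters this check with coefficient $-1/(M_{\rm Pl}^2m^6)$.

```python
from fractions import Fraction
import random

R = range(4)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in R) for j in R] for i in R]

def tr(A):
    return sum(A[i][i] for i in R)

rng = random.Random(2)
M_PL, m = Fraction(5), Fraction(3)  # arbitrary sample values

# Symmetric stand-in for d_mu d_nu pi, then raise the first index with eta = diag(-1,1,1,1).
upper = [[rng.randint(-4, 4) for _ in R] for _ in R]
Pi_low = [[upper[min(a, b)][max(a, b)] for b in R] for a in R]
eta_diag = [-1, 1, 1, 1]
Pi = [[eta_diag[a] * Pi_low[a][b] for b in R] for a in R]  # Pi^mu_nu

Pi2 = matmul(Pi, Pi)
Pi3 = matmul(Pi2, Pi)
Pi4 = matmul(Pi3, Pi)

# X^mu_nu from (2.85), and K = I - X.
identity = [[1 if a == b else 0 for b in R] for a in R]
X = [[identity[a][b] - 2 * Pi[a][b] / (M_PL * m**2) + Pi2[a][b] / (M_PL**2 * m**4)
      for b in R] for a in R]
K = [[identity[a][b] - X[a][b] for b in R] for a in R]

# Mass term (2.83) with sqrt(-g) = 1.
lhs = -m**2 * M_PL**2 * (tr(matmul(K, K)) - tr(K)**2)

# Order-by-order pieces in powers of Pi.
quadratic = -4 / m**2 * (tr(Pi2) - tr(Pi)**2)
cubic = 4 / (M_PL * m**4) * (tr(Pi3) - tr(Pi) * tr(Pi2))
quartic = -1 / (M_PL**2 * m**6) * (tr(Pi4) - tr(Pi2)**2)

print(lhs == quadratic + cubic + quartic)
```

The check is purely algebraic: it confirms the coefficients of each order but says nothing about which combinations are total derivatives, which requires the integration by parts discussed in the text.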
Upon integration by parts, we notice that the quadratic term in (2.86*) is a total derivative, which is another way to see the special structure of the Fierz–Pauli mass term. Unfortunately this special fact does not propagate to higher order, and the cubic and quartic interactions are genuine higher order operators which lead to equations of motion containing third- and fourth-order derivatives. In other words, these higher order operators $([\Pi^3] - [\Pi][\Pi^2])$ and $([\Pi^4] - [\Pi^2]^2)$ propagate an additional degree of freedom which, by Ostrogradsky’s theorem, always enters as a ghost. While at the linear level these operators might be irrelevant, their existence implies that one can always find an appropriate background configuration $\pi = \pi_0 + \delta\pi$, such that the ghost is manifest
$$ \mathcal{L}^{(\rm nl1)}_{{\rm FP},\pi} = \frac{4}{M_{\rm Pl}m^4}\,Z^{\mu\nu\alpha\beta}\,\partial_\mu\partial_\nu\delta\pi\,\partial_\alpha\partial_\beta\delta\pi\,, \tag{2.87} $$
with $Z^{\mu\nu\alpha\beta} = 3\,\partial^\mu\partial^\alpha\pi_0\,\eta^{\nu\beta} - \Box\pi_0\,\eta^{\mu\alpha}\eta^{\nu\beta} - 2\,\partial^\mu\partial^\nu\pi_0\,\eta^{\alpha\beta} + \cdots$. This implies that non-linearly (or around a non-trivial background), the Fierz–Pauli mass term propagates an additional degree of freedom which is a ghost, namely the BD ghost. The mass of this ghost depends on the background configuration $\pi_0$,
$$ m^2_{\rm ghost} \sim \frac{M_{\rm Pl}m^4}{\partial^2\pi_0}\,. \tag{2.88} $$
As we shall see below, the resolution of the vDVZ discontinuity lies in the Vainshtein mechanism, for which the field takes a large vacuum expectation value, $\partial^2\pi_0 \gg M_{\rm Pl}m^2$, which in the present context would lead to a ghost with an extremely low mass, $m^2_{\rm ghost} \lesssim m^2$.
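To get a feel for the scaling in (2.88), the short sketch below evaluates the order-of-magnitude ratio $m^2_{\rm ghost}/m^2$ for backgrounds of increasing strength. The numerical inputs are illustrative assumptions only: a graviton mass of order the present Hubble scale ($m \sim 10^{-33}$ eV), the reduced Planck mass ($M_{\rm Pl} \approx 2.4\times10^{27}$ eV), and backgrounds parametrized by the dimensionless ratio $\partial^2\pi_0/(M_{\rm Pl}m^2)$; order-one factors are ignored, as in (2.88).

```python
def ghost_to_graviton_mass_sq(d2pi0, m, m_pl):
    """Order-of-magnitude ratio m_ghost^2 / m^2 implied by (2.88)."""
    m_ghost_sq = m_pl * m**4 / d2pi0
    return m_ghost_sq / m**2

M_PL = 2.4e27    # eV, reduced Planck mass (illustrative input)
M_GRAV = 1e-33   # eV, graviton mass of order the Hubble scale (assumption)

for strength in (1.0, 1e3, 1e6):
    # Background d^2 pi_0 measured in units of M_Pl m^2.
    d2pi0 = strength * M_PL * M_GRAV**2
    ratio = ghost_to_graviton_mass_sq(d2pi0, M_GRAV, M_PL)
    print(f"d2pi0 / (M_Pl m^2) = {strength:g}  ->  m_ghost^2 / m^2 ~ {ratio:.3g}")
```

The ratio is simply the inverse of the background strength, so the deeper one goes into the regime where the Vainshtein mechanism operates, the lighter the ghost becomes relative to the graviton.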

Choosing another non-linear extension for the Fierz–Pauli mass term as in (2.84*) does not seem to help much,

\begin{align}
\mathcal{L}^{(\rm nl2)}_{{\rm FP},\pi} &= -\frac{4}{m^2}\left([\Pi^2] - [\Pi]^2\right) - \frac{4}{M_{\rm Pl}m^4}\left([\Pi]^3 - 4[\Pi][\Pi^2] + 3[\Pi^3]\right) + \cdots \notag\\
&\to \frac{4}{M_{\rm Pl}m^4}\left([\Pi][\Pi^2] - [\Pi^3]\right) + \cdots \tag{2.89}
\end{align}
where we have integrated by parts on the second line, and we recover exactly the same type of higher derivatives already at the cubic level, so the BD ghost is also present in (2.84*).
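The two steps behind (2.89) can be made explicit. The worked equations below use the shorthand $P \equiv \Pi/(M_{\rm Pl}m^2)$, which is not in the original text, so that (2.85) reads $\mathbb{X} = (\mathbb{I}-P)^2$. They also rely on the standard fact that the combination $[\Pi]^3 - 3[\Pi][\Pi^2] + 2[\Pi^3]$ is a total derivative when $\Pi_{\mu\nu} = \partial_\mu\partial_\nu\pi$.

```latex
% Step 1: expand (2.84) with sqrt(-g) = 1 up to cubic order in P.
\begin{align}
\mathbb{X}^{-1} &= \mathbb{I} + 2P + 3P^2 + \mathcal{O}(P^3)\,, \qquad
\sqrt{\det\mathbb{X}} = \det(\mathbb{I}-P) = 1 - [P] + \mathcal{O}(P^2)\,, \notag\\
[(\mathbb{I}-\mathbb{X}^{-1})^2] - [\mathbb{I}-\mathbb{X}^{-1}]^2
  &= 4\left([P^2]-[P]^2\right) + 12\left([P^3]-[P][P^2]\right) + \mathcal{O}(P^4)\,, \notag\\
\sqrt{\det\mathbb{X}}\left([(\mathbb{I}-\mathbb{X}^{-1})^2] - [\mathbb{I}-\mathbb{X}^{-1}]^2\right)
  &= 4\left([P^2]-[P]^2\right) + 4\left([P]^3 - 4[P][P^2] + 3[P^3]\right) + \mathcal{O}(P^4)\,.
\end{align}
% Multiplying by -m^2 M_Pl^2 and restoring P = \Pi/(M_Pl m^2) gives the first line of (2.89).

% Step 2: isolate the total derivative in the cubic term.
\begin{equation}
[\Pi]^3 - 4[\Pi][\Pi^2] + 3[\Pi^3]
  = \underbrace{\left([\Pi]^3 - 3[\Pi][\Pi^2] + 2[\Pi^3]\right)}_{\text{total derivative}}
    - \left([\Pi][\Pi^2] - [\Pi^3]\right).
\end{equation}
% Dropping the total derivative (and the quadratic term, also a total derivative)
% leaves +\frac{4}{M_{\rm Pl} m^4}\left([\Pi][\Pi^2]-[\Pi^3]\right), the second line of (2.89).
```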

Alternatively the mass term was also generalized to include curvature invariants as in Ref. [69]. This theory was shown to be ghost-free at the linear level on FLRW but not yet non-linearly.

Function of the Fierz–Pauli mass term

As an extension of the Fierz–Pauli mass term, one could instead write a more general function of it, as considered in Ref. [75*],

$$ \mathcal{L}_{F({\rm FP})} = -m^2\sqrt{-g}\,F\!\left(g^{\mu\nu}g^{\alpha\beta}\left(H_{\mu\alpha}H_{\nu\beta} - H_{\mu\nu}H_{\alpha\beta}\right)\right). \tag{2.90} $$
However, one can easily see that if a mass term is actually present, i.e., $F' \neq 0$, there is no analytic choice of the function $F$ which would circumvent the non-linear propagation of the BD ghost. Expanding $F$ into a Taylor expansion, we see for instance that the only choice to prevent the cubic higher-derivative interactions in $\pi$, $[\Pi^3] - [\Pi][\Pi^2]$, is $F'(0) = 0$, which removes the mass term at the same time. If $F(0) \neq 0$ but $F'(0) = 0$, the theory is massless about the specific reference metric, but infinitely strongly coupled about other backgrounds.
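The Taylor-expansion argument can be written out in the helicity-0 sector. The worked equations below set $g_{\mu\nu} = \eta_{\mu\nu}$ and $h_{\mu\nu} = A_\mu = 0$ in (2.82), and use the shorthand $x$ for the argument of $F$, which is not notation from the original text.

```latex
% Helicity-0 part of H_{\mu\nu}, from (2.82) with h = A = 0:
\begin{equation}
H_{\mu\nu} = \frac{2}{m^2}\,\Pi_{\mu\nu} - \frac{1}{M_{\rm Pl}m^4}\,\Pi^2_{\mu\nu}\,.
\end{equation}
% Argument of F, x \equiv [H^2] - [H]^2, up to cubic order in \Pi:
\begin{equation}
x = \frac{4}{m^4}\left([\Pi^2]-[\Pi]^2\right)
    - \frac{4}{M_{\rm Pl}m^6}\left([\Pi^3]-[\Pi][\Pi^2]\right) + \mathcal{O}(\Pi^4)\,.
\end{equation}
% Taylor expansion of the Lagrangian (2.90); x^2 and higher powers start at O(\Pi^4):
\begin{align}
\mathcal{L}_{F({\rm FP})} &= -m^2\left(F(0) + F'(0)\,x + \tfrac{1}{2}F''(0)\,x^2 + \cdots\right) \notag\\
  &= -m^2F(0) - \frac{4F'(0)}{m^2}\left([\Pi^2]-[\Pi]^2\right)
     + \frac{4F'(0)}{M_{\rm Pl}m^4}\left([\Pi^3]-[\Pi][\Pi^2]\right) + \mathcal{O}(\Pi^4)\,.
\end{align}
% The cubic higher-derivative operator and the quadratic (mass) term share the same
% coefficient F'(0), so one cannot be removed without removing the other.
```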

Instead, to prevent the presence of the BD ghost fully non-linearly (or equivalently about any background), one should construct the mass term (or rather potential term) in such a way that all the higher derivative operators involving the helicity-0 mode, $(\partial^2\pi)^n$, are total derivatives. This is precisely what is achieved in the “ghost-free” model of massive gravity presented in Part II. In Part I, which comes next, we shall use higher-dimensional GR to gain some insight and intuition on how to construct a consistent theory of massive gravity.
