### 2.1 The Einstein equivalence principle

The principle of equivalence has historically played an important role in the development of gravitation theory. Newton regarded this principle as such a cornerstone of mechanics that he devoted the opening paragraph of the Principia to it. In 1907, Einstein used the principle as a basic element in his development of general relativity. We now regard the principle of equivalence as the foundation, not of Newtonian gravity or of GR, but of the broader idea that spacetime is curved. Much of this viewpoint can be traced back to Robert Dicke, who contributed crucial ideas about the foundations of gravitation theory between 1960 and 1965. These ideas were summarized in his influential Les Houches lectures of 1964 [93], and resulted in what has come to be called the Einstein equivalence principle (EEP).

One elementary equivalence principle is the kind Newton had in mind when he stated that the property of a body called “mass” is proportional to the “weight”, and is known as the weak equivalence principle (WEP). An alternative statement of WEP is that the trajectory of a freely falling “test” body (one not acted upon by such forces as electromagnetism and too small to be affected by tidal gravitational forces) is independent of its internal structure and composition. In the simplest case of dropping two different bodies in a gravitational field, WEP states that the bodies fall with the same acceleration (this is often termed the Universality of Free Fall, or UFF).

The Einstein equivalence principle (EEP) is a more powerful and far-reaching concept; it states that:

1. WEP is valid.
2. The outcome of any local non-gravitational experiment is independent of the velocity of the freely-falling reference frame in which it is performed.
3. The outcome of any local non-gravitational experiment is independent of where and when in the universe it is performed.

The second piece of EEP is called local Lorentz invariance (LLI), and the third piece is called local position invariance (LPI).

For example, a measurement of the electric force between two charged bodies is a local non-gravitational experiment; a measurement of the gravitational force between two bodies (Cavendish experiment) is not.

The Einstein equivalence principle is the heart and soul of gravitational theory, for it is possible to argue convincingly that if EEP is valid, then gravitation must be a “curved spacetime” phenomenon, in other words, the effects of gravity must be equivalent to the effects of living in a curved spacetime. As a consequence of this argument, the only theories of gravity that can fully embody EEP are those that satisfy the postulates of “metric theories of gravity”, which are:

1. Spacetime is endowed with a symmetric metric.
2. The trajectories of freely falling test bodies are geodesics of that metric.
3. In local freely falling reference frames, the non-gravitational laws of physics are those written in the language of special relativity.

The argument that leads to this conclusion simply notes that, if EEP is valid, then in local freely falling frames, the laws governing experiments must be independent of the velocity of the frame (local Lorentz invariance), with constant values for the various atomic constants (in order to be independent of location). The only laws we know of that fulfill this are those that are compatible with special relativity, such as Maxwell’s equations of electromagnetism. Furthermore, in local freely falling frames, test bodies appear to be unaccelerated, in other words they move on straight lines; but such “locally straight” lines simply correspond to “geodesics” in a curved spacetime (TEGP 2.3 [281]).

General relativity is a metric theory of gravity, but then so are many others, including the Brans–Dicke theory and its generalizations. Theories in which varying non-gravitational constants are associated with dynamical fields that couple to matter directly are not metric theories. Neither, in this narrow sense, is superstring theory (see Section 2.3), which, while based fundamentally on a spacetime metric, introduces additional fields (dilatons, moduli) that can couple to material stress-energy in a way that can lead to violations, say, of WEP. It is important to point out, however, that there is some ambiguity in whether one treats such fields as EEP-violating gravitational fields, or simply as additional matter fields, like those that carry electromagnetism or the weak interactions. Still, the notion of curved spacetime is a very general and fundamental one, and therefore it is important to test the various aspects of the Einstein equivalence principle thoroughly. We first survey the experimental tests, and describe some of the theoretical formalisms that have been developed to interpret them. For other reviews of EEP and its experimental and theoretical significance, see [126, 162].

#### 2.1.1 Tests of the weak equivalence principle

A direct test of WEP is the comparison of the acceleration of two laboratory-sized bodies of different composition in an external gravitational field. If the principle were violated, then the accelerations of different bodies would differ. The simplest way to quantify such possible violations of WEP in a form suitable for comparison with experiment is to suppose that for a body with inertial mass $m_\mathrm{I}$, the passive gravitational mass $m_\mathrm{P}$ is no longer equal to $m_\mathrm{I}$, so that in a gravitational field $g$, the acceleration is given by $m_\mathrm{I} a = m_\mathrm{P} g$. Now the inertial mass of a typical laboratory body is made up of several types of mass-energy: rest energy, electromagnetic energy, weak-interaction energy, and so on. If one of these forms of energy contributes to $m_\mathrm{P}$ differently than it does to $m_\mathrm{I}$, a violation of WEP would result. One could then write

$$m_\mathrm{P} = m_\mathrm{I} + \sum_A \frac{\eta^A E^A}{c^2}, \tag{1}$$

where $E^A$ is the internal energy of the body generated by interaction $A$, $\eta^A$ is a dimensionless parameter that measures the strength of the violation of WEP induced by that interaction, and $c$ is the speed of light. A measurement or limit on the fractional difference in acceleration between two bodies then yields a quantity called the “Eötvös ratio” given by

$$\eta \equiv 2\,\frac{|a_1 - a_2|}{|a_1 + a_2|} = \sum_A \eta^A \left( \frac{E_1^A}{m_1 c^2} - \frac{E_2^A}{m_2 c^2} \right), \tag{2}$$

where we drop the subscript “I” from the inertial masses. Thus, experimental limits on $\eta$ place limits on the WEP-violation parameters $\eta^A$.
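As a rough numerical illustration of the Eötvös ratio, the sketch below evaluates it from two measured accelerations and converts a limit on it into a limit on a single WEP-violation parameter. It assumes the usual definition (twice the acceleration difference divided by the sum) and that only one interaction violates WEP. The function names and the sample energy fractions are hypothetical, chosen only for illustration.

```python
def eotvos_ratio(a1: float, a2: float) -> float:
    """Fractional difference in free-fall acceleration of two bodies."""
    return 2.0 * abs(a1 - a2) / abs(a1 + a2)


def single_interaction_bound(eta_limit: float, frac1: float, frac2: float) -> float:
    """Bound on the WEP-violation parameter of one interaction.

    frac1 and frac2 are the fractions E/(m c^2) of each body's mass-energy
    contributed by that interaction (illustrative inputs, not measured data).
    Assumes that no other interaction violates WEP.
    """
    return eta_limit / abs(frac1 - frac2)


if __name__ == "__main__":
    g = 9.8  # m/s^2, nominal laboratory value
    # Two bodies whose accelerations differ by one part in 10^12.
    print(eotvos_ratio(g, g * (1.0 + 1e-12)))
    # Hypothetical energy fractions of 2e-3 and 1e-3 for the two bodies.
    print(single_interaction_bound(3e-13, 2e-3, 1e-3))
```

The second function makes the point of the text explicit: the smaller the difference in energy fractions between the two test bodies, the weaker the resulting bound on that interaction's parameter.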

Many high-precision Eötvös-type experiments have been performed, from the pendulum experiments of Newton, Bessel, and Potter to the classic torsion-balance measurements of Eötvös [100], Dicke [94], Braginsky [43], and their collaborators. In the modern torsion-balance experiments, two objects of different composition are connected by a rod or placed on a tray and suspended in a horizontal orientation by a fine wire. If the gravitational acceleration of the bodies differs, and this difference has a component perpendicular to the suspension wire, there will be a torque induced on the wire, related to the angle between the wire and the direction of the gravitational acceleration $g$. If the entire apparatus is rotated about some direction with angular velocity $\omega$, the torque will be modulated with period $2\pi/\omega$. In the experiments of Eötvös and his collaborators, the wire and $g$ were not quite parallel because of the centripetal acceleration on the apparatus due to the Earth’s rotation; the apparatus was rotated about the direction of the wire. In the Dicke and Braginsky experiments, $g$ was that of the Sun, and the rotation of the Earth provided the modulation of the torque at a period of 24 hr (TEGP 2.4 (a) [281]). Beginning in the late 1980s, numerous experiments were carried out primarily to search for a “fifth force” (see Section 2.3.1), but their null results also constituted tests of WEP. In the “free-fall Galileo experiment” performed at the University of Colorado, the relative free-fall acceleration of two bodies made of uranium and copper was measured using a laser interferometric technique. The “Eöt-Wash” experiments carried out at the University of Washington used a sophisticated torsion balance tray to compare the accelerations of various materials toward local topography on Earth, movable laboratory masses, the Sun and the galaxy [249, 19], and have reached levels of $3 \times 10^{-13}$ [2]. The resulting upper limits on $\eta$ are summarized in Figure 1 (TEGP 14.1 [281]; for a bibliography of experiments up to 1991, see [107]).

A number of projects are in the development or planning stage to push the bounds on $\eta$ even lower. The project MICROSCOPE (MICRO-Satellite à Trainée Compensée pour l’Observation du Principe d’Équivalence) is designed to test WEP to $10^{-15}$. It is being developed by the French space agency CNES for a possible launch in March, 2008, for a one-year mission [59]. The drag-compensated satellite will be in a Sun-synchronous polar orbit at 700 km altitude, with a payload consisting of two differential accelerometers, one with elements made of the same material (platinum), and another with elements made of different materials (platinum and titanium).

Another, known as Satellite Test of the Equivalence Principle (STEP) [247], is under consideration as a possible joint effort of NASA and the European Space Agency (ESA), with the goal of a $10^{-18}$ test. STEP would improve upon MICROSCOPE by using cryogenic techniques to reduce thermal noise, among other effects. At present, STEP (along with a number of variants, called MiniSTEP and QuickSTEP) has not been approved by any agency beyond the level of basic design studies or supporting research and development. An alternative concept for a space test of WEP is Galileo Galilei [261], which uses a rapidly rotating differential accelerometer as its basic element. Its goal is a bound on $\eta$ at the $10^{-13}$ level on the ground and $10^{-17}$ in space.

#### 2.1.2 Tests of local Lorentz invariance

Although special relativity itself never benefited from the kind of “crucial” experiments, such as the perihelion advance of Mercury and the deflection of light, that contributed so much to the initial acceptance of GR and to the fame of Einstein, the steady accumulation of experimental support, together with the successful merger of special relativity with quantum mechanics, led to its being accepted by mainstream physicists by the late 1920s, ultimately to become part of the standard toolkit of every working physicist. This accumulation included

• the classic Michelson–Morley experiment and its descendants [186, 237, 141, 46],
• the Ives–Stillwell, Rossi–Hall, and other tests of time-dilation [136, 229, 103],
• tests of the independence of the speed of light of the velocity of the source, using both binary X-ray stellar sources and high-energy pions [44, 5],
• tests of the isotropy of the speed of light [50, 227, 159].

In addition to these direct experiments, there was the Dirac equation of quantum mechanics and its prediction of anti-particles and spin; later would come the stunningly successful relativistic theory of quantum electrodynamics.

In 2005, on the 100th anniversary of the introduction of special relativity, one might ask “what is there to test?”. Special relativity has been so thoroughly integrated into the fabric of modern physics that its validity is rarely challenged, except by cranks and crackpots. It is ironic then, that during the past several years, a vigorous theoretical and experimental effort has been launched, on an international scale, to find violations of special relativity. The motivation for this effort is not a desire to repudiate Einstein, but to look for evidence of new physics “beyond” Einstein, such as apparent violations of Lorentz invariance that might result from certain models of quantum gravity. Quantum gravity asserts that there is a fundamental length scale given by the Planck length, $\ell_\mathrm{Pl} = (\hbar G / c^3)^{1/2}$, but since length is not an invariant quantity (Lorentz–FitzGerald contraction), then there could be a violation of Lorentz invariance at some level in quantum gravity. In brane world scenarios, while physics may be locally Lorentz invariant in the higher dimensional world, the confinement of the interactions of normal physics to our four-dimensional “brane” could induce apparent Lorentz violating effects. And in models such as string theory, the presence of additional scalar, vector, and tensor long-range fields that couple to matter of the standard model could induce effective violations of Lorentz symmetry. These and other ideas have motivated a serious reconsideration of how to test Lorentz invariance with better precision and in new ways.

A simple and useful way of interpreting some of these modern experiments, called the $c^2$ formalism, is to suppose that the electromagnetic interactions suffer a slight violation of Lorentz invariance, through a change in the speed of electromagnetic radiation $c$ relative to the limiting speed of material test particles ($c_0$, made to take the value unity via a choice of units), in other words, $c \neq 1$ (see Section 2.2.3). Such a violation necessarily selects a preferred universal rest frame, presumably that of the cosmic background radiation, through which we are moving at about 370 km s$^{-1}$ [167]. Such a Lorentz-non-invariant electromagnetic interaction would cause shifts in the energy levels of atoms and nuclei that depend on the orientation of the quantization axis of the state relative to our universal velocity vector, and on the quantum numbers of the state. The presence or absence of such energy shifts can be examined by measuring the energy of one such state relative to another state that is either unaffected or is affected differently by the supposed violation. One way is to look for a shifting of the energy levels of states that are ordinarily equally spaced, such as the Zeeman-split $2J + 1$ ground states of a nucleus of total spin $J$ in a magnetic field; another is to compare the levels of a complex nucleus with the atomic hyperfine levels of a hydrogen maser clock. The magnitude of these “clock anisotropies” would be proportional to $\delta \equiv |c^{-2} - 1|$.

The earliest clock anisotropy experiments were the Hughes–Drever experiments, performed in the period 1959–60 independently by Hughes and collaborators at Yale University, and by Drever at Glasgow University, although their original motivation was somewhat different [131, 96]. The Hughes–Drever experiments yielded extremely accurate results, quoted as limits on the parameter $\delta \equiv |c^{-2} - 1|$ in Figure 2. Dramatic improvements were made in the 1980s using laser-cooled trapped atoms and ions [215, 163, 53]. This technique made it possible to reduce the broadening of resonance lines caused by collisions, leading to improved bounds on $\delta$ shown in Figure 2 (experiments labelled NIST, U. Washington and Harvard, respectively).

Also included for comparison is the corresponding limit obtained from Michelson–Morley type experiments (for a review, see [127]). In those experiments, when viewed from the preferred frame, the speed of light down the two arms of the moving interferometer is $c$, while it can be shown, using the electrodynamics of the $c^2$ formalism, that the compensating Lorentz–FitzGerald contraction of the parallel arm is governed by the speed $c_0 = 1$. Thus the Michelson–Morley experiment and its descendants also measure the coefficient $c^{-2} - 1$. One of these is the Brillet–Hall experiment [46], which used a Fabry–Perot laser interferometer. In a recent series of experiments, the frequencies of electromagnetic cavity oscillators in various orientations were compared with each other or with atomic clocks as a function of the orientation of the laboratory [297, 168, 190, 12, 248]. These placed bounds on $c^{-2} - 1$ at the level of better than a part in $10^9$. Haugan and Lämmerzahl [125] have considered the bounds that Michelson–Morley type experiments could place on a modified electrodynamics involving a “vector-valued” effective photon mass.

The $c^2$ framework focusses exclusively on classical electrodynamics. It has recently been extended to the entire standard model of particle physics by Kostelecký and colleagues [63, 64, 155]. The “Standard Model Extension” (SME) has a large number of Lorentz-violating parameters, opening up many new opportunities for experimental tests (see Section 2.2.4). A variety of clock anisotropy experiments have been carried out to bound the electromagnetic parameters of the SME framework [154]. For example, the cavity experiments described above [297, 168, 190] placed bounds on the coefficients of the tensors $\tilde{\kappa}_{e-}$ and $\tilde{\kappa}_{o+}$ (see Section 2.2.4 for definitions) at the levels of $10^{-14}$ and $10^{-10}$, respectively. Direct comparisons between atomic clocks based on different nuclear species place bounds on SME parameters in the neutron and proton sectors, depending on the nature of the transitions involved. The bounds achieved range from $10^{-27}$ to $10^{-32}$ GeV.

Astrophysical observations have also been used to bound Lorentz violations. For example, if photons satisfy the Lorentz violating dispersion relation

$$E^2 = p^2 c^2 + E_\mathrm{Pl} f^{(1)} |p| c + f^{(2)} p^2 c^2 + \frac{f^{(3)}}{E_\mathrm{Pl}} |p|^3 c^3 + \dots, \tag{3}$$

where $E_\mathrm{Pl} = (\hbar c^5 / G)^{1/2}$ is the Planck energy, then the speed of light $v_\gamma = \partial E / \partial p$ would be given, to linear order in the $f^{(n)}$, by

$$\frac{v_\gamma}{c} \approx 1 + \sum_{n \geq 1} \frac{(n - 1)\, f^{(n)} E^{n-2}}{2 E_\mathrm{Pl}^{n-2}}. \tag{4}$$

Such a Lorentz-violating dispersion relation could be a relic of quantum gravity, for instance. By bounding the difference in arrival time of high-energy photons from a burst source at large distances, one could bound contributions to the dispersion for $n > 2$. One limit, $|f^{(3)}| < 128$, comes from observations of 1 and 2 TeV gamma rays from the blazar Markarian 421 [30]. Another limit comes from birefringence in photon propagation: In many Lorentz violating models, different photon polarizations may propagate with different speeds, causing the plane of polarization of a wave to rotate. If the frequency dependence of this rotation has a dispersion relation similar to Equation (3), then by studying “polarization diffusion” of light from a polarized source in a given bandwidth, one can effectively place a bound $|f^{(3)}| < 10^{-4}$ [119]. Other testable effects of Lorentz invariance violation include threshold effects in particle reactions, gravitational Cerenkov radiation, and neutrino oscillations.
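To make the arrival-time argument concrete, the following sketch estimates the delay between photons of two energies when only the cubic ($n = 3$) term of a dispersion relation of the form $E^2 = p^2c^2\,(1 + f^{(3)} pc / E_\mathrm{Pl})$ is present, so that the fractional speed shift is $f^{(3)} E / E_\mathrm{Pl}$. Everything numerical here is an assumption made for illustration: the rounded Planck energy, the treatment of the source distance as a fixed Euclidean length with no cosmological expansion, and the function names themselves.

```python
C = 2.99792458e8          # speed of light, m/s
E_PLANCK_GEV = 1.22e19    # Planck energy in GeV (rounded)
MPC_IN_M = 3.0857e22      # one megaparsec in metres


def arrival_delay(f3: float, e_high_gev: float, e_low_gev: float,
                  distance_mpc: float) -> float:
    """Arrival-time difference (s) between photons of two energies.

    Keeps only the n = 3 term, for which the fractional speed shift is
    f3 * E / E_Planck, and treats the distance as a fixed Euclidean
    length (cosmological expansion is ignored).
    """
    dv_over_c = f3 * (e_high_gev - e_low_gev) / E_PLANCK_GEV
    return abs(dv_over_c) * distance_mpc * MPC_IN_M / C


def f3_bound(max_delay_s: float, e_high_gev: float, e_low_gev: float,
             distance_mpc: float) -> float:
    """Largest |f3| compatible with an observed upper limit on the delay."""
    return max_delay_s / arrival_delay(1.0, e_high_gev, e_low_gev, distance_mpc)


if __name__ == "__main__":
    # Hypothetical inputs: 1 and 2 TeV photons from a source 130 Mpc away.
    print(arrival_delay(1.0, 2000.0, 1000.0, 130.0))
```

Because the delay scales linearly with both distance and energy difference, distant sources with rapid variability at the highest photon energies give the tightest bounds.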

Mattingly [182] gives a thorough and up-to-date review of both the theoretical frameworks and the experimental results for tests of LLI.

#### 2.1.3 Tests of local position invariance

The principle of local position invariance, the third part of EEP, can be tested by the gravitational redshift experiment, the first experimental test of gravitation proposed by Einstein. Despite the fact that Einstein regarded this as a crucial test of GR, we now realize that it does not distinguish between GR and any other metric theory of gravity, but is only a test of EEP. A typical gravitational redshift experiment measures the frequency or wavelength shift between two identical frequency standards (clocks) placed at rest at different heights in a static gravitational field. If the frequency of a given type of atomic clock is the same when measured in a local, momentarily comoving freely falling frame (Lorentz frame), independent of the location or velocity of that frame, then the comparison of frequencies of two clocks at rest at different locations boils down to a comparison of the velocities of two local Lorentz frames, one at rest with respect to one clock at the moment of emission of its signal, the other at rest with respect to the other clock at the moment of reception of the signal. The frequency shift is then a consequence of the first-order Doppler shift between the frames. The structure of the clock plays no role whatsoever. The result is a shift

$$Z = \frac{\Delta\nu}{\nu} = \frac{\Delta U}{c^2}, \tag{5}$$

where $\Delta U$ is the difference in the Newtonian gravitational potential between the receiver and the emitter. If LPI is not valid, then it turns out that the shift can be written

$$Z = (1 + \alpha)\,\frac{\Delta U}{c^2}, \tag{6}$$

where the parameter $\alpha$ may depend upon the nature of the clock whose shift is being measured (see TEGP 2.4 (c) [281] for details).
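For a sense of scale, the sketch below evaluates the redshift for a clock comparison over a short vertical baseline, taking the shift to be $Z = (1 + \alpha)\,\Delta U / c^2$ with $\alpha = 0$ when LPI holds. It assumes a uniform field, so that the potential difference is simply $g h$; the 22.5 m height and the value of $g$ are illustrative assumptions rather than figures taken from the text, and the function names are hypothetical.

```python
C = 2.99792458e8  # speed of light, m/s


def uniform_field_delta_u(g: float, height: float) -> float:
    """Potential difference (m^2/s^2) over a height in a uniform field."""
    return g * height


def redshift(delta_u: float, alpha: float = 0.0) -> float:
    """Fractional frequency shift Z = (1 + alpha) * delta_U / c^2.

    alpha = 0 corresponds to exact local position invariance.
    """
    return (1.0 + alpha) * delta_u / C**2


if __name__ == "__main__":
    # Assumed baseline: a 22.5 m tower at the Earth's surface.
    delta_u = uniform_field_delta_u(9.81, 22.5)
    print(redshift(delta_u))              # LPI valid
    print(redshift(delta_u, alpha=0.01))  # hypothetical 1% violation
```

A shift of a few parts in $10^{15}$ over a laboratory-scale height indicates why very narrow resonance lines or very stable clocks are needed for such measurements.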

The first successful, high-precision redshift measurement was the series of Pound–Rebka–Snider experiments of 1960–1965 that measured the frequency shift of gamma-ray photons from $^{57}$Fe as they ascended or descended the Jefferson Physical Laboratory tower at Harvard University. The high accuracy achieved (one percent) was obtained by making use of the Mössbauer effect to produce a narrow resonance line whose shift could be accurately determined. Other experiments since 1960 measured the shift of spectral lines in the Sun’s gravitational field and the change in rate of atomic clocks transported aloft on aircraft, rockets and satellites. Figure 3 summarizes the important redshift experiments that have been performed since 1960 (TEGP 2.4 (c) [281]).

After almost 50 years of inconclusive or contradictory measurements, the gravitational redshift of solar spectral lines was finally measured reliably. During the early years of GR, the failure to measure this effect in solar lines was seized upon by some as reason to doubt the theory. Unfortunately, the measurement is not simple. Solar spectral lines are subject to the “limb effect”, a variation of spectral line wavelengths between the center of the solar disk and its edge or “limb”; this effect is actually a Doppler shift caused by complex convective and turbulent motions in the photosphere and lower chromosphere, and is expected to be minimized by observing at the solar limb, where the motions are predominantly transverse. The secret is to use strong, symmetrical lines, leading to unambiguous wavelength measurements. Successful measurements were finally made in 1962 and 1972 (TEGP 2.4 (c) [281]). In 1991, LoPresto et al. [172] measured the solar shift in agreement with LPI to about 2 percent by observing the oxygen triplet lines both in absorption in the limb and in emission just off the limb.

The most precise standard redshift test to date was the Vessot–Levine rocket experiment that took place in June 1976 [264]. A hydrogen-maser clock was flown on a rocket to an altitude of about 10,000 km and its frequency compared to a similar clock on the ground. The experiment took advantage of the masers’ frequency stability by monitoring the frequency shift as a function of altitude. A sophisticated data acquisition scheme accurately eliminated all effects of the first-order Doppler shift due to the rocket’s motion, while tracking data were used to determine the payload’s location and velocity (to evaluate the potential difference $\Delta U$ and the special relativistic time dilation). Analysis of the data yielded a limit $|\alpha| < 2 \times 10^{-4}$.

A “null” redshift experiment performed in 1978 tested whether the relative rates of two different clocks depended upon position. Two hydrogen maser clocks and an ensemble of three superconducting-cavity stabilized oscillator (SCSO) clocks were compared over a 10-day period. During the period of the experiment, the solar potential $U/c^2$ changed sinusoidally with a 24-hour period by $3 \times 10^{-13}$ because of the Earth’s rotation, and changed linearly at $3 \times 10^{-12}$ per day because the Earth is 90 degrees from perihelion in April. However, analysis of the data revealed no variations of either type within experimental errors, leading to a limit on the LPI violation parameter $|\alpha^\mathrm{H} - \alpha^\mathrm{SCSO}| < 2 \times 10^{-2}$ [258]. This bound has been improved using more stable frequency standards, such as atomic fountain clocks [120, 216, 23]. The current bound, from comparing a Cesium atomic fountain with a Hydrogen maser for a year, is $|\alpha^\mathrm{H} - \alpha^\mathrm{Cs}| < 2.1 \times 10^{-5}$ [23].
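The two variations of the solar potential quoted for the null experiment can be checked with a back-of-the-envelope calculation, sketched below. The daily term is taken to be the solar potential gradient times the laboratory's distance from the Earth's rotation axis, and the linear term is the gradient times the Earth's radial velocity a quarter orbit from perihelion. The physical constants, the neglect of the Earth's axial tilt, and the laboratory latitude of 37.4 degrees are all assumptions introduced for this illustration, as are the function names.

```python
import math

C = 2.99792458e8            # speed of light, m/s
GM_SUN = 1.32712440018e20   # solar gravitational parameter, m^3/s^2
AU = 1.495978707e11         # astronomical unit, m
R_EARTH = 6.371e6           # mean Earth radius, m
ECC = 0.0167                # eccentricity of the Earth's orbit
DAY = 86400.0               # seconds per day


def potential_gradient(r: float = AU) -> float:
    """Magnitude of d(U/c^2)/dr for the Sun at distance r (per metre)."""
    return GM_SUN / (C**2 * r**2)


def diurnal_amplitude(latitude_deg: float) -> float:
    """Amplitude of the 24-hour variation of U/c^2 (axial tilt ignored)."""
    lever_arm = R_EARTH * math.cos(math.radians(latitude_deg))
    return potential_gradient() * lever_arm


def linear_drift_per_day() -> float:
    """Change of U/c^2 per day when the Earth is 90 degrees from perihelion."""
    p = AU * (1.0 - ECC**2)                     # orbital radius at that point
    radial_speed = ECC * math.sqrt(GM_SUN / p)  # radial velocity there
    return potential_gradient(p) * radial_speed * DAY


if __name__ == "__main__":
    print(diurnal_amplitude(37.4))   # assumed laboratory latitude
    print(linear_drift_per_day())
```

Under these assumptions both results land at the orders of magnitude quoted in the text, a few parts in $10^{13}$ for the daily term and a few parts in $10^{12}$ per day for the drift.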

The varying gravitational redshift of Earth-bound clocks relative to the highly stable millisecond pulsar PSR 1937+21, caused by the Earth’s motion in the solar gravitational field around the Earth-Moon center of mass (amplitude 4000 km), was measured to about 10 percent [251]. Two measurements of the redshift using stable oscillator clocks on spacecraft were made at the one percent level: One used the Voyager spacecraft in Saturn’s gravitational field [158], while another used the Galileo spacecraft in the Sun’s field [160].

Tests of the gravitational redshift could be improved to the $10^{-10}$ level using an array of laser cooled atomic clocks on board a spacecraft which would travel to within four solar radii of the Sun [180].

Modern advances in navigation using Earth-orbiting atomic clocks and accurate time-transfer must routinely take gravitational redshift and time-dilation effects into account. For example, the Global Positioning System (GPS) provides absolute positional accuracies of around 15 m (even better in its military mode), and 50 nanoseconds in time transfer accuracy, anywhere on Earth. Yet the difference in rate between satellite and ground clocks as a result of relativistic effects is a whopping 39 microseconds per day (46 µs from the gravitational redshift, and –7 µs from time dilation). If these effects were not accurately accounted for, GPS would fail to function at its stated accuracy. This represents a welcome practical application of GR! (For the role of GR in GPS, see [15, 16]; for a popular essay, see [287].)
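The size of the two relativistic clock effects for GPS can be estimated with a short calculation, sketched here under simplifying assumptions that are not part of the text: a circular orbit of radius 26,560 km, a ground clock at the mean Earth radius, and no allowance for the Earth's rotation or oblateness. The constants and function names are likewise assumptions for illustration.

```python
C = 2.99792458e8            # speed of light, m/s
GM_EARTH = 3.986004418e14   # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6           # mean Earth radius, m
R_GPS = 2.656e7             # assumed GPS orbital radius, m
DAY = 86400.0               # seconds per day


def gravitational_offset_us_per_day() -> float:
    """Satellite clock gain relative to the ground from the redshift."""
    rate = GM_EARTH / C**2 * (1.0 / R_EARTH - 1.0 / R_GPS)
    return rate * DAY * 1e6


def time_dilation_offset_us_per_day() -> float:
    """Satellite clock loss from its orbital speed (circular orbit)."""
    v_squared = GM_EARTH / R_GPS
    return -v_squared / (2.0 * C**2) * DAY * 1e6


def net_offset_us_per_day() -> float:
    """Sum of the two effects, in microseconds per day."""
    return gravitational_offset_us_per_day() + time_dilation_offset_us_per_day()


if __name__ == "__main__":
    print(gravitational_offset_us_per_day())
    print(time_dilation_offset_us_per_day())
    print(net_offset_us_per_day())
```

With these inputs the three results come out close to the rounded figures quoted in the text; multiplying the net daily offset by the speed of light shows how quickly an uncorrected clock error would grow into a position error of kilometres.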

Local position invariance also refers to position in time. If LPI is satisfied, the fundamental constants of non-gravitational physics should be constants in time. Table 1 shows current bounds on cosmological variations in selected dimensionless constants. For discussion and references to early work, see TEGP 2.4 (c) [281] or [97]. For a comprehensive recent review both of experiments and of theoretical ideas that underlie proposals for varying constants, see [262].

Experimental bounds on varying constants come in two types: bounds on the present rate of variation, and bounds on the difference between today’s value and a value in the distant past. The main example of the former type is the clock comparison test, in which highly stable atomic clocks of different fundamental type are intercompared over periods ranging from months to years (variants of the null redshift experiment). If the frequencies of the clocks depend differently on the electromagnetic fine structure constant $\alpha_\mathrm{EM}$, the electron-proton mass ratio $m_e/m_p$, or the gyromagnetic ratio of the proton $g_p$, for example, then a limit on a drift of the fractional frequency difference translates into a limit on a drift of the constant(s). The dependence of the frequencies on the constants may be quite complex, depending on the atomic species involved. The most recent experiments have exploited the techniques of laser cooling and trapping, and of atom fountains, in order to achieve extreme clock stability, and compared the Rubidium-87 hyperfine transition [181], the Mercury-199 ion electric quadrupole transition [31], the atomic Hydrogen 1S–2S transition [111], or an optical transition in Ytterbium-171 [209], against the ground-state hyperfine transition in Cesium-133. These experiments show that, today, $\dot{\alpha}_\mathrm{EM}/\alpha_\mathrm{EM} < 30 \times 10^{-16}\ \mathrm{yr}^{-1}$.

The second type of bound involves measuring the relics of or signal from a process that occurred in the distant past and comparing the inferred value of the constant with the value measured in the laboratory today. One sub-type uses astronomical measurements of spectral lines at large redshift, while the other uses fossils of nuclear processes on Earth to infer values of constants early in geological history.

Table 1: Bounds on cosmological variation of fundamental constants of non-gravitational physics. For an in-depth review, see [262].

| Constant $k$ | Limit on $\dot{k}/k$ (yr$^{-1}$) | Redshift | Method |
|---|---|---|---|
| Fine structure constant ($\alpha_\mathrm{EM}$) | $< 30 \times 10^{-16}$ | 0 | Clock comparisons [181, 31, 111, 209] |
| | $< 0.5 \times 10^{-16}$ | 0.15 | Oklo Natural Reactor [72, 116, 210] |
| | $< 3.4 \times 10^{-16}$ | 0.45 | $^{187}$Re decay in meteorites [205] |
| | $(6.4 \pm 1.4) \times 10^{-16}$ | 0.2–3.7 | Spectra in distant quasars [269, 193] |
| | $< 1.2 \times 10^{-16}$ | 0.4–2.3 | Spectra in distant quasars [242, 51] |
| Weak interaction constant ($\alpha_\mathrm{W}$) | $< 1 \times 10^{-11}$ | 0.15 | Oklo Natural Reactor [72] |
| | $< 5 \times 10^{-12}$ | $10^9$ | Big Bang nucleosynthesis [179, 223] |
| e-p mass ratio | $< 3 \times 10^{-15}$ | 2.6–3.0 | Spectra in distant quasars [135] |

Earlier comparisons of spectral lines of different atoms or transitions in distant galaxies and quasars produced bounds on $\alpha_\mathrm{EM}$ or $g_p (m_e/m_p)$ on the order of a part in 10 per Hubble time [298]. Dramatic improvements in the precision of astronomical and laboratory spectroscopy, in the ability to model the complex astronomical environments where emission and absorption lines are produced, and in the ability to reach large redshift have made it possible to improve the bounds significantly. In fact, in 1999, Webb et al. [269, 193] announced that measurements of absorption lines in Mg, Al, Si, Cr, Fe, Ni, and Zn in quasars in the redshift range $0.5 < Z < 3.5$ indicated a smaller value of $\alpha_\mathrm{EM}$ in earlier epochs, namely $\Delta\alpha_\mathrm{EM}/\alpha_\mathrm{EM} = (-0.72 \pm 0.18) \times 10^{-5}$, corresponding to $\dot{\alpha}_\mathrm{EM}/\alpha_\mathrm{EM} = (6.4 \pm 1.4) \times 10^{-16}\ \mathrm{yr}^{-1}$ (assuming a linear drift with time). Measurements by other groups have so far failed to confirm this non-zero effect [242, 51, 219]; a recent analysis of Mg absorption systems in quasars at $0.4 < Z < 2.3$ gave $\dot{\alpha}_\mathrm{EM}/\alpha_\mathrm{EM} = (-0.6 \pm 0.6) \times 10^{-16}\ \mathrm{yr}^{-1}$ [242].
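The conversion between a fractional offset of the fine structure constant at an earlier epoch and a drift rate, under the linear-drift assumption mentioned above, is simple enough to sketch. The code below only relates the two central values quoted in the text to the elapsed time they imply; that elapsed time is inferred here and is not a figure stated in the source, and the function names are hypothetical.

```python
def linear_rate(delta_over_alpha: float, elapsed_yr: float) -> float:
    """Drift rate (per yr) implied by a past offset, assuming linear drift.

    delta_over_alpha is (alpha_past - alpha_today) / alpha_today, so a
    negative offset (smaller alpha in the past) gives a positive rate.
    """
    return -delta_over_alpha / elapsed_yr


def implied_elapsed_time(delta_over_alpha: float, rate_per_yr: float) -> float:
    """Elapsed time (yr) that links a past offset to a linear drift rate."""
    return -delta_over_alpha / rate_per_yr


if __name__ == "__main__":
    # Central values quoted in the text for the quasar absorption result.
    t = implied_elapsed_time(-0.72e-5, 6.4e-16)
    print(t)                          # elapsed time in years
    print(linear_rate(-0.72e-5, t))   # recovers the quoted rate
```

The implied elapsed time is of order $10^{10}$ yr, which is why bounds from high-redshift spectra can compete with laboratory clock comparisons despite their much larger fractional uncertainties.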

Another important set of bounds arises from studies of the “Oklo” phenomenon, a group of natural, sustained $^{235}$U fission reactors that occurred in the Oklo region of Gabon, Africa, around 1.8 billion years ago. Measurements of ore samples yielded an abnormally low value for the ratio of two isotopes of Samarium, $^{149}$Sm/$^{147}$Sm. Neither of these isotopes is a fission product, but $^{149}$Sm can be depleted by a flux of neutrons. Estimates of the neutron fluence (integrated dose) during the reactors’ “on” phase, combined with the measured abundance anomaly, yield a value for the neutron cross-section for $^{149}$Sm 1.8 billion years ago that agrees with the modern value. However, the capture cross-section is extremely sensitive to the energy of a low-lying level ($E \sim 0.1$ eV), so that a variation in the energy of this level of only 20 meV over a billion years would change the capture cross-section from its present value by more than the observed amount. This was first analyzed in 1976 by Shlyakhter [241]. Recent reanalyses of the Oklo data [72, 116, 210] lead to a bound on $\dot{\alpha}_\mathrm{EM}/\alpha_\mathrm{EM}$ at the level of around $5 \times 10^{-17}\ \mathrm{yr}^{-1}$.

In a similar manner, recent reanalyses of decay rates of $^{187}$Re in ancient meteorites (4.5 billion years old) gave the bound $\dot{\alpha}_\mathrm{EM}/\alpha_\mathrm{EM} < 3.4 \times 10^{-16}\ \mathrm{yr}^{-1}$ [205].