Light is bent by the action of a gravitational field. In the case where a galaxy lies close to the line of sight to a background quasar, the quasar’s light may travel along several different paths to the observer, resulting in more than one image.
The easiest way to visualise this is to begin with a zero-mass galaxy (which bends no light rays) acting as the lens, and to consider all possible light paths from the quasar to the observer which have a bend in the lens plane. From the observer’s point of view, we can connect all paths which take the same time to reach the observer with a contour, which in this case is circular in shape. The image will form at the centre of the diagram, surrounded by circles representing increasing light travel times. This is of course an application of Fermat’s principle; images form at stationary points in the Fermat surface, in this case at the Fermat minimum. Put less technically, the light has taken a straight-line path between the source and observer.
If we now allow the galaxy to have a steadily increasing mass, we introduce an extra time delay (known as the Shapiro delay) along light paths which pass through the lens plane close to the galaxy centre. This makes a distortion in the Fermat surface. At first, its only effect is to displace the Fermat minimum away from the distortion. Eventually, however, the distortion becomes big enough to produce a maximum at the position of the galaxy, together with a saddle point on the other side of the galaxy from the minimum. By Fermat’s principle, two further images will appear at these two stationary points in the Fermat surface. This is the basic three-image lens configuration, although in practice the central image at the Fermat maximum is highly demagnified and not usually seen.
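The appearance of the extra images can be illustrated with a minimal Python sketch. It is not taken from the literature discussed here: it assumes a softened isothermal potential ψ(x) = θ_E √(x² + x_c²), restricts attention to the axis through the lens and the source, and uses the hypothetical names `fermat_images`, `theta_e` (a proxy for the lens mass) and `core` (the softening scale). It locates the stationary points of τ(x) = ½(x − β)² − ψ(x) and classifies each from the signs of the two curvatures of the Fermat surface.

```python
import math

def fermat_images(beta, theta_e, core, x_max=3.0, n_grid=6001):
    """Stationary points of tau(x) = 0.5*(x - beta)**2 - psi(x) on the lens-source axis.

    psi(x) = theta_e * sqrt(x**2 + core**2) is an ASSUMED softened isothermal
    potential; all angles share the same arbitrary units.
    Returns a list of (position, kind) tuples ordered by position.
    """
    def dtau(x):
        # Derivative of the Fermat surface along the axis.
        return (x - beta) - theta_e * x / math.hypot(x, core)

    def classify(x):
        r = math.hypot(x, core)
        radial = 1.0 - theta_e * core**2 / r**3   # curvature along the axis
        tangential = 1.0 - theta_e / r            # curvature perpendicular to it
        if radial > 0 and tangential > 0:
            return "minimum"
        if radial < 0 and tangential < 0:
            return "maximum"
        return "saddle"

    xs = [-x_max + 2.0 * x_max * i / (n_grid - 1) for i in range(n_grid)]
    images = []
    for a, b in zip(xs[:-1], xs[1:]):
        fa, fb = dtau(a), dtau(b)
        if fa == 0.0:
            images.append((a, classify(a)))
        elif fa * fb < 0.0:
            for _ in range(60):                   # plain bisection on the sign change
                mid = 0.5 * (a + b)
                if fa * dtau(mid) <= 0.0:
                    b = mid
                else:
                    a, fa = mid, dtau(mid)
            root = 0.5 * (a + b)
            images.append((root, classify(root)))
    return images

# Source offset 0.2 from the lens; a light lens and then a much heavier one.
light_lens = fermat_images(beta=0.2, theta_e=0.05, core=0.1)
heavy_lens = fermat_images(beta=0.2, theta_e=1.0, core=0.1)
print(light_lens)
print(heavy_lens)
```

With these illustrative parameters the light lens yields a single minimum displaced slightly away from the galaxy, while the heavy lens yields a saddle point, a central maximum and a minimum, in that order along the axis.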
If the lens is significantly elliptical and the lines of sight are well aligned, we can produce five images, consisting of four images around a ring alternating between minima and saddle points, and a central, highly demagnified Fermat maximum. Both four-image and two-image systems (“quads” and “doubles”) are in fact seen in practice. The major use of lens systems is for determining mass distributions in the lens galaxy, since the positions and brightnesses of the images carry information about the gravitational potential of the lens. Gravitational lensing has the advantage that its effects are independent of whether the matter is light or dark, so in principle the effects of both baryonic and non-baryonic matter can be probed.
Refsdal pointed out that if the background source is variable, it is possible to measure an absolute distance within the system and therefore the Hubble constant. To see how this works, consider the light paths from the source to the observer corresponding to the individual lensed images. Although each is at a stationary point in the Fermat time delay surface, the absolute light travel time for each will generally be different, with one of the Fermat minima having the smallest travel time. Therefore, if the source brightens, this brightening will reach the observer at different times corresponding to the two different light paths. Measurement of the time delay corresponds to measuring the difference in the light travel times, each of which is individually given by

$$\tau = \frac{D_l D_s}{c\,D_{ls}}\,(1+z_l)\left[\frac{1}{2}(\theta-\beta)^2 - \psi(\theta)\right], \qquad (12)$$

where β and θ are the angular positions of the source and of the image, defined in Figure 6, Dl, Ds and Dls are angular diameter distances also defined in Figure 6, zl is the lens redshift, and ψ(θ), the scaled projected potential of the lens, gives the term representing the Shapiro delay of light passing through a gravitational field. Fermat’s principle corresponds to the requirement that ∇τ = 0. Once the differential time delays are known, we can then calculate the ratio of angular diameter distances which appears in the above equation. If the source and lens redshifts are known, H0 follows immediately. A handy rule of thumb which can be derived from this equation for the case of a 2-image lens, if we make the assumption that the matter distribution is isothermal and H0 = 70 km s–1 Mpc–1, is

$$\Delta\tau = (14\ \mathrm{days})\,(1+z_l)\,D\left(\frac{f-1}{f+1}\right)s^2,$$

where zl is the lens redshift, s is the separation of the images in arcseconds (approximately twice the Einstein radius), f > 1 is the ratio of the fluxes and D is the value of Ds Dl / Dls in Gpc. A larger time delay implies a correspondingly lower H0.
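As a rough numerical illustration, the following Python sketch evaluates the isothermal two-image rule of thumb, assumed here to take the form Δτ = (14 days)(1 + zl) D [(f − 1)/(f + 1)] s², with s in arcseconds and D = Ds Dl / Dls in Gpc. The function names and the example lens parameters are hypothetical and chosen only for illustration. The sketch also checks that the 14-day coefficient is simply D/(2c) for D = 1 Gpc multiplied by the square of one arcsecond, and rescales H0 using the fact that predicted delays scale as 1/H0.

```python
import math

GPC_IN_M = 3.0857e25                       # one gigaparsec in metres
C_M_PER_S = 299_792_458.0                  # speed of light
ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)
DAY_IN_S = 86_400.0

# Coefficient of the rule of thumb: D/(2c) for D = 1 Gpc, times (1 arcsec)^2, in days.
COEFF_DAYS = GPC_IN_M / (2.0 * C_M_PER_S) * ARCSEC_IN_RAD**2 / DAY_IN_S

def isothermal_delay_days(z_lens, separation_arcsec, flux_ratio, d_gpc):
    """Time delay of a two-image isothermal lens (illustrative helper, not a library call)."""
    if flux_ratio < 1.0:
        raise ValueError("flux_ratio is defined as brighter/fainter, so it must be >= 1")
    asymmetry = (flux_ratio - 1.0) / (flux_ratio + 1.0)
    return COEFF_DAYS * (1.0 + z_lens) * d_gpc * asymmetry * separation_arcsec**2

def h0_from_delay(observed_days, predicted_days, h0_assumed=70.0):
    """Rescale H0: distances, and hence predicted delays, scale as 1/H0."""
    return h0_assumed * predicted_days / observed_days

# Hypothetical lens: z_l = 0.5, 1.5 arcsec separation, flux ratio 3, D = 1.5 Gpc.
predicted = isothermal_delay_days(0.5, 1.5, 3.0, 1.5)
print(f"coefficient = {COEFF_DAYS:.2f} days, predicted delay = {predicted:.1f} days")
print(h0_from_delay(observed_days=1.1 * predicted, predicted_days=predicted))
```

An observed delay 10% longer than the prediction lowers the inferred H0 by the same factor, which is the sense of the dependence stated above.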
The first gravitational lens was discovered in 1979 and monitoring programmes began soon afterwards to determine the time delay. This turned out to be a long process involving a dispute between proponents of a 400-day and a 550-day delay, and ended with a determination of 417 ± 2 days [84, 136]. Since that time, 17 more time delays have been determined (see Table 1). In the early days, many of the time delays were measured at radio wavelengths by examination of those systems in which a radio-loud quasar was the multiply imaged source (see Figure 7). Recently, optically-measured delays have dominated, due to the fact that only a small optical telescope in a site with good seeing is needed for the photometric monitoring, whereas radio time delays require large amounts of time on long-baseline interferometers which do not exist in large numbers.
Unlike local distance determinations (and even unlike cosmological probes which typically use more than one measurement), there is only one major systematic piece of astrophysics in the determination of H0 by lenses, but it is a very important one. This is the form of the potential in Equation (12). If one parametrises the potential in the form of a power law in projected mass density versus radius, the index is –1 for an isothermal model. This index has a pretty direct degeneracy with the deduced length scale and therefore the Hubble constant; for a change of 0.1, the length scale changes by about 10%. The sense of the effect is that a steeper index, which corresponds to a more centrally concentrated mass distribution, decreases all the length scales and therefore implies a higher Hubble constant for a given time delay. The index cannot be varied at will, given that galaxies consist of dark matter potential wells into which baryons have collapsed and formed stars. The basic physics means that it is almost certain that matter cannot be more centrally condensed than the stars, and cannot be less centrally condensed than the theoretically favoured “universal” dark matter profile, known as an NFW profile.
Worse still, all matter along the line of sight contributes to the lensing potential in a particularly unpleasant way; if one has a uniform mass sheet in the region, it does not affect the image positions and fluxes which form the constraints on the lensing potential, but it does affect the time delay. It operates in the sense that, if a mass sheet is present which is not known about, the length scale obtained is too short and consequently the derived value of H0 is too high. This mass-sheet degeneracy  can only be broken by lensing observations alone for a lens system which has sources at multiple redshifts, since there are then multiple measurements of angular diameter distance which are only consistent, for a given mass sheet, with a single value of H0. Existing galaxy lenses do not contain examples of this phenomenon.
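The two scalings described in the last two paragraphs can be written down as a short Python sketch. Both functions are illustrative assumptions rather than anything used in the analyses discussed here: the first linearises the statement that a change of 0.1 in the slope index changes the length scale by about 10%, and the second uses the standard result that an unmodelled uniform sheet of dimensionless surface density κ (a symbol introduced here for the sketch) rescales the time delay, and hence the inferred H0, by a factor (1 − κ).

```python
def h0_for_steeper_slope(h0_isothermal, steepening):
    """H0 implied if the true profile is `steepening` steeper than isothermal.

    ASSUMED linear rule: a steepening of 0.1 raises H0 by about 10%.
    """
    return h0_isothermal * (1.0 + steepening)

def h0_with_mass_sheet(h0_model, kappa_sheet):
    """Correct a model H0 for a uniform mass sheet that the model left out.

    An ignored sheet makes the derived H0 too high, so the correction lowers it.
    """
    return h0_model * (1.0 - kappa_sheet)

# Illustrative numbers only: an isothermal fit giving 58 km/s/Mpc, a profile
# 0.25 steeper than isothermal, and a sheet with kappa = 0.1.
print(h0_for_steeper_slope(58.0, 0.25))
print(h0_with_mass_sheet(70.0, 0.1))
```

The two effects act in opposite senses on the final answer, which is why both the slope and the environment of each lens need to be constrained independently.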
Even worse still, there is no guarantee that parametric models describe lens mass distributions to the required accuracy. In a series of papers [128, 172, 129, 127] non-parametric, pixellated models of galaxy mass distributions have been developed which do not require any parametric form, but only basic physical plausibility arguments such as monotonic outwards decrease of mass density. Not surprisingly, error bars obtained by this method are larger than for parametric methods, usually by factors of at least 2.
Table 1 shows the currently measured time delays, with references and comments. Since the most recent review an extra half-dozen have been added, and there is every reason to suppose that the sample will continue to grow at a similar rate.
Despite the apparently depressing picture painted in the previous Section 4.1.3 about the prospects for obtaining mass models from lenses, the measurement of H0 is improving in a number of ways.
First, some lenses have more constraints on the mass model than others. The word “more” here is somewhat subjective, but examples include JVAS B0218+357, which in addition to two images also has VLBI substructure within each image and an Einstein ring formed from an extended background source, and CLASS B1933+503, which has three background radio sources, each multiply imaged. Something of a Murphy’s Law operates in the latter case, however, as compact triple radio sources tend to be of the class known as Compact Symmetric Objects (CSOs) which do not vary and therefore do not give time delays. Einstein rings in general give good constraints, although non-parametric models are capable of soaking up most of these in extra degrees of freedom. In general, however, no “golden” lenses with multiple constraints and no other modelling problems have been found. The best models of all come from lenses from the SLACS survey, which have extended sources, but unfortunately the previous Murphy’s law applies here too; extended sources are not variable.
Second, it is possible to increase the reliability of individual lens mass models by gathering extra information. A major improvement is available by the use of stellar velocity dispersions [159, 158, 160, 82] measured in the lensing galaxy. As a standalone determinant of mass models in galaxies at z ~ 0.5, typical of lens galaxies, such measurements are not very useful as they suffer from severe degeneracies with the structure of stellar orbits. However, the combination of lensing information (which gives a very accurate measurement of mass enclosed by the Einstein radius) and stellar dynamics (which gives, more or less, the mass enclosed within the effective radius of the stellar light) gives a measurement that is in principle a very direct constraint on the mass slope. The method has large error bars, in part due to residual dependencies on the shape of stellar orbits, but also because these measurements are very difficult; each galaxy requires about one night of good seeing on a 10-m telescope. It is also not certain that the mass slope between Einstein and effective radii is always a good indicator of the mass slope within the annulus between the lensed images. Nevertheless, this programme has the extremely beneficial effect of turning a systematic error in each galaxy into a smaller, more-or-less random error.
Third, we can remove problems caused by mass sheets associated with nearby groups by measuring them using detailed studies of the environments of lens galaxies. Recent studies of lens groups [38, 71, 37, 97] show that neglecting matter along the line of sight typically has an effect of 10 – 20%, with matter close to the redshift of the lens contributing most.
Finally, we can improve measurements in individual lenses which have measurement difficulties. For example, in the lenses 1830–211  and B0218+357  the lens galaxy position is not well known. In the case of B0218+357, York et al.  present deep HST/ACS data which allow much better astrometry. Overall, by a lot of hard work using all methods together, the systematic errors involved in the mass model in each lens individually can be reduced to a random error. We can then study lenses as an ensemble.
Early indications, using systems with isolated lens galaxies in uncomplicated environments, and fitting isothermal mass profiles, resulted in rather low values of the Hubble constant (in the high fifties). In order to be consistent with H0 ~ 70 km s–1 Mpc–1, the mass profiles had to be steepened to the point where mass followed light; although not impossible for a single galaxy this was unlikely for an ensemble of lenses. In a subsequent analysis, Dobke and King did this the other way round; they took the value of H0 = 72 ± 8 km s–1 Mpc–1 in  and deduced that the overall mass slope index in time-delay lenses had to be 0.2 – 0.3 steeper than isothermal. If true, this is worrying because investigation of a large sample of SLACS lenses with well-determined mass slopes reveals an average slope which is nearly isothermal.
More recent analyses, including information available since then, suggest that the lens galaxies in some earlier analyses may indeed, by bad luck, be unusually steep. For example, Treu and Koopmans find that PG1115+080 has an unusually steep index (0.35 steeper than isothermal) yielding a 20% underestimate of H0. The exact value of H0 from all eighteen lenses together is a rather subjective undertaking as it depends on one’s view of the systematics in each lens and the degree to which they have been reduced to random errors. My estimate on the most optimistic assumptions is 66 ± 3 km s–1 Mpc–1, although you really don’t want to know how I got this.
A more sophisticated meta-analysis has recently been done in  using a Monte Carlo method to account for quantities such as the presence of clusters around the main lens galaxy and the variation in profile slopes. He obtains (68 ± 6 ± 8) km s–1 Mpc–1. It is, however, still an uncomfortable fact that the individual H0 determinations have a greater spread than would be expected on the basis of the formal errors. Meanwhile, an arguably more realistic approach is to simultaneously model ten of the eighteen time-delay lenses using fully non-parametric models. This should account more or less automatically for many of the systematics associated with the possible galaxy mass models, although it does not help us with (or excuse us from further work to determine) the presence of mass sheets and their associated degeneracies. The result obtained is km s–1 Mpc–1. These ten lenses give generally higher H0 values from parametric models than the ensemble of 18 known lenses with time delays; the analysis of these ten according to the parametric prescriptions in Appendix A gives H0 = 68.5 rather than 66.2 km s–1 Mpc–1.
To conclude: after a slow start, lensing is beginning to make a useful contribution to determination of H0, although the believable error bars are probably similar to those of local or CMB methods about eight to ten years ago. The results may just be beginning to create a tension with other methods, in the sense that H0 values in the mid-sixties are preferred if lens galaxies are more or less isothermal (see  for discussion of this point). Further work is urgently needed to turn systematic errors into random ones by investigating stellar dynamics and the neighbourhoods of galaxies in lens systems, and to reduce the random errors by increasing the sample of time delay lenses. It is likely, at the current rate of progress, that 5% determinations will be achieved within the next few years, both from work on existing lenses and from new measurements of time delays. It is also worth pointing out that lensing time delays give a secure upper limit on H0, because most of the systematic effects associated with neglect of undetected masses cause overestimates of H0; from existing studies H0 > 80 km s–1 Mpc–1 is pretty much ruled out. This systematic of course makes any overall estimates of H0 in the mid-sixties from lensing very interesting.
One potentially very clean way to break mass model degeneracies is to discover a lensed type Ia supernova [103, 104]. The reason is that, as we have seen, the intrinsic brightness of SNe Ia can be determined from their lightcurve, and it can be shown that the resulting absolute magnification of the images can then be used to bypass the effective degeneracy between the Hubble constant and the radial mass slope. Oguri et al.  and also Bolton and Burles  discuss prospects for finding such objects; future surveys with the Large Synoptic Survey Telescope (LSST) and the SNAP supernova satellite are likely to uncover significant numbers of such events. With modest investment in investigation of the fields of these objects, a 5% determination of H0 should be possible relatively quickly.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 Germany License.