5.1 Matched filtering and optimal signal-to-noise ratio

Matched filtering is a data analysis technique that efficiently searches for a signal of known shape buried in noisy data [188Jump To The Next Citation Point]. The technique consists in correlating the output of a detector with a waveform, variously known as a template or a filter. Given a signal h (t) buried in noise n (t), the task is to find an ‘optimal’ template q(t) that would produce, on the average, the best possible SNR. In this review, we shall treat the problem of matched filtering as an operational exercise. However, this intuitive picture has a solid basis in the theory of hypothesis testing. The interested reader may consult any standard text book on signal analysis, for example Helstrom [188Jump To The Next Citation Point], for details.

Let us first fix our notation. We shall use x(t) to denote the detector output, which is assumed to consist of a background noise n (t) and a useful gravitational wave signal h(t). The Fourier transform of a quantity x(t) will be denoted &tidle;x(f) and is defined as

∫ ∞ &tidle;x (f ) = x (t)e2πiftdt. (73 ) −∞

5.1.1 Optimal filter

The detector output x(t) is just a realization of noise n (t), i.e., x(t) = n(t), when no signal is present. In the presence of a signal h (t) with an arrival time t a, x(t) takes the form,

x(t) = h(t − t ) + n(t). (74 ) a
The correlation c of a template q(t) with the detector output is defined as
∫ ∞ c(τ) ≡ x (t)q(t + τ) dt. (75 ) −∞
In the above equation, τ is called the lag; it denotes the duration by which the filter function lags behind the detector output. The purpose of the above correlation integral is to concentrate all the signal energy at one place. The following analysis reveals how this is achieved; we shall work out the optimal filter q (t) that maximizes the correlation c(τ) when a signal h(t) is present in the detector output. To do this let us first write the correlation integral in the Fourier domain by substituting for x(t) and q(t), in the above integral, their Fourier transforms &tidle;x(f) and &tidle;q(f), i.e., ∫ x(t) ≡ ∞ x&tidle;(f)exp (− 2πift)df −∞ and ∫∞ q(t) ≡ −∞ q&tidle;(t)exp (− 2πift)df, respectively. After some straightforward algebra, one obtains
∫ ∞ ∗ −2πifτ c(τ ) = &tidle;x(f )q&tidle;(f )e df, (76 ) −∞
where &tidle;q∗(f) denotes the complex conjugate of &tidle;q(f).

Since n is a random process, c is also a random process. Moreover, correlation is a linear operation and hence c obeys the same probability distribution function as n. In particular, if n is described by a Gaussian random process with zero mean, then c is also described by a Gaussian distribution function, although its mean and variance will, in general, differ from those of n. The mean value of c, denoted by S ≡ c, is, clearly, the correlation of the template q with the signal h, since the mean value of n is zero:

∫ ∞ S ≡ c(τ) = &tidle;h(f)&tidle;q∗(f)e−2πif(τ−ta)df. (77 ) −∞
The variance of c, denoted -------- N 2 ≡ (c − c)2, turns out to be,
∫ 2 -------- ∞ 2 N = (c − c)2 = Sh(f) |&tidle;q(f)| df. (78 ) − ∞
Now the SNR ρ is defined by ρ2 ≡ S2∕N 2.

The form of integrals in Equations (77View Equation) and (78View Equation) leads naturally to the definition of the scalar product of waveforms. Given two functions, a (t) and b(t), we define their scalar product ⟨a,b⟩ to be [159Jump To The Next Citation Point161Jump To The Next Citation Point115Jump To The Next Citation Point129Jump To The Next Citation Point]

∫ ∞ -df---[ &tidle;∗ ∗ &tidle; ] ⟨a,b⟩ ≡ 2 S (f) &tidle;a(f)b (f) + &tidle;a (f)b(f) . (79 ) 0 h
Note that Sh(f) ≥ 0 [cf. Equation (67View Equation)], consequently, the scalar product is real and positive definite.

Noting that the Fourier transform of a real function h(t) obeys &tidle;h(− f) = &tidle;h∗(f ), we can write down the SNR in terms of the above scalar product:

⟨he2πif(τ−ta),S q⟩ ρ2 = --∘-----------h--. (80 ) ⟨Shq, Shq⟩
From this it is clear that the template q that obtains the maximum value of ρ is simply
&tidle; i2πf(τ−ta) &tidle;q(f ) = γ h(f)e--------, (81 ) Sh(f )
where γ is an arbitrary constant. From the above expression for an optimal filter we note two important things. First, the SNR is maximized when the lag parameter τ is equal to the time of arrival of the signal ta. Second, the optimal filter is not just a copy of the signal, but rather it is weighted down by the noise PSD. We will see below why this should be so.

5.1.2 Optimal signal-to-noise ratio

We can now work out the optimal SNR by substituting Equation (81View Equation) for the optimal template in Equation (80View Equation),

⌊ | | ⌋ ∫ |&tidle; |2 1∕2 1∕2 | ∞ |h(f)|-| ρopt = ⟨h,h⟩ = 2⌈ df Sh (f) ⌉ . (82 ) 0

We note that the optimal SNR is not just the total energy of the signal (which would be ∫ 2 ∞ df |&tidle;h(f)|2 0), but rather the integrated signal power weighted down by the noise PSD. This is in accordance with what we would guess intuitively: the contribution to the SNR from a frequency bin where the noise PSD is high is smaller than from a bin where the noise PSD is low. Thus, an optimal filter automatically takes into account the nature of the noise PSD.

The expression for the optimal SNR Equation (82View Equation) suggests how one may compare signal strengths with the noise performance of a detector. Note that one cannot directly compare &tidle;h(f) with Sh (f ), as they have different physical dimensions. In gravitational wave literature one writes the optimal SNR in one of the following equivalent ways

⌊ |√-- |2⌋ 1∕2 ⌊ | |2⌋ 1∕2 ∫ ∞ || f &tidle;h(f)|| ∫ ∞ ||f&tidle;h(f)|| ρopt = 2 |⌈ df-----------|⌉ = 2 |⌈ df---------|⌉ , (83 ) 0 f Sh(f ) 0 f f Sh(f)
which facilitates the comparison of signal strengths with noise performance. One can compare the dimensionless quantities, &tidle; f |h(f )| and ∘ ------- fSh (f ), or dimensionful quantities, √ --&tidle; f|h(f )| and ∘ ------ Sh(f).

Signals of interest to us are characterized by several (a priori unknown) parameters, such as the masses of component stars in a binary, their intrinsic spins, etc., and an optimal filter must agree with both the signal shape and its parameters. A filter whose parameters are slightly mismatched with that of a signal can greatly degrade the SNR. For example, even a mismatch of one cycle in 104 cycles can degrade the SNR by a factor two.

When the parameters of a filter and its shape are precisely matched with that of a signal, what is the improvement brought about, as opposed to the case when no knowledge of the signal is available? Matched filtering helps in enhancing the SNR in proportion to the square root of the number of signal cycles in the detector band, as opposed to the case in which the signal shape is unknown and all that can be done is to Fourier transform the detector output and compare the signal energy in a frequency bin to noise energy in that bin. We shall see below that, in initial interferometers, matched filtering leads to an enhancement of order 30 – 100 for compact binary inspiral signals.

5.1.3 Practical applications of matched filtering

Matched filtering is currently being applied to mainly two sources: detection of (1) chirping signals from compact binaries consisting of black holes and/or neutron stars and (2) continuous waves from rapidly-spinning neutron stars.

5.1.3.1 Coalescing binaries.
In the case of chirping binaries, post-Newtonian theory (a perturbative approximation to Einstein’s equations in which the relevant quantities are expanded as a power-series in 1∕c, where c is the speed of light) has been used to model the dynamics of these systems to a very high order in v∕c, where v is the relative speeds of the objects in the binary (see also Section 6.5, in which binaries are discussed in more detail). This is an approximation that can be effectively used to match filter the signal from binaries whose component bodies are of equal, or nearly equal, masses and the system is still “far” from coalescence. In reality, one takes the waveform to be valid until the last stable circular orbit (LSCO). In the case of binaries consisting of two neutron stars, or a neutron star and a black hole, tidal effects might affect the evolution significantly before reaching the LSCO. However, this is likely to occur at frequencies well-above the sensitivity band of the current ground-based detectors, so that for all practical purposes post-Newtonian waveforms are a good approximation to low-mass (M < 10M ⊙) binaries. As elucidated in Section 6.5.2, progress in analytical and numerical relativity has made it possible to have a set of waveforms for the merger phase of compact binaries too. The computational cost in matched filtering the merger phase, however, will not be high, as there will only be on the order of a few 100 cycles in this phase. But it is important to have the correct waveforms to enhance signal visibility and, more importantly, to enable strong-field tests of general relativity.

In the general case of black-hole–binary inspiral the search space is characterized by 17 different parameters. These are the two masses of the bodies, their spins, eccentricity of the orbit, its orientation at some fiducial time, the position of the binary in the sky and its distance from the Earth, the epoch of coalescence and phase of the signal at that epoch, and the polarization angle. However, not all these parameters are important in a matched filter search. Only those parameters that change the shape of the signal, such as the masses, orbital eccentricity and spins, or cause a modulation in the signal due to the motion of the detector relative to the source, such as the direction to the source, are to be searched for and others, such as the epoch of coalescence and the phase at that epoch, are simply reference points in the signal that can be determined without any significant burden on computational power.

For binaries consisting of nonspinning objects that are either observed for a short enough period that the detector motion can be neglected, or last for only a short time in the sensitive part of a detector’s sensitivity band, there are essentially two search parameters – the component masses of the binary. It turns out that the signal manifold in this case is nearly flat, but the masses are curvilinear coordinates and are not good parameters for choosing templates. Chirp times, which are nonlinear functions of the masses, are very close to being Cartesian coordinates and template spacing is more or less uniform in terms of these parameters. Chirp times are post-Newtonian contributions at different orders to the duration of a signal starting from a time when the instantaneous gravitational-wave frequency has a fiducial value fL to a time when the gravitational wave frequency formally diverges and system coalesces. For instance, the chirp times τ 0 and τ 3 at Newtonian and 1.5 PN orders, respectively, are

5 1 τ0 = ---------(πM fL )−5∕3 , τ3 = -----(πM fL)−2∕3, (84 ) 256π νfL 8νfL
where M is the total mass and ν = m1m2 ∕M 2 is the symmetric mass ratio. The above relations can be inverted to obtain M and ν in terms of the chirp times:
( )2 ∕3 M = ---5--- τ3-, ν = ---1--- 32-πτ0 . (85 ) 32 π2fL τ0 8πfL τ3 5τ3

There is a significant amount of literature on the computational requirements to search for compact binaries [324Jump To The Next Citation Point146278Jump To The Next Citation Point280Jump To The Next Citation Point]. The estimates for initial detectors are not alarming and it is possible to search for these systems online. Searches for these systems by the LSC (see, for example, [8]) employs a hexagonal lattice of templates [119] in the two-dimensional space of chirp times. For the best LIGO detectors we need several thousand templates to search for component masses in the range [mlow, mhigh] = [1,100 ]M ⊙ [280Jump To The Next Citation Point]. Decreasing the lower-end of the mass range leads to an increase in the number of templates that goes roughly as m − 8∕3 low and most current searches [2Jump To The Next Citation Point6Jump To The Next Citation Point] only begin at mlow = 1M ⊙, with the exception of one that looked for black hole binaries of primordial origin [7], in which the lower end of the search was 0.2M ⊙.

Inclusion of spins is only important when one or both of the components is rapidly spinning [4197Jump To The Next Citation Point]. Spins effects are unimportant for neutron star binaries, for which the dimensionless spin parameter q, that is the ratio of its spin magnitude to the square of its mass, is tiny: 2 q = JNS ∕M NS ≪ 1. For ground-based detectors, even after including spins, the computational costs, while high, are not formidable and it should be possible to carry out the search on large computational clusters in real time [97]. Recently, the LSC has successfully carried out such a search [15].

5.1.3.2 Searching for Continuous Wave Signals.
In the case of continuous waves (CWs), the signal shape is pretty trivial: a sinusoidal oscillation with small corrections to take account of the slow spin-down of the neutron star/pulsar to account for the loss of angular momentum to gravitational waves and other radiation/particles. However, what leads to an enormous computational cost here is the Doppler modulation of the signal caused by Earth’s rotation, the motion of the Earth around the solar system barycenter and the moon. The number of independent patches that we have to observe so as not to lose appreciable amounts of SNR can be worked out in the following manner. The baseline of a gravitational wave detector for CW sources is essentially L = 2 × 1AU ≃ 3 × 1011 m. For a source that emits gravitational waves at 100 Hz, the wavelength of the radiation is λ = 3 × 106 m, and the angular resolution Δ 𝜃 of the antenna at an SNR of 1 is − 5 Δ 𝜃 ≃ λ ∕L = 10, or a solid angle of 2 −10 Δ Ω ≃ (Δ 𝜃) = 10. In other words, the number of patches one should search for is Npatches ∼ 4π∕ ΔΩ ≃ 1011. Moreover, for an observation that lasts for about a year (T ≃ 3 × 107 s) the frequency resolution is Δf = 1∕T ≃ 3 × 10−8. Searching over a frequency band of 300 Hz, around the best sensitivity of the detector, gives the number of frequency bins to be about 1010. Thus, it is necessary to search over roughly 1011 patches in the sky for each of the 1010 frequency bins. This is a formidable task and one can only perform a matched filter search over a short period (days/weeks) of the data or over a restricted region in the sky, or just perform targeted searches for known objects such as pulsars, the galactic center, etc. [92]. The severe computational burden faced in the case of CW searches has led to the development of specialized searches that look for signals from known pulsars [51012] using an efficient search algorithm that makes use of the known parameters [116Jump To The Next Citation Point151] and hierarchical algorithms that add power incoherently with the minimum possible loss in signal visibility [226434211Jump To The Next Citation Point]. The most ambitious project in this regard is the Einstein@Home project [153]. The goal here is to carry out coherent searches for CW signals using wasted CPUs on idle computers at homes, offices and university departments around the world. The project has been successful in attracting a large number of subscriptions and provides the largest computational infrastructure to the LSC for the specific search of CW signals and the first scientific results from such are now being published by the LSC [13].

5.1.3.3 χ2 veto.
Towards the end of Section 4.8 we discuss a powerful way of rejecting triggers, whose root cause is not gravitational wave signals but false alarms due to instrumental and environmental artifacts. In this section we will further quantify the χ2 veto [31Jump To The Next Citation Point] by using the scalar product introduced in the context of matched filtering. The main problem with real data is that it can be glitchy in the form of high amplitude transients that might look like damped sinusoids. An inspiral signal and a template employed to detect it are both broadband signals. Therefore, the matched-filter SNR for such signals has contributions from a wide range of frequencies. However, the statistic of matched filtering, namely the SNR, is an integral over frequency and it is not sensitive to contributions from different frequency regions. Imagine dividing the frequency range of integration into a finite number of bins fk ≤ f < fk+1, k = 1,...,p, such that their union spans the entire frequency band, f1 = 0 and fp+1 = ∞, and further that the contribution to the SNR from each frequency bin is the same, namely,

∫ fk+1 |&tidle;h(f )|2 4 ∫ ∞ |&tidle;h(f )|2 4 ------df = -- ------df. (86 ) fk Sh (f ) p 0 Sh(f )
Now, define the contribution to the matched filtering statistic coming from the k-th bin by [31Jump To The Next Citation Point]
∫ fk+1 ∗ ∗ df zk ≡ ⟨q,x⟩k ≡ 2 [q&tidle;(f )&tidle;x(f) + &tidle;q(f)&tidle;x (f)]------, (87 ) fk Sh (f)
where, as before, &tidle;x(f) and &tidle;q(f ) are the Fourier transforms of the detector output and the template, respectively. Note that the sum z = ∑ z k k gives the full matched filtering statistic [31]:
∫ ∞ ∗ ∗ -df--- z = ⟨q,x⟩ ≡ 2 [&tidle;q (f)&tidle;x(f) + &tidle;q(f)&tidle;x (f )] S (f). (88 ) 0 h

Having chosen the bins and quantities zk as above, one can construct a statistic based on the measured SNR in each bin as compared to the expected value, namely

∑p ( z )2 χ2 = p zk − -- . (89 ) k=1 p
When the background noise is stationary and Gaussian, the quantity 2 χ obeys the well-known chi-square distribution with p − 1 degrees of freedom. Therefore, the statistical properties of the χ2 statistic are known. Imagine two triggers with identical SNRs, but one caused by a true signal and the other caused by a glitch that has power only in a small frequency range. It is easy to see that the two triggers will have very different 2 χ values; in the first case the statistic will be far smaller than in the second case. This statistic has served as a very powerful veto in the search for signals from coalescing compact binaries and it has been instrumental in cleaning up the data (see, e.g., [26]).
  Go to previous page Go up Go to next page