
4.11 Non-stationary, non-Gaussian, and non-linear data

Equations (34) and (35) provide maximum likelihood estimators only when the noise in which the signal is buried is Gaussian. There are general theorems in statistics indicating that Gaussian noise is ubiquitous. One is the central limit theorem, which states that the mean of a large number of independent variates drawn from any distribution with finite mean and variance tends to be normally distributed. The other comes from information theory and says that, among all probability distributions of a random variable with a given mean and variance, the one with maximum entropy (minimum information) is the Gaussian distribution. Nevertheless, analysis of the data from gravitational-wave detectors shows that the noise in the detector may be non-Gaussian (see, e.g., Figure 6 in [13]). The noise in the detector may also be a non-linear and non-stationary random process.

The maximum likelihood method does not require that the noise in the detector be Gaussian or stationary. However, in order to derive the optimum statistic and calculate the Fisher matrix we need to know the statistical properties of the data. The probability distribution of the data may be complicated, and the derivation of the optimum statistic, the calculation of the Fisher matrix components, and the calculation of the false alarm probabilities may be impractical. There is, however, one important result that we have already mentioned: the matched filter, which is optimal for the Gaussian case, is also the linear filter that gives the maximum signal-to-noise ratio whatever the distribution of the data. Monte Carlo simulations performed by Finn [39] for the case of a network of detectors indicate that the performance of matched filtering (i.e., the maximum likelihood method for Gaussian noise) is satisfactory for the case of non-Gaussian and stationary noise.

Allen et al. [7, 8] derived an optimal (in the Neyman–Pearson sense, for weak signals) signal processing strategy when the detector noise is non-Gaussian and exhibits tail terms. This strategy is robust, meaning that it is close to optimal for Gaussian noise but far less sensitive than conventional methods to the excess large events that form the tail of the distribution. This strategy is based on a locally optimal test [53] that amounts to comparing the first non-zero derivative $\Lambda_n$,

$$\Lambda_n = \left.\frac{d^n \Lambda(x|\epsilon)}{d\epsilon^n}\right|_{\epsilon=0} \tag{90}$$
of the likelihood ratio with respect to the amplitude of the signal with a threshold instead of the likelihood ratio itself.
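
As an illustration of Equation (90), consider the simplest case, which is assumed here for the sake of example and is not taken from [7, 8]: the data are $x_k = n_k + \epsilon h_k$, $k = 1, \ldots, N$, with a known template $h_k$ and independent, identically distributed noise samples $n_k$ of probability density $p$. In that case the first derivative is already non-zero:

```latex
\Lambda(x|\epsilon) = \prod_{k=1}^{N} \frac{p(x_k - \epsilon h_k)}{p(x_k)},
\qquad
\Lambda_1 = \left.\frac{d\Lambda(x|\epsilon)}{d\epsilon}\right|_{\epsilon=0}
          = -\sum_{k=1}^{N} h_k \, \frac{p'(x_k)}{p(x_k)} .

% Gaussian noise, p(x) \propto \exp(-x^2 / 2\sigma^2):
-\frac{p'(x)}{p(x)} = \frac{x}{\sigma^2}
\quad\Longrightarrow\quad
\Lambda_1 = \frac{1}{\sigma^2} \sum_{k=1}^{N} h_k x_k .

% Laplace (heavy-tailed) noise, p(x) \propto \exp(-|x| / b):
-\frac{p'(x)}{p(x)} = \frac{\operatorname{sgn}(x)}{b}
\quad\Longrightarrow\quad
\Lambda_1 = \frac{1}{b} \sum_{k=1}^{N} h_k \operatorname{sgn}(x_k) .
```

Under these assumptions the Gaussian case reduces to the usual matched-filter correlation, whereas for the heavy-tailed density each sample contributes a bounded amount, which is one way to see why such a statistic is less sensitive to large outlying events.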

In the remaining part of this section we review some statistical tests and methods to detect non-Gaussianity, non-stationarity, and non-linearity in the data. A classical test for a sequence of data to be Gaussian is the Kolmogorov–Smirnov test [27]. It calculates the maximum distance between the cumulative distribution of the data and that of a normal distribution, and assesses the significance of the distance. A similar test is the Lilliefors test [27], which adjusts for the fact that the parameters of the normal distribution are estimated from the data rather than specified in advance. Another test is the Jarque–Bera test [51], which determines whether the sample skewness and kurtosis are unusually different from their Gaussian values.
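
The Jarque–Bera test is simple enough to sketch with the Python standard library alone. The function name `jarque_bera` and the toy data below are illustrative assumptions, not part of the original text; the sketch uses the standard form of the statistic, $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$, with $S$ the sample skewness and $K$ the sample kurtosis, and the asymptotic $\chi^2$ distribution with two degrees of freedom for the p-value.

```python
import math
import random


def jarque_bera(samples):
    """Return (JB statistic, asymptotic p-value) for a sequence of floats."""
    n = len(samples)
    mean = sum(samples) / n
    # Central moments with the biased (1/n) normalisation.
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a Gaussian distribution
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Asymptotically JB follows a chi-squared distribution with 2 degrees
    # of freedom, whose survival function is exp(-jb / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


# Toy data (an assumption for illustration): pure Gaussian noise, and
# Gaussian noise contaminated by 5% of samples with five times the spread.
rng = random.Random(0)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(5000)]
heavy_tailed = [
    rng.gauss(0.0, 5.0) if rng.random() < 0.05 else rng.gauss(0.0, 1.0)
    for _ in range(5000)
]

jb_gauss, p_gauss = jarque_bera(gaussian)
jb_heavy, p_heavy = jarque_bera(heavy_tailed)
print("Gaussian:     JB = %.2f, p = %.3g" % (jb_gauss, p_gauss))
print("Heavy-tailed: JB = %.2f, p = %.3g" % (jb_heavy, p_heavy))
```

The asymptotic p-value is only approximate for short data sets, so for small samples a tabulated or simulated distribution of the statistic would be preferable.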

Let $x_k$ and $u_l$ be two discrete-time random processes ($-\infty < k, l < \infty$) and let $u_l$ be independent and identically distributed (i.i.d.). We call the process $x_k$ linear if it can be represented by

$$x_k = \sum_{l=0}^{N} a_l u_{k-l}, \tag{91}$$
where $a_l$ are constant coefficients. If $u_l$ is Gaussian (non-Gaussian), we say that $x_k$ is linear Gaussian (non-Gaussian). In order to test for linearity and Gaussianity we examine the third-order cumulants of the data. The third-order cumulant $C_{kl}$ of a zero-mean stationary process is defined by
$$C_{kl} := E\left[x_m x_{m+k} x_{m+l}\right]. \tag{92}$$
The bispectrum $S_2(f_1, f_2)$ is the two-dimensional Fourier transform of $C_{kl}$. The bicoherence is defined as
$$B(f_1, f_2) := \frac{S_2(f_1, f_2)}{S(f_1 + f_2)\, S(f_1)\, S(f_2)}, \tag{93}$$
where $S(f)$ is the spectral density of the process $x_k$. If the process is Gaussian then its bispectrum, and consequently its bicoherence, is zero. One can easily show that if the process is linear then its bicoherence is constant. Thus if the bispectrum is not zero, then the process is non-Gaussian; if the bicoherence is not constant then the process is also non-linear. Consequently we have the following hypothesis testing problems:
  1. $H_1$: The bispectrum of $x_k$ is nonzero.
  2. $H_0$: The bispectrum of $x_k$ is zero.

If Hypothesis 1 holds, we can test for linearity, that is, we have a second hypothesis testing problem:

  3. $H'_1$: The bicoherence of $x_k$ is not constant.
  4. $H''_1$: The bicoherence of $x_k$ is constant.

If Hypothesis 4 holds, the process is linear.

Using the above tests we can detect non-Gaussianity and, if the process is non-Gaussian, non-linearity of the process. The distribution of the test statistic $B(f_1, f_2)$, Equation (93), can be calculated in terms of $\chi^2$ distributions. For more details see [45].
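
As a first step towards such tests, the third-order cumulant of Equation (92) can be estimated directly from a finite data stretch by replacing the expectation with a time average. The following minimal sketch does only that; it does not estimate the bispectrum or the bicoherence, nor their $\chi^2$ significance. The function name `third_order_cumulant`, the restriction to non-negative lags, and the toy data are assumptions made for illustration.

```python
import random


def third_order_cumulant(x, k, l):
    """Estimate C_kl = E[x_m x_{m+k} x_{m+l}] for lags k, l >= 0.

    The sample mean is removed first, since the definition assumes a
    zero-mean process; the expectation is replaced by an average over m.
    """
    n = len(x)
    mean = sum(x) / n
    y = [v - mean for v in x]
    span = n - max(k, l)  # number of index values m for which all terms exist
    return sum(y[m] * y[m + k] * y[m + l] for m in range(span)) / span


rng = random.Random(0)
n_samples = 20000
# Gaussian white noise: all third-order cumulants vanish in expectation.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
# Skewed white noise (exponential with unit rate): the zero-lag cumulant is
# its third central moment, which equals 2.
skewed = [rng.expovariate(1.0) for _ in range(n_samples)]

c_gauss = third_order_cumulant(gaussian, 0, 0)
c_skew = third_order_cumulant(skewed, 0, 0)
print("C_00, Gaussian noise:", round(c_gauss, 3))
print("C_00, skewed noise:  ", round(c_skew, 3))
```

An estimate of $C_{00}$ that is large compared with its statistical scatter points to a non-zero bispectrum and hence to non-Gaussian data; a quantitative decision requires the distribution of the test statistic discussed in [45].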

It is not difficult to examine non-stationarity of the data. One can divide the data into short segments and for each segment calculate the mean and standard deviation and estimate the spectrum. One can then investigate the variation of these quantities from one segment of the data to another. This simple analysis can be useful in identifying and eliminating bad data. Another quantity to examine is the autocorrelation function of the data. For a stationary process the autocorrelation function should decay to zero. A test to detect certain non-stationarities, used in the analysis of econometric time series, is the Dickey–Fuller test [25]. It models the data by an autoregressive process and tests whether the values of the parameters of the process deviate from those allowed by a stationary model. A robust test for detecting non-stationarity in data from gravitational-wave detectors has been developed by Mohanty [69]. The test involves applying Student’s t-test to Fourier coefficients of segments of the data.
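
The segment-by-segment check described above can be sketched as follows. Only the mean and standard deviation are computed here; the spectrum estimate is omitted for brevity. The function name `segment_statistics`, the segment length, and the toy data are assumptions made for illustration.

```python
import random
import statistics


def segment_statistics(data, segment_length):
    """Return a list of (mean, standard deviation) pairs, one per full segment.

    A trailing stretch shorter than segment_length is ignored.
    """
    results = []
    for start in range(0, len(data) - segment_length + 1, segment_length):
        segment = data[start:start + segment_length]
        results.append((statistics.mean(segment), statistics.stdev(segment)))
    return results


# Toy data (an assumption): white Gaussian noise whose standard deviation
# triples half-way through the record.
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(4000)]
data += [rng.gauss(0.0, 3.0) for _ in range(4000)]

stats = segment_statistics(data, 1000)
for index, (mean, std) in enumerate(stats):
    print("segment %d: mean = %+.3f, std = %.3f" % (index, mean, std))
```

In this toy example the jump in the per-segment standard deviation between the fourth and fifth segments marks the change in the noise level; in practice the segment length has to be chosen as a compromise between time resolution and the statistical scatter of the estimates.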

