### 4.11 Non-stationary, non-Gaussian, and non-linear data

Equations (34) and (35) provide maximum likelihood estimators only when the noise in which the signal is buried is Gaussian. Two general results in statistics suggest that Gaussian noise is ubiquitous. One is the central limit theorem, which states that the mean of a large set of independent variates drawn from any distribution with finite mean and variance tends to be normally distributed. The other comes from information theory and says that, among all probability distributions of a random variable with a given mean and variance, the one with maximum entropy (minimum information) is the Gaussian distribution. Nevertheless, analysis of the data from gravitational-wave detectors shows that the noise in the detector may be non-Gaussian (see, e.g., Figure 6 in [13]). The noise in the detector may also be a non-linear and non-stationary random process.

The maximum likelihood method does not require the noise in the detector to be Gaussian or stationary. However, in order to derive the optimum statistic and calculate the Fisher matrix we need to know the statistical properties of the data. The probability distribution of the data may be complicated, and the derivation of the optimum statistic and the calculation of the Fisher matrix components and the false alarm probabilities may be impractical. There is, however, one important result that we have already mentioned: the matched filter, which is optimal in the Gaussian case, is also the linear filter that gives the maximum signal-to-noise ratio whatever the distribution of the data. Monte Carlo simulations performed by Finn [39] for the case of a network of detectors indicate that the performance of matched filtering (i.e., the maximum likelihood method for Gaussian noise) is satisfactory for the case of non-Gaussian and stationary noise.
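The distribution-independence of the matched filter can be seen in a short discrete-time sketch. The notation is introduced here for illustration only and is not taken from the preceding sections: $\mathbf{x} = \mathbf{s} + \mathbf{n}$ is the data vector, $\mathbf{s}$ the known signal, $\mathbf{n}$ zero-mean noise with covariance matrix $C = E[\mathbf{n}\mathbf{n}^T]$ (assumed invertible), and $\mathbf{q}$ the weights of a linear filter with output $y = \mathbf{q}^T \mathbf{x}$.

$$
\rho^2(\mathbf{q}) := \frac{\left(E[y \,|\, \text{signal}] - E[y \,|\, \text{no signal}]\right)^2}{\mathrm{Var}(y)}
= \frac{(\mathbf{q}^T \mathbf{s})^2}{\mathbf{q}^T C\, \mathbf{q}}
\leq \mathbf{s}^T C^{-1} \mathbf{s}.
$$

The inequality is the Cauchy–Schwarz inequality for the inner product $\langle \mathbf{a}, \mathbf{b} \rangle = \mathbf{a}^T C\, \mathbf{b}$ applied to $\mathbf{q}$ and $C^{-1}\mathbf{s}$, and equality holds for $\mathbf{q} \propto C^{-1}\mathbf{s}$. Only the mean and the covariance of the noise enter, so the maximizing filter is the same for any noise distribution with that covariance.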

Allen et al. [78] derived an optimal (in the Neyman–Pearson sense, for weak signals) signal processing strategy when the detector noise is non-Gaussian and exhibits tail terms. This strategy is robust, meaning that it is close to optimal for Gaussian noise but far less sensitive than conventional methods to the excess large events that form the tail of the distribution. This strategy is based on a locally optimal test ([53]) that amounts to comparing the first non-zero derivative $\Lambda_n$ ($n \geq 1$),

$$\Lambda_n = \left.\frac{d^n \Lambda(x|\epsilon)}{d\epsilon^n}\right|_{\epsilon=0},$$

of the likelihood ratio with respect to the amplitude of the signal with a threshold instead of the likelihood ratio itself.
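As an illustration of a locally optimal statistic, consider the textbook case of a known signal shape in independent noise samples with a common density $p$. The first derivative of the likelihood ratio at zero amplitude is then $\sum_k s_k\, g(x_k)$ with $g(x) = -p'(x)/p(x)$. The sketch below is not the strategy of [78] itself: the Student-t noise model, the function names, and the parameter values are assumptions made here for illustration.

```python
import math

def gaussian_score(x, sigma=1.0):
    """g(x) = -p'(x)/p(x) for zero-mean Gaussian noise: linear in x."""
    return x / sigma ** 2

def student_t_score(x, nu=4.0, sigma=1.0):
    """g(x) for Student-t noise (assumed heavy-tailed model): bounded in x."""
    return (nu + 1.0) * x / (nu * sigma ** 2 + x ** 2)

def locally_optimal_statistic(data, template, score):
    """First derivative of the likelihood ratio at zero signal amplitude."""
    return sum(s * score(x) for x, s in zip(data, template))

n = 64
template = [math.sin(2.0 * math.pi * k / n) for k in range(n)]
clean = [0.1 * s for s in template]   # weak signal; noise omitted for clarity
glitchy = list(clean)
glitchy[16] += 50.0                   # a single large outlier ("tail" event)

# Change in each statistic caused by the outlier alone
lin_shift = (locally_optimal_statistic(glitchy, template, gaussian_score)
             - locally_optimal_statistic(clean, template, gaussian_score))
lo_shift = (locally_optimal_statistic(glitchy, template, student_t_score)
            - locally_optimal_statistic(clean, template, student_t_score))
print("shift of linear (Gaussian) statistic:", lin_shift)
print("shift of Student-t statistic:        ", lo_shift)
```

With the Gaussian score the statistic reduces to the linear correlation of the data with the template, so the outlier enters at full strength. The Student-t score falls off as $1/x$ for large $|x|$, which is the sense in which such a statistic is less sensitive to the tail of the distribution.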

In the remaining part of this section we review some statistical tests and methods to detect non-Gaussianity, non-stationarity, and non-linearity in the data. A classical test of whether a sequence of data is Gaussian is the Kolmogorov–Smirnov test [27]. It calculates the maximum distance between the empirical cumulative distribution of the data and that of a normal distribution, and assesses the significance of the distance. A similar test is the Lilliefors test [27], which adjusts for the fact that the parameters of the normal distribution are estimated from the data rather than specified in advance. Another test is the Jarque–Bera test [51], which determines whether sample skewness and kurtosis are unusually different from their Gaussian values.
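The following minimal sketch shows how two of these tests can be computed. It uses only the Python standard library and asymptotic p-values; the function names, the sample size, and the Gaussian-mixture "tail" model are assumptions made here for illustration, not part of the cited references.

```python
import math
import random

def jarque_bera(x):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(x)
    mean = sum(x) / n
    # Biased central moments of order 2, 3 and 4
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Under the Gaussian hypothesis JB is asymptotically chi-squared with
    # 2 degrees of freedom, whose survival function is exp(-jb / 2).
    return jb, math.exp(-jb / 2.0)

def ks_normal(x, mu=0.0, sigma=1.0):
    """Kolmogorov-Smirnov distance to N(mu, sigma^2) and asymptotic p-value.

    mu and sigma must be specified in advance, not estimated from x.
    """
    n = len(x)
    d = 0.0
    for i, v in enumerate(sorted(x)):
        cdf = 0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0))))
        # The empirical distribution jumps from i/n to (i+1)/n at v
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    t = math.sqrt(n) * d
    # Series for the limiting Kolmogorov distribution, clamped to [0, 1]
    p = 2.0 * sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * t * t)
                  for j in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

rng = random.Random(0)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# Assumed tail model: 10% of the samples come from a 5 times wider Gaussian
mixed = [rng.gauss(0.0, 5.0 if rng.random() < 0.1 else 1.0)
         for _ in range(2000)]

jb_gauss, p_gauss = jarque_bera(gaussian)
jb_mixed, p_mixed = jarque_bera(mixed)
print("Jarque-Bera, Gaussian sample:", jb_gauss, p_gauss)
print("Jarque-Bera, mixture sample: ", jb_mixed, p_mixed)
print("Kolmogorov-Smirnov, Gaussian sample:", ks_normal(gaussian))
```

If the mean and standard deviation passed to `ks_normal` were estimated from the same data, the p-value above would no longer be valid; that is the situation the Lilliefors test is designed for.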

Let $x_k$ and $u_l$ be two discrete-in-time random processes ($-\infty < k, l < \infty$) and let $u_l$ be independent and identically distributed (i.i.d.). We call the process $x_k$ linear if it can be represented by

$$x_k = \sum_{l=0}^{N} a_l u_{k-l},$$

where $a_l$ are constant coefficients. If $u_l$ is Gaussian (non-Gaussian), we say that $x_k$ is linear Gaussian (non-Gaussian). In order to test for linearity and Gaussianity we examine the third-order cumulants of the data. The third-order cumulant $C_{kl}$ of a zero mean stationary process is defined by

$$C_{kl} := E\left[x_m x_{m+k} x_{m+l}\right].$$

The bispectrum $S_2(f_1, f_2)$ is the two-dimensional Fourier transform of $C_{kl}$. The bicoherence is defined as

$$B(f_1, f_2) := \frac{|S_2(f_1, f_2)|^2}{S(f_1 + f_2)\, S(f_1)\, S(f_2)}, \tag{93}$$

where $S(f)$ is the spectral density of the process $x_k$. If the process is Gaussian then its bispectrum and consequently its bicoherence is zero. One can easily show that if the process is linear then its bicoherence is constant. Thus if the bispectrum is not zero, then the process is non-Gaussian; if the bicoherence is not constant then the process is also non-linear. Consequently we have the following hypothesis testing problems:

1. $H_1$: The bispectrum of $x_k$ is nonzero.
2. $H_0$: The bispectrum of $x_k$ is zero.

If Hypothesis 1 holds, we can test for linearity, that is, we have a second hypothesis testing problem:

3. $H'_1$: The bicoherence of $x_k$ is not constant.
4. $H''_1$: The bicoherence of $x_k$ is a constant.

If Hypothesis 4 holds, the process is linear.

Using the above tests we can detect non-Gaussianity and, if the process is non-Gaussian, non-linearity of the process. The distribution of the test statistic $B(f_1, f_2)$, Equation (93), can be calculated in terms of $\chi^2$ distributions. For more details see [45].
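A rough sketch of how the bispectrum and bicoherence might be estimated from data is given below. It is not the full test of [45]: it averages Fourier triple products over short segments and normalizes the squared modulus of the bispectrum estimate by the product of the three spectral densities, which is one common convention. The segment length, the frequency grid, the function names, and the two test signals are assumptions made here; a plain $O(n^2)$ Fourier transform is used so that the example needs only the standard library.

```python
import cmath
import random

def dft(seg):
    """Plain O(n^2) discrete Fourier transform of a real sequence."""
    n = len(seg)
    return [sum(v * cmath.exp(-2j * cmath.pi * f * t / n)
                for t, v in enumerate(seg))
            for f in range(n)]

def mean_squared_bicoherence(data, seg_len=16):
    """Average of |S2|^2 / (S(f1) S(f2) S(f1+f2)) over frequency pairs."""
    mean = sum(data) / len(data)
    n_seg = len(data) // seg_len
    spectra = [dft([v - mean for v in data[i * seg_len:(i + 1) * seg_len]])
               for i in range(n_seg)]
    half = seg_len // 2
    # Segment-averaged estimate of the spectral density S(f)
    power = [sum(abs(X[f]) ** 2 for X in spectra) / (n_seg * seg_len)
             for f in range(half)]
    values = []
    for f1 in range(1, half):
        for f2 in range(f1, half - f1):   # keeps f1 + f2 below half
            # Segment-averaged estimate of the bispectrum S2(f1, f2)
            s2 = sum(X[f1] * X[f2] * X[f1 + f2].conjugate() for X in spectra)
            s2 /= n_seg * seg_len
            values.append(abs(s2) ** 2
                          / (power[f1] * power[f2] * power[f1 + f2]))
    return sum(values) / len(values)

rng = random.Random(1)
n = 4096
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]
# i.i.d. exponential noise shifted to zero mean: linear but non-Gaussian
skewed = [rng.expovariate(1.0) - 1.0 for _ in range(n)]

b_gauss = mean_squared_bicoherence(gaussian)
b_skewed = mean_squared_bicoherence(skewed)
print("Gaussian noise:", b_gauss)
print("skewed noise:  ", b_skewed)
```

For an i.i.d. process with this normalization, the quantity equals the squared skewness of the samples at every frequency pair, which is 4 for exponential noise and 0 for Gaussian noise. The estimate for Gaussian data is small but not exactly zero, because averaging over a finite number of segments leaves a positive bias of order (segment length)/(number of segments).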

It is not difficult to examine non-stationarity of the data. One can divide the data into short segments and for each segment calculate the mean and the standard deviation and estimate the spectrum. One can then investigate the variation of these quantities from one segment of the data to the next. This simple analysis can be useful in identifying and eliminating bad data. Another quantity to examine is the autocorrelation function of the data. For a stationary process the autocorrelation function should decay to zero. A test to detect certain non-stationarities, used in the analysis of econometric time series, is the Dickey–Fuller test [25]. It models the data by an autoregressive process and tests whether the values of the parameters of the process deviate from those allowed by a stationary model. A robust test for detecting non-stationarity in data from gravitational-wave detectors has been developed by Mohanty [69]. The test involves applying Student’s t-test to Fourier coefficients of segments of the data.
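The segment-by-segment check described at the start of this paragraph can be sketched as follows. The segment length, the rule of flagging segments whose standard deviation exceeds 1.5 times the median, and the function names are arbitrary choices made here for illustration. The sketch is not the Dickey–Fuller test or the test of [69].

```python
import random
import statistics

def segment_statistics(data, seg_len):
    """Return (mean, standard deviation) for consecutive full segments."""
    stats = []
    for start in range(0, len(data) - seg_len + 1, seg_len):
        seg = data[start:start + seg_len]
        stats.append((statistics.fmean(seg), statistics.stdev(seg)))
    return stats

def flag_segments(stats, factor=1.5):
    """Indices of segments whose standard deviation exceeds factor * median."""
    median_std = statistics.median([s for _, s in stats])
    return [i for i, (_, s) in enumerate(stats) if s > factor * median_std]

# Simulated data: 8 segments of white Gaussian noise, one of them
# (index 5) with three times the standard deviation of the others.
rng = random.Random(0)
seg_len = 256
data = []
for seg in range(8):
    sigma = 3.0 if seg == 5 else 1.0
    data.extend(rng.gauss(0.0, sigma) for _ in range(seg_len))

stats = segment_statistics(data, seg_len)
flagged = flag_segments(stats)
for i, (m, s) in enumerate(stats):
    print(f"segment {i}: mean = {m:+.3f}, std = {s:.3f}")
print("flagged segments:", flagged)
```

The median is used as the reference level because, unlike the mean, it is barely affected by the bad segments themselves. The same loop could also collect a spectrum estimate for each segment.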