On the Probability Density Function of the Test Statistic for One Nonlinear GLR Detector Arising from fMRI

Recently an important and interesting nonlinear generalized likelihood ratio (GLR) detector emerged in functional magnetic resonance imaging (fMRI) data processing. However, the study of that detector is incomplete: the probability density function (pdf) of the test statistic was draw from numerical simulations without much theoretical support and is therefore, not firmly grounded. This correspondence presents more accurate (asymptotic) closed form of the pdf by resorting to a non-central Wishart matrix and by asymptotic expansion of some integrals. It is then confirmed theoretically that the detector does possess constant false alarm rate (CFAR) property under some practical regimes of signal to noise ratio (SNR) for finite samples and the correct threshold selection method is given, which is very important for real fMRI data processing.


INTRODUCTION
In our research on voxelwise detection of fMRI, we encountered the following model and problem [1] . Let x denote an N  . The second component br is a real oscillating signal. The vector r is a reference function that models the expected response characteristic of activation pattern. We assume r to be known throughout the study. The third component of additive complex Gaussian white noise c n σ models errors primarily due to thermal noises in the patient. The term n c denotes a standard complex Gaussian vector (with mean zero and covariance matrix N N I × ). In general, a, b, ϑ and 2 σ are unknown scalars and are different for each voxel timeseries. The detection is with regard to b: under H 0 , b=0 indicates there is no brain activity in the corresponding voxel; under H 1 , b 0 ≠ indicates the presence of activity in the voxel. We have compared this model to actual fMRI time-series and found that our assumptions are in good agreement with actual data [1] .
As in [1] , we express the complex model as a 2N The subscripts R and I denote real and imaginary parts, respectively. The detector was systematically studied in [1] . Since its publication, [1] has received many citations, including but not limited to [2,3] . In particular, our model and methods were also validated and generalized by some other work [2] . Readers are strongly suggested to read [1] , because many practical issues (like the orthognality of 1 and r to be mentioned and numerical results-tables and curves illustrating our detector's performance) are contained in [1] . While the methodology and main conclusion of [1] are still valid, it is defective in one aspect: the closed form of the test statistic's pdf was achieved by numerical simulations (for only one fixed sample length N=120) and thus not firmly grounded. Because of the fundamental importance of threshold selection in detection problems and the above-mentioned importance of [1] in fMRI data processing, it is worthwhile to do some rigorous theoretical work to fix the defect.
Due to inherent difficulty in dealing with nonlinear problems, we first attack a simpler problem with known noise variance. In the second part, we attack the more complex problem with unknown noise variance. The second problem is more realistic and is our ultimate goal. However, as we will see, the study on the first part provides a lot of insight and serves as a very useful basis for the second part.
Before specializing in individual case, we explicitly write the pdf of observation data y, which will be used in both cases. It is: with Θ standing for unknown patameters. We further make some definitions and matrix decompositions, for subsequent use.
The decomposition of two projection matrices P [SH] and [SH] P ⊥ follows from this fact: suppose n n P × is a projection matrix with rank r. Then there exists Matrix analysis theory [4] can be used to justify this fact. Note 2N  , , θ θ θ are independent, jointly Gaussian.
Following [1] , we assume T T 1 r 0 and r r N. = = We thus have some preliminary results which will be used very often, like , , θ θ θ are defined in equation (4).

Results on known noise variance GLR test statistic:
Recall GLR test statistic is given by: Everything is straightforward except for the minimization in (5). Following the spirit in Appendix I of [1] to achieve the optimization, we have the result: Probability density function of test statistic: Direct calculation of the pdf of test statistics l 2 (y) from equation (6) is formidable. Instead we simplify it to a form in terms of normalized quantities.
Where we have introduced the following normalized random vectors to facilitate computation Also, although 3n θ is not immediately used here, for sake of comparison and unity they are prepared for use later together with 1n 2n ,and θ θ . The subscript n means normalized. Also notice the independence between 1n 2n , θ θ and 3n θ .

Introduction of a non-central Wishart matrix:
To compute the pdf of l 2 (y), we further define a matrix and another matrix   2  2n  2n  1n  11  12  T  2 2  2  2  21  22  1n 2n 1n × ω denotes a non-central Wishart distribution [5] with noncentrally matrix ω , The pdf of matrix A is explicitly given by [ Partially, this is good news: in theory, equation (9) provides a basis for all other pdf's in question. However, because of the very cumbersome hypergeometric function, equation (9) is still very intractable, particularly under H 1 . The difficulty of our problem lies in the noncentral Wishart distribution, about which little is known. Derivation of the pdf of l 2 (y) from equation (7) is still difficult. To bypass the difficulty, we examine the two eigenvalues 1 2 0 λ > λ > of matrix A.
Using these relationships, the test statistic l 2 (y) is greatly simplified to The joint pdf of 1 λ , 2 λ and 11 a of matrix A under H 0 , is critically useful both in this section and in next section.
Joint pdf of two eigenvalues and a 11 of matrix A under H 0 : Under H 0 , µ=0, the pdf of noncentral Wishart matrix A in equation (9) is simplified to [6] There are two terms in the above ∑ operation.
Closed form of pdf of l 2 (y) at finite samples: One can perform the standard procedure to calculate the pdf of l 2 (y) under H 0 and the exact closed form turns out to be still quite complicated and is omitted here. It shows that, for finite samples and very small signal to noise ratio, our detector generally doesn't have the property of constant false alarm rate (CFAR). Hence, selection of proper threshold to satisfy a prescribed probability of false alarm P f is impossible in general, because it is dependent on unknown parameters. To proceed, let us take a short-cut from another perspective. It is shown in [7] (asymptotic distribution of GLR test statistics) that l (y)~χ , for large samples (N ) → ∞ . Here asymptotic analysis is a statistical term, referring to the behavior of some quantity for large data samples. However, large samples assumption is not feasible in fMRI problem, due to the temporal resolution limitation in fMRI [8] . How can we then design an implementable detector for finite samples? To get around this difficulty, we combine result from previous subsection and asymptotic result. We notice one feature, that is, the pdf of l 2 (y) (under H 0 ) depends on unknown parameters through 2 2 Na δ = σ . For this parameter, the contributions from N and from 2 (a / ) σ are equivalent. Therefore, with fixed finite samples but at some reasonably large signal to noise ratio SNR δ , it must also have CFAR property.
In practice, we can do numerical simulations to probe some SNR region in which the detector is practically, locally with 2 2 0 1 l (y) H~. χ And if that SNR region is commensurate with real application settings, our detector is implement able and is CFAR in that region and we should be content. Invariance property can save us a lot of numerical work in this aspect. Since [1] already elaborated the numerical results for the case of unknown noise variance, which is our ultimate goal, we omit corresponding work for this similar part.

Results on unknown noise variance GLR test statistic:
The GLRT statistic is given by . min σ = σ Following [1] , instead of using l 4 (y) directly, we use equation (14) as our test statistic.
The detailed derivation was already published in [1] .
Similar to previous section, we further use the normalized quantities in lieu of their counterparts, the very complicated form is then drastically simplified into one in terms of eigenvalues (when the noise of variance 2 σ is known, no one would doubt feasibility of introduction of the normalized quantities, since they are not only beneficial to mathematical computation, they are also physically implementable. In present case of unknown noise variance, the normalized quantities are not implementable, introduction of them is purely to benefit mathematical calculation). 2  2  2  2 2  2  1n  2n  1n  2n  1n  2n  3  2  2  2  2  2  2  2  1n  2n  1n  2n  1n  2n  3   2  1  2 n  2  2  2  3 3  2 3 n a~− θ χ is independent of the other three, Now we are going to do some asymptotic expansion to the above integral. Here, asymptotic expansion is completely a mathematical term, referring to the behavior of some integral when some parameter is very large. Specifically, we will use Laplace method for asymptotic expansion several times in this section. The reader is referred to some mathematical book [9] for more exposition on Laplace method and other methods for asymptotic expansions.
Repeating Laplace method for asymptotic expansion, we have Step 2: Now we will prove, asymptotically (for large δ ), 1
Step 3: We are to prove that 11 2 a ,λ are also independent (asymptotically), which is much easier than the above derivation. From equation (13) and since 1 2 λ >> λ which is just established, In [1] , extensive numerical simulations reveals that the GLR test is also CFAR when a 1, ≥ σ which is the case for most if not all fMRI experiments. More importantly and more interestingly, to achieve the desired P f , the proper threshold of our GLR detector is almost exactly one half that of the corresponding threshold required for a 1,( N 1) F − distributed test statistic.
Equivalently this means that The discrepancy between result in [1] and theoretical result here is easily explained. The numerical simulations were only performed for N=120 in [1] . In this case 1,2 N 3 1,N 1 If N gets much smaller (in either fMRI or coherent interference detection to be discussed), theoretical instead to numerical result need be used.

DISCUSSION AND CONCLUSION
This study provides an asymptotic, closed form of the pdf of the test statistic of one non linear GLRT detector with important applications to fMRI, thereby a more accurate threshold selection method is established for finite samples. This is very important for practical fMRI data processing.
As pointed out in introduction, our model is the simplest, but it leaves room for extension to deal with more complex situations such as trends, physiological fluctuations (respiration and cardiac cycle), patient motions etc. Color noise can be whitened provided we know data correlation matrix. Practical issues like these are already mentioned in [1] . Solution of the nonlinear problems with general representation of signal subspace and nuisance subspace H and S will be presented in the future.
Nomenclature: Scalors, vectors and matrices are not distinguished notationally (except in the very beginning part of introduction of the model). In other words, all quantities in the study are regarded as matrices (of commensurate dimensions), following Matlab convention.