An Application of Univariate Statistics to Hotelling's T²

Abstract: Problem statement: Hotelling's T² statistic has been well documented in the existing literature, and exact as well as asymptotic results have been obtained. In the present article, we focus on an important particular case of T², and we note that there is no clear way to show how univariate results, used in the theory of Student's t statistic, could be used to derive corresponding multivariate ones for T². Therefore, our goal is to find an alternative method, more useful than the usual one, for generalizing univariate theory directly. Approach: We first used some matrix tools to obtain an equivalent algebraic form for T², and then applied some univariate results concerning distributions which arise from the normal distribution. Results: We found an algebraic representation of T² which can be conceived as the natural extension of some results appearing in the literature, and we used our findings to show how standard univariate techniques can be applied in order to derive the exact and limiting distributions of T². Conclusion: The proposed representation of T² gives better insight into the generalization of univariate results to multivariate analyses and indicates, at the same time, an alternative way to prove typical multivariate results. Furthermore, it considerably simplifies the usual theoretical calculations.


INTRODUCTION
The T² statistic was derived by Hotelling (1931) as a multivariate generalization of Student's t and has been very well documented over the years. Hsu (1938) derived the distribution of T² in the case where the null hypothesis is not true. Bowker (1960) derived a representation of T² as the ratio of two independent χ² variables, the numerator being non-central. This has also been described in detail in Anderson (1984).
In the present study, we focus on a very important special case, which is as follows (see also Anderson (1984), Chapter 5). Let us first take a random vector sample X₁,…,X_r, of dimension p, which is normally distributed, and consider

T² = r (X̄ − μ₀)ᵀ M_r⁻¹ (X̄ − μ₀),

where X̄ is the mean vector and M_r the sample covariance matrix. It is well known that T² is used for testing hypotheses about the population mean vector μ. It is also interesting to note that, under the null hypothesis that μ = μ₀, it has been shown that T² = T₀² is a function of the likelihood ratio test statistic λ₀, since T₀² = (r − 1)(λ₀^(−2/r) − 1) (Anderson (1984), U9).
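As a side illustration (not part of the original derivation), the statistic just defined is straightforward to compute numerically. The following sketch assumes NumPy; the sample size r = 50, the dimension p = 3 and the N(0, I) data are arbitrary illustrative choices.

```python
import numpy as np

def hotelling_t2(X, mu0):
    """Hotelling's T^2 for testing H0: mu = mu0.

    X is an (r, p) data matrix; M_r is the unbiased sample
    covariance matrix (divisor r - 1).
    """
    r, p = X.shape
    xbar = X.mean(axis=0)
    M = np.cov(X, rowvar=False)           # sample covariance matrix M_r
    d = xbar - mu0
    return r * d @ np.linalg.solve(M, d)  # r (xbar - mu0)' M_r^{-1} (xbar - mu0)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))              # r = 50 observations from N(0, I), p = 3
t2 = hotelling_t2(X, np.zeros(3))
print(t2)
```

Using `np.linalg.solve` rather than explicitly inverting M_r is the usual numerically stable choice.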
It is worth mentioning, at this point, the wide use of the aforementioned statistic in diverse practical applications arising from many different fields. For instance, a recent article concerning marketing issues was published by Agarwall and Dey (2010) in order to assess the level of air travellers' satisfaction among Indian domestic airlines. Another recent example arises from the field of insurance, where Momani et al. (2010) used simple and multiple regression techniques to examine the effect of different factors on the capital structure of insurance sector companies listed on the Amman stock market. Finally, a third indicative example concerns the use of a simple version of T² in an analysis of data from meteorological stations, carried out in order to examine climate change in Jordan (Hamdi et al., 2009).
An important result is given by Corollary 5.2.1 of Anderson (1984), which gives the distribution of T² and also covers the case where the null hypothesis is not true. We now state Corollary 5.2.1, which we call Theorem 1, using our notation.
Theorem 1: Let X₁,…,X_r be a sample from N(μ,Σ) and let T² = r (X̄ − μ₀)ᵀ M_r⁻¹ (X̄ − μ₀). Then T²(r − p)/((r − 1)p) is distributed as a non-central F with p and r − p degrees of freedom and non-centrality parameter r(μ − μ₀)ᵀ Σ⁻¹ (μ − μ₀). In particular, if μ = μ₀, this distribution is the central F(p, r − p).

In the present study, we need to extend the algebraic calculations appearing in Anderson (1984), U12, and to show that, in the multivariate case, that is, the case where p > 1, T² can be put into a special algebraic form (representation), as a direct generalization of the univariate case p = 1. Under this new representation, we will revisit Theorem 1 for the multivariate case and examine how univariate results can be used to make inferences about T². We focus on this methodology since, to our knowledge, it does not appear anywhere in the existing literature.
Furthermore, we consider the case where r → ∞, using, once again, our algebraic representation of T². The corresponding asymptotic version of Theorem 1 is given by Theorem 5.2.3 of Anderson (1984), which we call Theorem 2 and now state briefly, using our notation (see also Polymenis, 2008).

Theorem 2: Under the null hypothesis μ = μ₀, as r → ∞, T² tends in distribution to χ²(p), even when the sample X₁,…,X_r is not normally distributed.
In this case, we need to derive the limiting distribution of T² and to show, in the same spirit as before, how results for p = 1 can be used directly to obtain results for p > 1. Thus, we revisit Theorem 2 using our method, again focusing on the use of univariate results to obtain corresponding multivariate ones.

MATERIALS AND METHODS
A specific representation of T²: We use here, first, an algebraic approach similar to the one used by Anderson (1984), U15, in order to derive the distribution of T², at first under the null hypothesis. For reasons of simplicity, we first centre the X_i vectors, so that the null hypothesis becomes μ = μ₀ = 0. Furthermore, we choose the covariance matrix of √r X̄ to be I, the identity matrix, since this simplification does not affect the distribution of T² (see Anderson (1984)). The covariance matrix of X_i can, consequently, be chosen to be I, since it is equal to the covariance matrix of √r X̄.

We then proceed by considering an orthogonal (p×p) matrix Q = (q_ij) whose first row is proportional to X̄, so that:

√r q₁ᵀ X̄ = √r (q₁₁ X̄^(1) + … + q₁p X̄^(p)) = √r X̄ᵀX̄ / √(X̄ᵀX̄) = √(r X̄ᵀX̄)

and, for j ≠ 1, we have √r q_jᵀ X̄ = 0, since Q is orthogonal. Thus, we have that U = Q(√r X̄) = (u₁, 0,…,0)ᵀ, where u₁² = r X̄ᵀX̄, and:

T² / (r − 1) = Uᵀ B⁻¹ U = u₁² b¹¹,

where B = Q [(r − 1) M_r] Qᵀ and B⁻¹ = (b^{ij}) denotes its inverse. Write B = (b_ij) and partition B as:

B = [ b₁₁  B₁₂
      B₂₁  B₂₂ ]   (1)

We partition B⁻¹ in the same way. Then, following Graybill (1969), we obtain:

b¹¹ = (b₁₁ − B₁₂ B₂₂⁻¹ B₂₁)⁻¹ = 1 / b_{11·2,…,p}.

This is a result from the theory of partitioned matrices.

Let us now set V_i^{(1)}, i = 1,…,r − p + 1, to be the coordinates of the first rotated residual vector in an orthogonally transformed basis: we consider an (r×r) orthogonal matrix E = (e_{i,β}), partitioned into a block E₁, of dimension ((r − p + 1) × r), and a block E₂, of dimension ((p − 1) × r). On the one hand, (F⁻¹)ᵀ H⁻¹ F⁻¹ = I_{(p−1,p−1)}, where F and H arise from the partition above, and, on the other hand, E₁(E₂)ᵀ is the ((r − p + 1) × (p − 1)) matrix [0], because E is orthogonal, and E₂(E₂)ᵀ = I_{(p−1,p−1)}. For simplicity reasons, we write I_{(p−1,p−1)} = I. Combining these relations with the expression for b¹¹ obtained above, it results that:

b_{11·2,…,p} = Σ_{i=1}^{r−p+1} (V_i^{(1)})²,

for all r > p − 1 and, consequently:

T² = (r − 1) r X̄ᵀX̄ / Σ_{i=1}^{r−p+1} (V_i^{(1)})².

Hence, Theorem 3 obtains.
Properties of the numerator and the denominator of T²: We first present three useful preliminary lemmas.

Lemma 1: Σ_{i=1}^{r−p+1} (V_i^{(1)})² is χ²(r−p) distributed, independently of Q.

Proof of Lemma 1: Since q₁₁, q₁₂,…, q₁p are the elements of the first row of the (p×p) matrix Q, which is orthogonal, it results that Σ_{j=1}^{p} q₁j² = 1. Then, using the theorem proved in section 4.10 of Grimmett and Stirzaker (2001), applied to the independent N(0,1) random variables X_i^{(j)}, we obtain that the V_i^{(1)}'s are N(0,1) distributed. Finally, letting n = r − p + 1, we obtain that Σ_{i=1}^{r−p+1} (V_i^{(1)})² is χ²(r−p) distributed, and independently of Q. This is in accordance with a remark in the proof of Theorem 5.2.2 of Anderson (1984), stating that "since the conditional distribution of b_{11·2,…,p} does not depend on Q, it is unconditionally distributed as χ²".

Lemma 2:
Under the null hypothesis, r X̄ᵀX̄ is χ²(p) distributed.

Proof of Lemma 2:
Since, under the null hypothesis, √r X̄ is N(0, I) distributed, its p coordinates √r X̄^(1),…, √r X̄^(p) are independent N(0,1) variables and, hence, r X̄ᵀX̄ = Σ_{j=1}^{p} (√r X̄^(j))² is χ²(p) distributed.
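As an illustrative aside (not part of the proof), the claim of Lemma 2 can be checked by simulation in the standardized setting μ₀ = 0, Σ = I. The sketch below assumes NumPy; the values of r, p and the number of replications are arbitrary.

```python
import numpy as np

# Monte Carlo check of Lemma 2: under H0 with covariance I,
# r * xbar' xbar should behave as a chi^2(p) variable.
rng = np.random.default_rng(1)
r, p, n_sim = 40, 3, 20000

# n_sim independent sample means of r draws from N(0, I) in dimension p
xbars = rng.normal(size=(n_sim, r, p)).mean(axis=1)
stat = r * np.sum(xbars**2, axis=1)

emp_mean, emp_var = stat.mean(), stat.var()
print(emp_mean, emp_var)  # chi^2(p) has mean p and variance 2p
```

With these settings, the empirical mean and variance should lie close to p = 3 and 2p = 6 respectively.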

Lemma 3: Under the null hypothesis, the numerator r X̄ᵀX̄ and the denominator Σ_{i=1}^{r−p+1} (V_i^{(1)})² of the representation of T² given in Theorem 3 are independent.

Proof of Lemma 3: We first note that this lemma is equivalent to the application of Lemma 1 to expression (3.3), mentioned in Bowker (1960). We will now elaborate on that, using the result of Theorem 3. Let us call V the ((r − p + 1) × p) matrix:

V = [ q₁₁(X₁^(1) − X̄^(1))          q₁₂(X₁^(2) − X̄^(2))          …  q₁p(X₁^(p) − X̄^(p))
      q₁₁(X₂^(1) − X̄^(1))          q₁₂(X₂^(2) − X̄^(2))          …  q₁p(X₂^(p) − X̄^(p))
      ⋮
      q₁₁(X_{r−p+1}^(1) − X̄^(1))  q₁₂(X_{r−p+1}^(2) − X̄^(2))  …  q₁p(X_{r−p+1}^(p) − X̄^(p)) ]

Each entry of V is a linear combination of centred observations and, since the sample mean is uncorrelated with the centred observations, Cov(√r X̄^(j), V_i^(1)) = 0 for all i and j; by joint normality, this shows independence between the numerator and the denominator, exactly as it shows independence between the numerator and the denominator of Student's t. (See, also, Grimmett and Stirzaker (2001), section 4.10, for a comprehensive analysis of the univariate case.) We now present a useful lemma, concerning the limiting properties of the denominator of T².

Lemma 4: As r → ∞, the denominator Σ_{i=1}^{r−p+1} (V_i^{(1)})² / (r − 1) of the representation of T² given in Theorem 3 tends to 1 in probability.

Proof of Lemma 4: We first prove Lemma 4 for p = 1, since we will use this result in the proof for p > 1. In this case, T² = t², and the denominator of t² is the sample variance s² = Σ_{i=1}^{r} (X_i − X̄)² / (r − 1). Since Σ_{i=1}^{r} (X_i − X̄)² is χ²(r−1) distributed, we have, as r → ∞,

Var(s²) = 2(r − 1)/(r − 1)² = 2/(r − 1) → 0.

Hence, (1) → 0 as r → ∞. We then calculate E[s²] = (r − 1)/(r − 1) = 1, so that, by Chebyshev's inequality, s² tends to 1 in probability. For p > 1, Lemma 1 gives that Σ_{i=1}^{r−p+1} (V_i^{(1)})² is χ²(r−p) distributed, so that the denominator has mean (r − p)/(r − 1) → 1 and variance 2(r − p)/(r − 1)² → 0, and the result follows as in the univariate case.

RESULTS
The exact distribution of T²: We first give two important remarks, based on the previous calculations:

• T², as presented in Theorem 3, is a generalization of the univariate case p = 1, with V_i^{(1)} = X_i − X̄, q₁₁ = 1 and q₁j = 0 for j ≠ 1
• In this case, T² equals t² (the square of Student's t) and t² = (r − 1) r X̄² / Σ_{i=1}^{r} (X_i − X̄)² = r X̄² / s²

We now prove Theorem 1, using the result of Theorem 3.
Proof of Theorem 1: From Lemmas 1, 2 and 3, under the null hypothesis H₀, T² is distributed as the ratio of two independent χ²'s. Then, it is well known, by the definition of the F statistic (see, for instance, Grimmett and Stirzaker (2001), section 4.10), that T²(r − p)/((r − 1)p) is F(p, r − p) distributed. In the case where the null hypothesis is not true, we use the same rationale as before. Let us consider:

T² = (r − 1) r (X̄ − μ₀)ᵀ(X̄ − μ₀) / Σ_{i=1}^{r−p+1} (V_i^{(1)})²,

with the denominator being the same as before. Following the notation of Anderson (1984), p. 161, √r (X̄ − μ₀) is N(√r (μ − μ₀), I) distributed and, consequently, the numerator of T² has a non-central χ² distribution. The denominator of T² is the same as for the case where H₀ was true, so that Lemma 1 holds. Lemma 3 holds too, since taking X̄ − μ₀ instead of X̄ does not affect the independence between the numerator and denominator of T². Finally, since the numerator of T² has a non-central χ² distribution, T²(r − p)/((r − 1)p) is non-central F. This result proves Theorem 1 when H₀ is not true.
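The exact null distribution can also be illustrated by Monte Carlo. The sketch below (NumPy assumed; r, p and the replication count are arbitrary choices) compares the empirical mean of T²(r − p)/((r − 1)p) with the theoretical mean of an F(p, r − p) variable, which is (r − p)/(r − p − 2).

```python
import numpy as np

# Monte Carlo check of the exact result: under H0, the rescaled statistic
# T^2 (r - p) / ((r - 1) p) should follow F(p, r - p).
rng = np.random.default_rng(3)
r, p, n_sim = 30, 4, 20000

f_stats = np.empty(n_sim)
for i in range(n_sim):
    X = rng.normal(size=(r, p))                 # sample from N(0, I) under H0
    xbar = X.mean(axis=0)
    M = np.cov(X, rowvar=False)                  # sample covariance M_r
    t2 = r * xbar @ np.linalg.solve(M, xbar)     # Hotelling's T^2 with mu0 = 0
    f_stats[i] = t2 * (r - p) / ((r - 1) * p)

print(f_stats.mean())  # F(p, r-p) has mean (r-p)/(r-p-2) when r - p > 2
```

With r = 30 and p = 4, the theoretical mean is 26/24 ≈ 1.083.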

The limiting distribution of T 2 :
We now give two important remarks, concerning the limiting distribution of T²:

• In the univariate case, we know from Lemma 2 that r X̄² is χ²(1) distributed. Then, using Lemma 4, we conclude that, under H₀, as r → ∞, t² tends in distribution to χ²(1)
• Since, from Lemma 2, r X̄ᵀX̄ ~ χ²(p), we use Lemma 4 in the same way as for p = 1 and conclude that, under H₀, as r → ∞, T² tends in distribution to χ²(p). We also note that, if we replace p by 1, then T² = t² → χ²(1) in distribution as r → ∞, which coincides with the previous remark

We now prove Theorem 2, using the result of Theorem 3.

Proof of Theorem 2:
The limiting distribution of T², mentioned in the above remarks, remains valid under H₀ even if the random vector sample X₁,…,X_r is not normally distributed. Indeed, in this case, it is sufficient to apply a multivariate central limit theorem (Theorem 3.4.3, Anderson (1984)), stating that the limiting distribution, as r → ∞, of √r X̄ is N(0, I); hence, we conclude that the limiting distribution of r X̄ᵀX̄ is χ²(p). On the other hand, the asymptotic results of Lemma 4 still hold. Thus, using Proposition 6.3.8 of Brockwell and Davis (1991), for k = 1, we conclude that the limiting distribution of T² is χ²(p). This result proves Theorem 2.
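This robustness to non-normality can be illustrated numerically. The sketch below (NumPy assumed) uses centred exponential data, an arbitrary non-normal choice, with a large sample size r; under H₀, T² should then be approximately χ²(p) by the central limit argument above.

```python
import numpy as np

# Monte Carlo sketch of Theorem 2: with non-normal (centred exponential)
# observations and large r, T^2 under H0 should be close to chi^2(p).
rng = np.random.default_rng(4)
r, p, n_sim = 500, 2, 5000

t2 = np.empty(n_sim)
for i in range(n_sim):
    X = rng.exponential(size=(r, p)) - 1.0       # mean-zero, non-normal data
    xbar = X.mean(axis=0)
    M = np.cov(X, rowvar=False)                   # sample covariance M_r
    t2[i] = r * xbar @ np.linalg.solve(M, xbar)   # T^2 with mu0 = 0

print(t2.mean())  # chi^2(p) has mean p = 2
```

The empirical mean of T² should lie close to p = 2 despite the skewed underlying data.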

DISCUSSION
In the present article, we found an alternative form (representation) for an important particular case of Hotelling's T², which extends the results of Bowker (1960), in order to prove the usual multivariate exact and limiting theorems using standard univariate statistics. We remark that, for the univariate case p = 1, the usual form of T² and the new representation of T² given by Theorem 3 have exactly the same algebraic configuration and are both equal to the usual expression of t². In this case, exact and asymptotic properties of t² are easily derived. However, for the more complicated multivariate case, the result of Theorem 3 has the obvious advantage, over the usual form of T², of directly using simple univariate results, obtained for p = 1, in order to derive exact and asymptotic statistical properties of T².

CONCLUSION
An important and very well documented univariate distribution arising from the normal distribution is Student's t. Our study showed how exact and asymptotic results concerning t can help in deriving corresponding properties for a well-known multivariate analogue of the square of t, namely Hotelling's T², by using a specific algebraic representation of T². In other words, in the present article, standard univariate material was used in a quite straightforward manner to explain multivariate theory. As a result, the usual multivariate theoretical calculations are considerably simplified.