Inversion of Covariance Matrix for High Dimension Data
DOI : 10.3844/jmssp.2011.227.229
Journal of Mathematics and Statistics
Volume 7, Issue 3
Problem statement: When testing hypotheses about the mean vector of independent and identically distributed multivariate normal random vectors with unknown covariance matrix, the sample size n may not exceed the dimension p (n ≤ p). This happens, for example, with DNA microarray data, where a large number of gene expression levels are measured on relatively few subjects. In that case the p×p sample covariance matrix S has no inverse, so any test statistic that involves the inverse of S does not exist. Approach: In this study, we consider a modification of S of the form S + cI and seek the smallest real value c ≠ 0 for which (S + cI)⁻¹ exists. Results: The study shows that, as the dimension p tends to infinity and with the smallest change in S, (S + cI)⁻¹ exists when c = 1. Conclusion: In statistical analysis of high-dimensional data for which the inverse of the sample covariance matrix does not exist, one way to obtain an invertible matrix is to replace the sample covariance matrix S by S + cI; we recommend choosing c = 1.
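The situation described in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's derivation: it assumes simulated standard normal data with hypothetical sizes n = 10 and p = 50, and uses NumPy to show that S is rank deficient when n ≤ p while S + cI with c = 1 is invertible. Because S is positive semidefinite, every eigenvalue of S + cI is at least c for any c > 0; the choice c = 1 here simply follows the abstract's recommendation.

```python
import numpy as np

# Hypothetical sizes chosen for illustration: fewer subjects than variables.
n, p = 10, 50
rng = np.random.default_rng(0)

# Simulated data: n observations of a p-dimensional standard normal vector.
X = rng.standard_normal((n, p))

# p x p sample covariance matrix; its rank is at most n - 1, so it is singular.
S = np.cov(X, rowvar=False)
rank_S = np.linalg.matrix_rank(S)

# Modified matrix S + cI with the value of c recommended in the abstract.
c = 1.0
S_mod = S + c * np.eye(p)

# S is positive semidefinite, so the eigenvalues of S + cI are all >= c > 0.
min_eig = np.linalg.eigvalsh(S_mod).min()
S_mod_inv = np.linalg.inv(S_mod)

print("rank of S:", rank_S, "out of p =", p)
print("smallest eigenvalue of S + cI:", min_eig)
```

In practice, a statistic that needs the inverse would usually be computed with `np.linalg.solve(S_mod, v)` for the relevant vector `v` rather than by forming the inverse explicitly; the explicit inverse is shown only to mirror the notation in the abstract.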
© 2011 Samruam Chongcharoen. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.