Probabilistic Estimates of the Largest Strictly Convex Singular Values of Pregaussian Random Matrices

Article history Received: 28-02-2015 Revised: 27-03-2015 Accepted: 30-03-2015
Abstract: In this study, the p-singular values of random matrices with Gaussian entries, defined in terms of the $l_p$-norm for p>1, are studied. Mainly, using analytical techniques, we show probabilistic estimates, precisely the decay, of the upper tail probability of the largest strictly convex singular values as the number of rows of the matrices becomes very large, as well as of their lower tail probability. These results provide a probabilistic description of the behavior of the largest p-singular values of random matrices for p>1. We also present some numerical experimental results, which verify the theoretical results.


Introduction
The largest and smallest singular values of random matrices in the $l_2$-norm, including Gaussian, Bernoulli and subgaussian random matrices, have attracted major research interest in recent years and have applications in compressed sensing, a technique for recovering sparse or compressible signals. For instance, (Soshnikov, 2002; Soshnikov and Fyodorov, 2004) studied the largest singular value of random matrices, and (Rudelson and Vershynin, 2008a; 2008b), among others, studied the smallest singular values. In the study of the asymptotic behavior of eigenvalues of symmetric random matrices, a typical example is the Wigner symmetric matrix, whose upper (or lower) diagonal entries are independent random variables with uniformly bounded moments. Wigner (1958) proved that the normalized eigenvalues are asymptotically distributed according to the semicircular distribution. Precisely, let A be a symmetric Gaussian random matrix of size n×n whose upper diagonal entries are independent and identically distributed copies of the standard Gaussian random variable. Then the empirical distribution function of the eigenvalues of $\frac{1}{\sqrt{n}} A$ is asymptotically:

$$F(x) = \frac{1}{2\pi} \int_{-2}^{x} \sqrt{4 - t^2}\, dt, \quad -2 \le x \le 2,$$

as the matrix size n goes to infinity. This is the well-known Wigner semicircle law, which provides a precise description of the statistical behavior of the eigenvalues of large matrices. In another case, for a random matrix A whose entries are independent and identically distributed (i.i.d.) copies of a complex random variable with mean 0 and variance 1, Tao and Vu (2008) showed that the empirical distribution of the eigenvalues of $\frac{1}{\sqrt{n}} A$ converges to the uniform distribution on the unit disk as n goes to ∞, and that this holds for random matrices with real entries as well as complex entries.
Their result generalized (Girko, 1985) and resolved the circular law conjecture, open since the 1950s, that the empirical distribution of the eigenvalues converges to the uniform distribution over the unit disk as n tends to infinity (Bai, 1997).
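The semicircle law can be checked numerically. The following is a minimal sketch, assuming the normalization $\frac{1}{\sqrt{n}} A$ and entries of variance 1 on and above the diagonal; the function names and the grid of test points are illustrative choices, not part of the original study.

```python
import numpy as np

def wigner_eigenvalues(n, seed=0):
    """Eigenvalues of A / sqrt(n) for a symmetric matrix A with Gaussian entries."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n))
    # Keep the upper triangle (with diagonal) and mirror it below the diagonal.
    a = np.triu(g) + np.triu(g, 1).T
    return np.linalg.eigvalsh(a / np.sqrt(n))

def semicircle_cdf(x):
    """CDF of the semicircle density sqrt(4 - x^2) / (2 pi) on [-2, 2]."""
    x = np.clip(x, -2.0, 2.0)
    return 0.5 + x * np.sqrt(4.0 - x**2) / (4.0 * np.pi) + np.arcsin(x / 2.0) / np.pi

eigs = wigner_eigenvalues(400)
grid = np.linspace(-2.0, 2.0, 9)
empirical = np.array([(eigs <= t).mean() for t in grid])
max_gap = np.abs(empirical - semicircle_cdf(grid)).max()
print(f"largest gap between empirical and semicircle CDF on the grid: {max_gap:.3f}")
```

For moderately large n the gap is expected to be small, and it should shrink further as n grows.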
The largest singular value of a matrix is in fact its p-norm, which, from a geometric perspective, has connections with the Minkowski space and the complex $l_p$ space in differential geometry (Liu, 2013; 2011), because one can view the p-norm of a matrix as a generalization of the p-norm of a vector.
For random matrices whose entries are i.i.d. random variables satisfying certain moment conditions, the largest singular value was studied in (Geman, 1980; Yin et al., 1988). Tracy and Widom (1996) showed that the limiting law of the largest eigenvalue distribution of the Gaussian Orthogonal Ensemble (GOE) is given in terms of a particular Painlevé II function, which is the well-known Tracy-Widom law. Furthermore, the distribution of the eigenvalues of Wishart matrices $W_{N,n} = AA^*$, where $A = A_{N,n}$ is a Gaussian random matrix of size N×n, was studied in (Johansson, 2000; Johnstone, 2001). They showed that the distribution of the largest eigenvalue of Wishart matrices converges to the Tracy-Widom law as n/N tends to some positive constant. More generally, non-Gaussian random matrices were studied in (Soshnikov, 2002). Seginer (2000) compared the Euclidean operator norm of a random matrix with i.i.d. mean-zero entries to the Euclidean norms of its rows and columns. Later, (Latala, 2005) gave an upper bound on the expectation of the largest singular value, namely the norm, of any random matrix whose entries are independent mean-zero random variables with uniformly bounded fourth moment.
The condition number, which is the ratio of the largest singular value to the smallest singular value of a matrix, is critical to the stability of linear systems. In (Edelman, 1988), the distribution of the condition number of Gaussian random matrices was investigated, in particular through numerical experiments. As a typical example of subgaussian random matrices, the invertibility of Bernoulli random matrices was also studied. In (Tao and Vu, 2007), the probability that a Bernoulli random matrix is singular is shown to be at most $\left(\frac{3}{4} + o(1)\right)^n$, where n is the size of the matrices.
Their result shows that the probability that the smallest singular value of a Bernoulli random matrix is zero is exponentially small as n tends to infinity. Recently, the bound $\left(\frac{3}{4} + o(1)\right)^n$ on the singularity probability has been improved to $\left(\frac{1}{\sqrt{2}} + o(1)\right)^n$ by (Bourgain et al., 2010).
The recent studies of the smallest singular value have also been motivated, to a large extent, by some open questions and conjectures. In (Spielman and Teng, 2002), the following conjecture was proposed at the International Congress of Mathematicians in 2002.

Conjecture 1.1
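Let ξ be a Bernoulli random variable, in other words, $P(\xi = 1) = P(\xi = -1) = \frac{1}{2}$, and let A be an n×n matrix whose entries are i.i.d. copies of ξ. Then the smallest singular value $s_n(A)$ satisfies:

$$P\left(s_n(A) \le t\right) \le \sqrt{n}\, t + c^n$$

for all t>0 and some 0<c<1.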
In the breakthrough work on the smallest singular value (Rudelson and Vershynin, 2008a), Rudelson and Vershynin obtained the upper tail probabilistic estimate on the smallest singular value in the $l_2$-norm for square matrices of centered random variables with unit variance and appropriate moment assumptions. In particular, they proved the Spielman-Teng conjecture up to a constant. The lower tail probabilistic estimate on the smallest singular value in the $l_2$-norm for square matrices was obtained in (Rudelson and Vershynin, 2008b). These results show that the smallest singular value of n×n subgaussian random matrices is of order $n^{-1/2}$ with high probability for large n. In a more explicit way, the distribution of the smallest singular value of random matrices was obtained by using property testing from combinatorics and theoretical computer science. Pregaussian matrices were used to recover sparse images in (Rauhut, 2010) and in matrix recovery (Oymak et al., 2011; Lai et al., 2012). Very recently, Rudelson and Vershynin (2010) gave a comprehensive survey on the extreme singular values of random matrices.
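The $n^{-1/2}$ scaling of the smallest singular value can be illustrated with a small simulation. This is only a sketch for Gaussian matrices, with the sizes, number of trials and function name chosen here for illustration: if the scaling holds, the median of $\sqrt{n}\, s_n(A)$ should stay of order one as n varies.

```python
import numpy as np

def scaled_smallest_singular_values(n, trials=40, seed=0):
    """Samples of sqrt(n) * s_n(A) for n x n standard Gaussian matrices A."""
    rng = np.random.default_rng(seed)
    out = np.empty(trials)
    for t in range(trials):
        a = rng.standard_normal((n, n))
        s = np.linalg.svd(a, compute_uv=False)  # singular values in descending order
        out[t] = np.sqrt(n) * s[-1]
    return out

medians = {n: float(np.median(scaled_smallest_singular_values(n))) for n in (50, 100, 200)}
for n, med in medians.items():
    print(f"n = {n:3d}: median of sqrt(n) * s_n(A) = {med:.3f}")
```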
It is well known that the classical singular value is defined in terms of the $l_2$-norm, so a natural question is what happens if one defines the singular value by the $l_q$-quasinorm for 0<q≤1 and by the $l_p$-norm for p>1. There are some remarkable results by other researchers on the largest singular values of random matrices in the $l_2$-norm. Geman (1980) and Yin et al. (1988) showed that the largest singular value of random matrices of size m×N with independent entries of mean 0 and variance 1 tends to $\sqrt{m} + \sqrt{N}$ almost surely. The largest and smallest q-singular values of pregaussian random matrices for 0<q≤1 were studied in (Lai and Liu, 2014), which have applications in signal processing (Foucart and Lai, 2010; Lai and Liu, 2011) and other areas. Similarly to the q-singular value for 0<q≤1, the strictly convex largest p-singular value, with p>1, can be defined. We will show probabilistic estimates, precisely the decay, of the upper tail probability of the largest strictly convex p-singular value as the number of rows of the matrices becomes very large, as well as of its lower tail probability. These results provide a probabilistic description of the behavior of the largest p-singular values of random matrices.

The Largest p-Singular Value
The p-singular values of a matrix can, in general, be defined as a maximum of minimums or a supremum of infimums. In particular, the largest p-singular value can be defined as follows:

Definition 2.1
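For an m×N matrix A, the largest p-singular value of A, denoted by $s_1^{(p)}(A)$, is defined as:

$$s_1^{(p)}(A) := \sup\left\{ \frac{\|Ax\|_p}{\|x\|_p} : x \in \mathbb{R}^N,\ x \neq 0 \right\} \quad (2.1)$$

for given p>1.
In (Lai and Liu, 2014), the following lemma on a linear bound for partial binomial expansion was established.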

Lemma 2.2
For every positive integer n:
The above lemma can be applied to estimate probabilities.
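The inequality of Lemma 2.2 is not reproduced above, so the following sketch does not test the lemma itself. It checks a linear bound of the same flavor that follows from Markov's inequality: for a binomial variable X with parameters n and x, $P(X > n/2) \le 2x$. The summation range (k > n/2), the constant 2 and the function name are assumptions made here for illustration and may differ from those of the lemma.

```python
import math

def upper_half_binomial_tail(n, x):
    """Sum of C(n, k) x^k (1 - x)^(n - k) over k > n / 2, i.e. P(Bin(n, x) > n / 2)."""
    start = n // 2 + 1
    return sum(math.comb(n, k) * x**k * (1.0 - x) ** (n - k) for k in range(start, n + 1))

# Scan a grid of n and x and record the largest value of tail / x.
worst_ratio = 0.0
for n in range(1, 61):
    for i in range(1, 101):
        x = i / 100.0
        worst_ratio = max(worst_ratio, upper_half_binomial_tail(n, x) / x)

print(f"largest tail / x over the grid: {worst_ratio:.4f} (Markov bound: 2)")
```

A bound that is linear in x is what makes such an estimate useful for small-probability events: the tail of the partial binomial sum is controlled by a constant multiple of the single-variable probability x.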

Lemma 2.3
Suppose $\xi_1, \xi_2, \ldots, \xi_n$ are i.i.d. copies of a random variable ξ. Then for any ε>0 and any given p>1:
Proof. Given p>1, we have the following relation between events: the event (2.4) is contained in the event (2.5), where $\{i_1, i_2, \ldots, i_k\}$ is a subset of $\{1, 2, \ldots, n\}$ and $\{i_{k+1}, \ldots, i_n\}$ is its complement. Let $x = P(|\xi_1|^p \le \varepsilon)$. Then by the union bound: and, applying Lemma 2.2, we have: Since the event (2.4) is contained in the event (2.5):
For the lower tail probability of the largest p-singular value for p>1, we have the following theorem.

Theorem 2.4
Let ξ be a pregaussian variable normalized to have variance 1 and A be an m×N matrix with i.i.d. copies of ξ in its entries. Then for every p>1 and any ε>0, there exists γ>0 such that: where γ depends only on p, ε and the pregaussian variable ξ.
Proof. Since $a_{ij}$ is pregaussian with variance 1, for any ε>0 there is some δ>0 such that: for all j. By Definition 2.1 of the largest p-singular value, choosing x to be the standard basis vectors of $\mathbb{R}^N$ gives $\max_j \|a_j\|_p \le s_1^{(p)}(A)$, where $a_j$ denotes the j-th column of A, and then (2.9) follows.
For the upper tail probability of the largest p-singular value, p>1, we first derive the following lemma by using the Minkowski inequality and the discrete Hölder inequality.

Lemma 2.5
For p≥1, (2.1) defines a norm on the space of m×N matrices and: where $a_j$, j = 1, 2, ..., N, are the column vectors of A.
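The inequality of Lemma 2.5 is not reproduced above. As an illustration of the kind of column bound that the Minkowski and Hölder inequalities give, the sketch below checks the two-sided estimate $\max_j \|a_j\|_p \le s_1^{(p)}(A) \le N^{1-1/p} \max_j \|a_j\|_p$ on random test vectors. This specific form is an assumption made here, as are the helper names `p_ratio` and `column_bounds`; it follows from $\|Ax\|_p \le \sum_j |x_j| \|a_j\|_p \le \max_j \|a_j\|_p \|x\|_1$ and $\|x\|_1 \le N^{1-1/p} \|x\|_p$.

```python
import numpy as np

def p_ratio(a, x, p):
    """The quotient ||Ax||_p / ||x||_p from the definition of the largest p-singular value."""
    return np.linalg.norm(a @ x, ord=p) / np.linalg.norm(x, ord=p)

def column_bounds(a, p):
    """Return (max_j ||a_j||_p, N^(1 - 1/p) * max_j ||a_j||_p) for the columns a_j of a."""
    n_cols = a.shape[1]
    max_col = np.linalg.norm(a, ord=p, axis=0).max()
    return max_col, n_cols ** (1.0 - 1.0 / p) * max_col

rng = np.random.default_rng(0)
m, n_cols, p = 30, 20, 3.0
a = rng.standard_normal((m, n_cols))
lower, upper = column_bounds(a, p)

# Standard basis vectors reproduce the column norms, so the supremum is at least `lower`.
basis_ratios = [p_ratio(a, np.eye(n_cols)[j], p) for j in range(n_cols)]
# No test vector may exceed the upper bound.
random_ratios = [p_ratio(a, rng.standard_normal(n_cols), p) for _ in range(2000)]

print(f"max column norm      : {lower:.4f}")
print(f"best sampled quotient: {max(basis_ratios + random_ratios):.4f}")
print(f"upper bound          : {upper:.4f}")
```

Random sampling only produces lower bounds on $s_1^{(p)}(A)$, so this is a consistency check rather than a computation of the p-singular value.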
Applying the above lemma, we can easily derive an estimate for Bernoulli random matrices, whose entries each equal 1 or −1 with equal probability (Tao and Vu, 2009). This is the following theorem on the upper tail probability of the largest p-singular value of Bernoulli matrices for p>1.

Theorem 2.6
Let ξ be a Bernoulli random variable normalized to have variance 1 and A be an m×N matrix with i.i.d. copies of ξ in its entries, then: (2.14)
One may conjecture that the bound might be $m^{1/p}$. However, considering Bernoulli matrices, whose entries follow the Bernoulli distribution, as special subgaussian matrices, the expectation of the largest p-singular value may not be $m^{1/p}$. Indeed, let A be an m×m Bernoulli matrix and x be a non-zero vector in $\mathbb{R}^m$. The expectation of the largest p-singular value: for all $x \in \mathbb{R}^m$, and particularly for $x = (1, \ldots, 1) \in \mathbb{R}^m$ we have: the expectation of the $l_p$-norm of the vector $(X_1, X_2, \ldots, X_n)$.
We also have the following result on the upper tail probability of the largest p-singular value of Bernoulli matrices for p>1.

Theorem 2.7
Let A be an m×m Bernoulli matrix with every entry equal to 1 or −1 with equal probability, then one has: for some K>0 and some absolute constant c>0.
Proof. Let $A = (\epsilon_{ij})_{m \times m}$ and for any K>2 and some absolute constant c>0. By Lemma 4.10 in (Pisier, 1999), there is a subset N which is a δ-net of $S_p^{m-1}$ with cardinality: Finally, using the union bound and an approximation of any point on the sphere by points of the δ-net, we obtain (2.17).
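For square Bernoulli matrices, the test vector x = (1, ..., 1) used in the discussion after Theorem 2.6 gives an easily computable lower bound on the largest p-singular value, namely $\|A x\|_p / \|x\|_p$ with $\|x\|_p = m^{1/p}$. The sketch below, with illustrative choices of m, p and the number of trials, compares the average of this quotient with $m^{1/p}$.

```python
import numpy as np

def ones_vector_ratio(m, p, rng):
    """||A 1||_p / ||1||_p for an m x m matrix A with independent +1/-1 entries."""
    a = rng.choice([-1.0, 1.0], size=(m, m))
    row_sums = a.sum(axis=1)  # the vector of row sums, i.e. A applied to (1, ..., 1)
    return np.linalg.norm(row_sums, ord=p) / m ** (1.0 / p)

rng = np.random.default_rng(0)
m, p = 256, 4.0
ratios = np.array([ones_vector_ratio(m, p, rng) for _ in range(50)])
mean_ratio = float(ratios.mean())

print(f"average of ||A 1||_p / ||1||_p: {mean_ratio:.3f}")
print(f"m^(1/p)                       : {m ** (1.0 / p):.3f}")
```

Each row sum is a sum of m independent signs and is therefore typically of size $\sqrt{m}$, which for p>2 is larger than $m^{1/p}$; this is consistent with the remark that the expectation of the largest p-singular value may not be $m^{1/p}$.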
For rectangular matrices, we have the following theorem on the upper tail probability of the largest p-singular value for 1<p≤2.

Theorem 2.8
Let ξ be a pregaussian variable normalized to have variance 1 and A be an m×N matrix with i.i.d. copies of ξ in its entries. Then for every 1<p≤2 and any ε>0, there exists K>0 such that: where K depends only on p, ε and the pregaussian variable ξ.
Proof. By the discrete Hölder inequality and the definition of the largest p-singular value: We also know that there exists K>0 such that: (2.24) Therefore, we have:
To have a full generalization, let us derive the following useful lemma on the relation between $s_1^{(p)}(A)$ and $s_1^{(q)}(A^T)$ in general.

Lemma 2.9
For any q≥1 and any m×N matrix A:

$$s_1^{(q)}(A^T) = s_1^{(p)}(A),$$

where $\frac{1}{p} + \frac{1}{q} = 1$.
Proof. By the discrete Hölder inequality, we know: In the same way, we also have:
We have the following remarks on the above lemma.

Remark 2.10
One can also obtain the above lemma by the operator duality on the dual spaces.
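The duality can be sanity-checked in the cases where the matrix norm is exactly computable. The sketch below reads the lemma as the usual statement $s_1^{(p)}(A) = s_1^{(q)}(A^T)$ for conjugate exponents $\frac{1}{p} + \frac{1}{q} = 1$, which is an assumption about its exact form, and tests the pairs (p, q) = (1, ∞) and (2, 2) with `numpy.linalg.norm`.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((7, 5))

# p = 1, q = infinity: maximum absolute column sum of A
# versus maximum absolute row sum of A^T.
norm_1 = np.linalg.norm(a, ord=1)
norm_inf_t = np.linalg.norm(a.T, ord=np.inf)

# p = q = 2: the spectral norm is invariant under transposition.
norm_2 = np.linalg.norm(a, ord=2)
norm_2_t = np.linalg.norm(a.T, ord=2)

# The mechanism behind the duality: the bilinear form y^T A x equals x^T A^T y,
# so the suprema over unit balls in l_p (for x) and l_q (for y) coincide.
x = rng.standard_normal(5)
y = rng.standard_normal(7)
bilinear = float(y @ (a @ x))
bilinear_t = float(x @ (a.T @ y))

print(norm_1, norm_inf_t)
print(norm_2, norm_2_t)
print(bilinear, bilinear_t)
```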

Remark 2.11
The above lemma allows us to obtain the probabilistic estimates on $s_1^{(p)}(A)$ for p>2 by taking the transpose of A and using the estimates on $s_1^{(q)}(A^T)$. Thus, using the duality lemma, we obtain the following theorems.

Theorem 2.12
(Lower tail probability of the largest p-singular value of rectangular matrices, p>2). Let ξ be a pregaussian random variable normalized to have variance 1 and A be an m×N matrix with i.i.d. copies of ξ in its entries. Then for every p>2 and any ε>0, there exists γ>0 such that: where γ depends only on p, ε and the pregaussian random variable ξ.
Moreover, we have the upper tail probability of the largest p-singular value of rectangular matrices for p>2.

Theorem 2.13
(Upper tail probability of the largest p-singular value of rectangular matrices, p>2). Let ξ be a pregaussian variable normalized to have variance 1 and A be an m×N matrix with i.i.d. copies of ξ in its entries. Then for every p>2 and any ε>0, there exists K>0 such that: where K depends only on p, ε and the pregaussian variable ξ.

Numerical Experiments
In general, matrix p-norms are NP-hard to approximate if p ≠ 1, 2, ∞ (Hendrickx and Olshevsky, 2010; Liu, 2014; Higham, 1992). In this section, however, we show the results of some numerically computable experiments on the p-singular value for p>1 and the q-singular value for 0<q≤1 of random matrices.
For p = 2, we plot the largest 2-singular value of Gaussian random matrices of size n×n, where n runs from 1 through 100. The graph in Figure 1 shows that the largest 2-singular value is $O(\sqrt{n})$.
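An experiment of this kind can be reproduced along the following lines. This is a minimal sketch rather than the script behind Figure 1: the seed and function name are arbitrary, and plotting is omitted so that the block depends only on NumPy. If the growth is of order $\sqrt{n}$, the ratio of the largest 2-singular value to $\sqrt{n}$ should stay bounded.

```python
import numpy as np

def largest_2_singular_values(sizes, seed=0):
    """Largest singular value (spectral norm) of one n x n Gaussian matrix per size n."""
    rng = np.random.default_rng(seed)
    return np.array([np.linalg.norm(rng.standard_normal((n, n)), ord=2) for n in sizes])

sizes = np.arange(1, 101)
values = largest_2_singular_values(sizes)
ratios = values / np.sqrt(sizes)

# Ratios for the ten largest sizes; they should hover around a constant.
print(np.round(ratios[-10:], 3))
```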
For p = 1, in the first numerical experiment we plot the largest 1-singular value of Gaussian random matrices of size n×n, where n runs from 1 through 100. The graph in Figure 2 shows that the largest 1-singular value is O(n).
In the second numerical experiment for p = 1, we plot the largest 1-singular value of Gaussian random matrices of size n×n, where n runs from 1 through 200. The graph in Figure 3 shows that the largest 1-singular value is O(n).
In the third experiment for p = 1, we plot the largest 1-singular value of Gaussian random matrices of size n×n, where n runs from 1 through 400. The graph in Figure 4 shows that the largest 1-singular value is O(n).
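For p = 1 the largest p-singular value is the maximum absolute column sum, so the three experiments above are cheap to reproduce. The following sketch, with an arbitrary seed and only the endpoint sizes 100, 200 and 400, checks that the value divided by n stays bounded, as linear growth would require.

```python
import numpy as np

def largest_1_singular_value(a):
    """For p = 1 the operator norm equals the maximum absolute column sum."""
    return np.abs(a).sum(axis=0).max()

rng = np.random.default_rng(0)
sizes = np.array([100, 200, 400])
matrices = [rng.standard_normal((n, n)) for n in sizes]
values = np.array([largest_1_singular_value(a) for a in matrices])
ratios = values / sizes

for n, v, r in zip(sizes, values, ratios):
    print(f"n = {n:3d}: largest 1-singular value = {v:8.2f}, divided by n = {r:.3f}")
```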
For p = ∞, we plot the largest ∞-singular value of Gaussian random matrices of size n×n, where n runs from 1 through 500. The graph in Figure 5 shows that the largest ∞-singular value is O(n). In (Higham, 1992), the p-norm of a matrix of size m by n was estimated reliably in O(mn) operations, and an algorithm that estimates the p-norm to a specific accuracy, within a factor of $n^{1-1/p}$, was provided. Using this algorithm, we plot the largest 4-singular value of Gaussian random matrices and Bernoulli random matrices of size m×n, where m and n run from 1 through 81 (Fig. 6 and 7).
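For p outside {1, 2, ∞}, a simple way to obtain numerical lower estimates of the largest p-singular value is a nonlinear power iteration built on dual vectors. The sketch below is a plain iteration of this type, written for 1<p<∞; it is not a reproduction of the algorithm of (Higham, 1992) and carries no accuracy guarantee. The function names, iteration count and matrix sizes are illustrative assumptions. Every iterate has unit $l_p$-norm, so the returned value is always a valid lower bound on $s_1^{(p)}(A)$.

```python
import numpy as np

def dual_vector(v, p):
    """Vector u with ||u||_q = 1 and u @ v = ||v||_p, where 1/p + 1/q = 1 and 1 < p < inf."""
    norm = np.linalg.norm(v, ord=p)
    if norm == 0.0:
        return np.zeros_like(v)
    return np.sign(v) * (np.abs(v) / norm) ** (p - 1.0)

def p_norm_lower_estimate(a, p, iterations=200, seed=0):
    """Nonlinear power iteration; returns a lower bound on sup ||Ax||_p / ||x||_p."""
    q = p / (p - 1.0)  # conjugate exponent
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(a.shape[1])
    x /= np.linalg.norm(x, ord=p)
    best = np.linalg.norm(a @ x, ord=p)
    for _ in range(iterations):
        z = a.T @ dual_vector(a @ x, p)
        if not np.any(z):
            break  # A x = 0 or A^T annihilates the dual vector; keep the current bound
        x = dual_vector(z, q)  # unit l_p-norm vector aligned with z
        best = max(best, np.linalg.norm(a @ x, ord=p))
    return best

rng = np.random.default_rng(1)
gaussian = rng.standard_normal((30, 30))
bernoulli = rng.choice([-1.0, 1.0], size=(30, 30))
est_gaussian = p_norm_lower_estimate(gaussian, 4.0)
est_bernoulli = p_norm_lower_estimate(bernoulli, 4.0)
print(f"p = 4 lower estimate, Gaussian : {est_gaussian:.3f}")
print(f"p = 4 lower estimate, Bernoulli: {est_bernoulli:.3f}")
```

For p = 2 the iteration reduces to the classical power method on $A^T A$, which gives a convenient correctness check against the spectral norm.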