Analysis of Unaided Vision Data using New Decomposition of Symmetry

Problem statement: First, this study considers how the structure of symmetry for probabilities is decomposed into two structures. Secondly, this study infers the structure of unknown probabilities, which indicates how the right eye is better (or worse) than the left eye, for three sets of data on unaided distance vision of (1) women in Britain, (2) students at a university in Japan and (3) pupils in elementary schools in Tokyo, Japan. This study proposes a new decomposition of the symmetry model for probabilities and analyzes these vision data using the decomposition. Approach: This study considers a new decomposition theorem giving conditions under which the symmetry model (which indicates that the right eye vision is symmetric to the left eye vision) holds for the probabilities. Also this study analyzes the vision data using this decomposition. Results: From the statistical approach, we can see that (1) for the vision data of women, the right eye is better than the left eye and the mean of the right eye is not equal to the mean of the left eye, (2) for the vision data of students, the right eye is worse than the left eye and the mean of the right eye is not equal to the mean of the left eye and (3) for the vision data of pupils, the right eye is symmetric to the left eye and the mean of the right eye is equal to the mean of the left eye. Conclusion: When the symmetry model fits the data poorly, this new decomposition is useful for seeing which of the two decomposed models has the stronger influence. We can see the structure of asymmetry for vision data in more detail.


INTRODUCTION
Consider three sets of data on unaided distance vision of (1) 7477 women aged 30-39 employed in Royal Ordnance factories in Britain from 1943 to 1946 (Table 1), (2) 4746 students aged 18 to about 25, including about 10% women, in the Faculty of Science and Technology, Science University of Tokyo in Japan, examined in April 1982 (Table 2) and (3) 3168 pupils comprising nearly equal numbers of boys and girls aged 6-12 at elementary schools in Tokyo, Japan, examined in June 1984 (Table 3). In these data, the row variable is the right eye grade and the column variable is the left eye grade, with the categories ordered from the Best (1) to the Worst (4). The data in Table 1 have been analyzed by many statisticians, including Stuart (1955); Bishop et al. (2007); McCullagh (1978); Goodman (1979); Agresti (1983); Tomizawa (1985; 1993); Miyamoto et al. (2004); Tomizawa et al. (2006) and Tomizawa and Tahata (2007). The data in Table 2 have been analyzed by Tomizawa (1984; 1985). The data in Table 3 have also been analyzed by Tomizawa (1985) and Miyamoto et al. (2004).
For these vision data, an individual's right eye grade is strongly associated with his/her left eye grade because many observations concentrate on (or near) the main diagonal cells in each table. Therefore, instead of independence between an individual's right eye grade and his/her left eye grade, we are interested in whether or not an individual's right eye grade is symmetric to his/her left eye grade and in how both eyes are symmetric or asymmetric, for example, whether or not the mean of the right eye grade is equal to the mean of the left eye grade. In addition, if the probability structure of symmetry between the right eye vision and the left eye vision does not hold in a table, we are interested in finding the reason by considering a decomposition of the structure of symmetry. Many models of symmetry and asymmetry for probabilities have been proposed, and several decompositions of the symmetry model have been given.
Table 1: Unaided distance vision of 7477 women aged 30-39 employed in Royal Ordnance factories in Britain from 1943-1946; from Stuart (1955). (The parenthesized values are the maximum likelihood estimates of expected frequencies under the CLDPS model.) Rows: right eye grade; columns: left eye grade; categories: Best (1), Second (2), Third (3), Worst (4).
The purpose of this study is (1) to review some models and decompositions, (2) to give a new decomposition of symmetry and (3) to analyze three sets of vision data using the new decomposition.

Reviews of models and decompositions:
Consider an $R \times R$ square contingency table with the same row and column classifications. Let $p_{ij}$ denote the probability that an observation will fall in the $i$th row and $j$th column of the table ($i = 1, \ldots, R$; $j = 1, \ldots, R$). The symmetry (S) model is defined by:
$$p_{ij} = p_{ji} \quad (i \neq j)$$
(Bowker, 1948). For the vision data, this indicates that the probability that an individual's right eye grade is $i$ and left eye grade is $j$ is equal to the probability that his/her right eye grade is $j$ and left eye grade is $i$. The quasi-symmetry (QS) model is defined by:
$$\delta_i \, p_{ij} = \delta_j \, p_{ji} \quad (i < j)$$
(Caussinus, 1965). A special case of this model with $\{\delta_i = \delta\}$ is the S model. Although the details are omitted, the QS model indicates the symmetry of odds ratios with respect to the main diagonal of the table.
The marginal homogeneity (MH) model is defined by:
$$p_{i\cdot} = p_{\cdot i} \quad (i = 1, \ldots, R),$$
where $p_{i\cdot} = \sum_{t=1}^{R} p_{it}$ and $p_{\cdot i} = \sum_{s=1}^{R} p_{si}$ (Stuart, 1955). For the vision data, this indicates that the probability that an individual's right eye grade is $i$ is equal to the probability that his/her left eye grade is $i$ ($i = 1, \ldots, 4$). Caussinus (1965) gave the decomposition of the S model as follows:
Theorem 1: The S model holds if and only if both the QS and MH models hold.
Each of the S, QS and MH models indicates a structure of symmetry for the probabilities in a square table such as the vision data. As a model which indicates a structure of asymmetry (instead of symmetry), the linear diagonals-parameter symmetry (LDPS) model is given as:
$$p_{ij} = \theta^{j-i} \, p_{ji} \quad (i < j)$$
(Agresti, 1983). For the vision data, this indicates that the probability that an individual's right eye grade is $i$ and left eye grade is $j$ ($> i$) is $\theta^{j-i}$ times higher than the probability that his/her right eye grade is $j$ and left eye grade is $i$. If $\theta > 1$, then the right eye tends to be better than the left eye. A special case of the LDPS model with $\theta = 1$ is the S model. Also the LDPS model is a special case of the QS model obtained by putting $\{\delta_i = \theta^i\}$.
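The LDPS structure can be illustrated with a minimal Python sketch. It builds a probability table in which each cell above the main diagonal is $\theta^{j-i}$ times its mirror cell and checks that $\theta = 1$ reduces to symmetry. The function names `ldps_table` and `is_symmetric` and the weights `psi` are hypothetical and are not taken from the vision data.

```python
def ldps_table(psi, theta):
    """Build a table with p_ij = theta**(j - i) * p_ji for i < j.

    psi is a symmetric R x R list of positive weights (hypothetical values);
    the result is normalised so that all cells sum to 1.
    """
    R = len(psi)
    raw = [[0.0] * R for _ in range(R)]
    for i in range(R):
        for j in range(R):
            # Cells above the main diagonal get the extra factor theta**(j - i).
            factor = theta ** (j - i) if i < j else 1.0
            raw[i][j] = psi[i][j] * factor
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]


def is_symmetric(p, tol=1e-12):
    """Return True if p_ij equals p_ji for every pair of cells."""
    R = len(p)
    return all(abs(p[i][j] - p[j][i]) <= tol
               for i in range(R) for j in range(R))


# Hypothetical symmetric weights for a 3 x 3 table (illustration only).
psi = [[5.0, 2.0, 1.0],
       [2.0, 6.0, 2.0],
       [1.0, 2.0, 4.0]]

p = ldps_table(psi, theta=1.2)
# Ratio p_ij / p_ji for each pair i < j (0-based indices).
ratios = {(i, j): p[i][j] / p[j][i]
          for i in range(3) for j in range(i + 1, 3)}
```

In this sketch every ratio depends only on the distance $j - i$ from the main diagonal, which is the defining feature of the LDPS model.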
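Let $X$ and $Y$ denote the row and column variables, respectively. For the vision data, $X$ is the right eye grade and $Y$ is the left eye grade. The mean equality (ME) model of $X$ and $Y$ is given by:
$$E(X) = E(Y),$$
where
$$E(X) = \sum_{i=1}^{R} \sum_{j=1}^{R} i \, p_{ij}, \qquad E(Y) = \sum_{i=1}^{R} \sum_{j=1}^{R} j \, p_{ij}.$$
Let
$$G_{ij} = \sum_{s=1}^{i} \sum_{t=j}^{R} p_{st} = P(X \le i, Y \ge j) \quad (i < j), \qquad G_{ij} = \sum_{s=i}^{R} \sum_{t=1}^{j} p_{st} = P(X \ge i, Y \le j) \quad (i > j).$$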
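As a minimal sketch, the two means and the cumulative probabilities can be computed directly from a table of cell probabilities. Here $G_{ij}$ is taken to be $P(X \le i, Y \ge j)$ for $i < j$ and $P(X \ge i, Y \le j)$ for $i > j$, matching the verbal description of the cumulative probabilities in the text. The sketch also checks numerically the identity $E(X) - E(Y) = \sum_{i=1}^{R-1} (G_{i+1,i} - G_{i,i+1})$, which follows from writing each mean in terms of the marginal cumulative probabilities. The table `p` and the helper names are hypothetical.

```python
def means(p):
    """Return (E(X), E(Y)) for a table p of cell probabilities, grades 1..R."""
    R = len(p)
    ex = sum((i + 1) * p[i][j] for i in range(R) for j in range(R))
    ey = sum((j + 1) * p[i][j] for i in range(R) for j in range(R))
    return ex, ey


def G(p, i, j):
    """Cumulative probability G_ij for 1-based grades i != j."""
    R = len(p)
    if i < j:
        # P(X <= i, Y >= j)
        return sum(p[s][t] for s in range(i) for t in range(j - 1, R))
    if i > j:
        # P(X >= i, Y <= j)
        return sum(p[s][t] for s in range(i - 1, R) for t in range(j))
    raise ValueError("G_ij is defined only for i != j")


# Hypothetical 3 x 3 table of cell probabilities (illustration only).
counts = [[30, 8, 2],
          [5, 25, 6],
          [1, 4, 19]]
n = sum(sum(row) for row in counts)
p = [[c / n for c in row] for row in counts]

ex, ey = means(p)
diff_means = ex - ey
# Sum over adjacent cut points of G_{i+1,i} - G_{i,i+1}.
diff_G = sum(G(p, i + 1, i) - G(p, i, i + 1) for i in range(1, len(p)))
```

Under these assumptions `diff_means` and `diff_G` agree up to rounding, so the ME model can be read as a statement about the adjacent cumulative probabilities.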
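For the vision data, (1) $G_{ij}$ for $i < j$ is the cumulative probability that an individual's right eye grade is $i$ or below and his/her left eye grade is $j$ ($> i$) or above and (2) $G_{ji}$ for $i < j$ is the cumulative probability that an individual's left eye grade is $i$ or below and his/her right eye grade is $j$ ($> i$) or above. Miyamoto et al. (2004) considered the cumulative quasi-symmetry (CQS) model defined by:
$$\gamma_i \, G_{ij} = \gamma_j \, G_{ji} \quad (i < j).$$
The CQS model is different from the QS model. Miyamoto et al. (2004) also considered the cumulative linear diagonals-parameter symmetry (CLDPS) model defined by:
$$G_{ij} = \Delta^{j-i} \, G_{ji} \quad (i < j).$$
The CLDPS model is different from the LDPS model. The CLDPS model is a special case of the CQS model obtained by putting $\{\gamma_i = \Delta^i\}$. For the vision data, the CLDPS model indicates that the probability that an individual's right eye grade is $i$ or below and his/her left eye grade is $j$ ($> i$) or above is $\Delta^{j-i}$ times higher than the probability that an individual's left eye grade is $i$ or below and his/her right eye grade is $j$ ($> i$) or above. If $\Delta > 1$, then the right eye tends to be better than the left eye. Yamamoto et al. (2011) gave the following theorem: [...]
The new decomposition of the S model is given by the following theorem.
Theorem 4: The S model holds if and only if both the CLDPS and ME models hold.
Proof: If the S model holds, then both the CLDPS and ME models hold. Conversely, assume that the CLDPS and ME models hold; we shall show that the S model holds. We see:
$$E(X) = R - \sum_{i=1}^{R-1} P(X \le i), \qquad P(X \le i) = P(X \le i, Y \le i) + G_{i,i+1}.$$
Similarly we see:
$$E(Y) = R - \sum_{i=1}^{R-1} P(Y \le i), \qquad P(Y \le i) = P(X \le i, Y \le i) + G_{i+1,i}.$$
Thus we see:
$$E(X) - E(Y) = \sum_{i=1}^{R-1} \left( G_{i+1,i} - G_{i,i+1} \right) = (1 - \Delta) \sum_{i=1}^{R-1} G_{i+1,i}.$$
Since $E(X) = E(Y)$ under the ME model and $\sum_{i=1}^{R-1} G_{i+1,i} > 0$, we obtain $\Delta = 1$, so that $G_{ij} = G_{ji}$ for all $i < j$. Because $p_{ij} = G_{ij} - G_{i-1,j} - G_{i,j+1} + G_{i-1,j+1}$ for $i < j$ (with $G_{0,j} = G_{i,R+1} = 0$) and the analogous identity holds for $p_{ji}$, the S model holds.
Let $n_{ij}$ denote the observed frequency in the $i$th row and $j$th column of the table and let $m_{ij}$ denote the corresponding expected frequency. Each model can be tested for goodness of fit by the likelihood ratio chi-squared statistic
$$G^2 = 2 \sum_{i=1}^{R} \sum_{j=1}^{R} n_{ij} \log \left( \frac{n_{ij}}{\hat{m}_{ij}} \right),$$
where $\hat{m}_{ij}$ is the maximum likelihood estimate of $m_{ij}$ under the model. The numbers of degrees of freedom for the S, CLDPS and ME models in Theorem 4 are $R(R-1)/2$, $(R+1)(R-2)/2$ and 1, respectively. Note that the number of degrees of freedom for the S model equals the sum of those for the CLDPS and ME models.
We shall analyze the vision data in Tables 1-3 using Theorem 4. Table 4 gives the values of the likelihood ratio test statistic $G^2$ for each model.

Analysis of vision data of women in Table 1:
We see from Table 4 that each of the QS, LDPS and CLDPS models fits the vision data of women (Table 1) well; however, each of the S, MH, CQS and ME models fits these data poorly.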
Since the S model does not hold for these data, the probability that a woman's right eye grade is i and her left eye grade is j (≠ i) is not equal to the probability that the woman's right eye grade is j and her left eye grade is i. Namely, a woman's right eye grade is not symmetric to her left eye grade.
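How a poor fit of the S model is detected can be sketched in Python. The sketch computes the likelihood ratio statistic $G^2$ for the S model, using the standard closed-form maximum likelihood estimates $\hat{m}_{ij} = (n_{ij} + n_{ji})/2$ of the expected frequencies under symmetry. The counts below are hypothetical and are not the values of Table 1, and the function name `g2_symmetry` is an assumption made for illustration.

```python
import math


def g2_symmetry(n):
    """Likelihood ratio statistic G^2 and degrees of freedom for the S model.

    n is an R x R list of observed counts. Under symmetry the estimated
    expected frequency of an off-diagonal cell is (n_ij + n_ji) / 2, and
    diagonal cells are fitted exactly, so they contribute nothing to G^2.
    """
    R = len(n)
    g2 = 0.0
    for i in range(R):
        for j in range(R):
            if i == j or n[i][j] == 0:
                continue  # zero counts contribute 0 to the sum
            m_hat = (n[i][j] + n[j][i]) / 2.0
            g2 += 2.0 * n[i][j] * math.log(n[i][j] / m_hat)
    df = R * (R - 1) // 2
    return g2, df


# Hypothetical 3 x 3 table of counts (illustration only).
counts = [[30, 8, 2],
          [5, 25, 6],
          [1, 4, 19]]
g2, df = g2_symmetry(counts)

# A perfectly symmetric table gives G^2 = 0.
g2_sym, _ = g2_symmetry([[10, 3, 1],
                         [3, 12, 2],
                         [1, 2, 9]])
```

A large value of `g2` relative to the chi-squared distribution with `df` degrees of freedom indicates that the S model fits poorly; for a $4 \times 4$ table such as the vision data, `df` is 6.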
Under the CLDPS model, let $\hat{\Delta}$ denote the maximum likelihood estimate of $\Delta$. Since the CLDPS model holds for these data, the probability that a woman's right eye grade is $i$ or below and her left eye grade is $j$ ($> i$) or above is estimated to be $\hat{\Delta}^{j-i}$ times higher than the probability that the woman's left eye grade is $i$ or below and her right eye grade is $j$ or above. Since $\hat{\Delta} > 1$, a woman's right eye grade is estimated to be better than her left eye grade.
Since the ME model does not hold for these data, the mean of women's right eye grades is not equal to the mean of women's left eye grades. Using the sample proportions, we see that the mean of women's right eye grades is estimated to be 2.275 and the mean of women's left eye grades is estimated to be 2.305. Therefore a woman's right eye grade is expected to be better than her left eye grade in the sense of the mean.
We see from Theorem 4 that the poor fit of the S model is caused by the lack of the structure of the ME model rather than that of the CLDPS model. Namely, the fact that a woman's right eye grade is not symmetric to her left eye grade is caused by the fact that the mean of women's right eye grades is not equal to the mean of women's left eye grades.

Analysis of vision data of students in Table 2:
We see from Table 4 that each of the QS, LDPS, CQS and CLDPS models fits the vision data of students (Table 2) well; however, each of the S, MH and ME models fits these data poorly. Since the S model does not hold for these data, the probability that a student's right eye grade is $i$ and his/her left eye grade is $j$ ($\neq i$) is not equal to the probability that the student's right eye grade is $j$ and his/her left eye grade is $i$. Namely, a student's right eye grade is not symmetric to his/her left eye grade.
Under the CLDPS model, let $\hat{\Delta}$ denote the maximum likelihood estimate of $\Delta$. Since the CLDPS model holds for these data, the probability that a student's right eye grade is $i$ or below and his/her left eye grade is $j$ ($> i$) or above is estimated to be $\hat{\Delta}^{j-i}$ times higher than the probability that the student's left eye grade is $i$ or below and his/her right eye grade is $j$ or above. Since $\hat{\Delta} < 1$, a student's right eye grade is estimated to be worse than his/her left eye grade.
Since the ME model does not hold for these data, the mean of students' right eye grades is not equal to the mean of students' left eye grades. Using the sample proportions, we see that the mean of students' right eye grades is estimated to be 2.631 and the mean of students' left eye grades is estimated to be 2.602. Therefore a student's right eye grade is expected to be worse than his/her left eye grade in the sense of the mean.
We see from Theorem 4 that the poor fit of the S model is caused by the lack of the structure of the ME model rather than that of the CLDPS model. Namely, the fact that a student's right eye grade is not symmetric to his/her left eye grade is caused by the fact that the mean of students' right eye grades is not equal to the mean of students' left eye grades.

Analysis of vision data of pupils in Table 3:
We see from Table 4 that all models fit the vision data of pupils (Table 3) well. Since the S model holds for these data, the probability that a pupil's right eye grade is $i$ and his/her left eye grade is $j$ ($\neq i$) is equal to the probability that the pupil's right eye grade is $j$ and his/her left eye grade is $i$. Namely, a pupil's right eye grade is symmetric to his/her left eye grade.
Since the CLDPS model fits these data well, we shall test the hypothesis $\Delta = 1$ (i.e., the hypothesis that the S model holds) under the assumption that the CLDPS model holds. It can be tested using the difference between the likelihood ratio statistic $G^2$ for the S model and that for the CLDPS model. The difference is 1.88 with 1 degree of freedom. Therefore we can accept the hypothesis $\Delta = 1$ in the CLDPS model at the 0.05 significance level.
Since the ME model fits these data well, we shall also test the hypothesis that the S model holds under the assumption that the ME model holds. The difference between the $G^2$ value for the S model and that for the ME model is 8.23 with 5 degrees of freedom. Therefore we can accept the hypothesis that the S model holds under the ME model at the 0.05 significance level. So, for the pupils' vision data, we prefer the S model to each of the CLDPS and ME models.
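The two conditional tests above can be checked with a minimal Python sketch that evaluates the upper-tail probability of the chi-squared distribution for the reported differences in $G^2$ (1.88 with 1 degree of freedom and 8.23 with 5 degrees of freedom). The helper `chi2_sf` is a hypothetical name; it uses the closed-form series for integer degrees of freedom so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi-squared with df d.f. > x), integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus
    # sqrt(2/pi) * exp(-x/2) * sum_{j=1}^{(df-1)/2} x^(j - 1/2) / (1*3*...*(2j-1))
    total = 0.0
    term = math.sqrt(x)  # j = 1 term: x^(1/2) / 1
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2.0 / math.pi) * math.exp(-half) * total)


# Differences in G^2 reported in the text for the pupils' data.
p_s_given_cldps = chi2_sf(1.88, 1)  # S versus CLDPS, 1 degree of freedom
p_s_given_me = chi2_sf(8.23, 5)     # S versus ME, 5 degrees of freedom

# The hypothesis is retained when the tail probability exceeds 0.05.
keep_s_under_cldps = p_s_given_cldps > 0.05
keep_s_under_me = p_s_given_me > 0.05
```

Both tail probabilities exceed 0.05, in agreement with the conclusion that the S model is not rejected against either the CLDPS or the ME model for these data.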
Since the ME model fits these data well, the mean of pupils' right eye grades is equal to the mean of pupils' left eye grades. Under the ME model, the maximum likelihood estimate of the mean of pupils' right (left) eye grades is 1.301. This value is less than 2.5, the midpoint of the categories. Therefore a pupil's right (left) eye grade is expected to be close to Best (1) or Second (2), rather than Third (3) or Worst (4).

DISCUSSION
Theorems 1, 2, 3 and 4 may be useful for exploring the reason for the poor fit when the S model fits the data poorly. In particular, Theorem 4 would be useful when we are interested in the asymmetric structure of the cumulative probabilities $\{G_{ij}\}$, $i \neq j$, and the symmetry structure of the marginal means in a square table such as the vision data.

CONCLUSION
We have given a new decomposition in Theorem 4 and have analyzed three sets of unaided vision data using Theorem 4. It has been estimated that (1) for the vision data of women, the right eye is better than the left eye and the mean of the right eye is not equal to the mean of the left eye, (2) for the vision data of students, the right eye is worse than the left eye and the mean of the right eye is not equal to the mean of the left eye and (3) for the vision data of pupils, the right eye is symmetric to the left eye and the mean of the right eye is equal to the mean of the left eye, which is expected to be close to 'Best' or 'Second' rather than 'Third' or 'Worst'.