Lower-Dimensional Feature Sets for Template-Based Motion Recognition Approaches
© 2010 Science Publications

Problem statement: In template-based motion recognition approaches, feature sets are computed from the template for classification. Hu invariants have been widely employed for this purpose since their inception. However, lower-dimensional feature vector sets are needed for faster computation together with robust recognition. Since their introduction, all seven Hu moments have been employed by many researchers without asking why seven are needed and whether fewer would suffice. Approach: In this study, we analyzed feature sets built from different numbers of Hu moments and reasoned, based on the characteristics of central moments, that it is not necessary to employ all seven moments in every application; using fewer reduces the computational cost and speeds up recognition. Results: Based on the various feature vector sets, it is evident that lower-dimensional feature vectors can be used for our Directional Motion History Image (DMHI) method and for other methods. Conclusion: We conclude that not all seven invariants are needed; the first two or three invariants seem sufficient, as we are not reconstructing the image. Higher invariants are noisy and can therefore be ignored. For energy images, the 0th order moment provides enough information about the mass area, so the seven invariants need not be calculated for them.


INTRODUCTION
Action understanding and motion recognition have various important applications, and the development of robust action classification methods is therefore a pressing need in computer vision. For classification and action recognition, moment invariants are widely employed (Ahad et al., 2008a) to develop feature vector sets. Moment invariants were first introduced to the pattern recognition community by Hu (1962), who employed results from the theory of algebraic invariants and derived his seven famous invariants to rotation of 2D objects. Since then, the Hu (1962) invariants have become classical, and they inspired the use of moments for image analysis, object representation and recognition. However, both for image reconstruction and for pattern recognition, researchers employ all seven invariants in shape analysis. Even though higher order moments are assumed to be less stable, detailed experimental analysis has been lacking, and hence all the invariants are computed. In general, gross image shape is represented well by the lower-order moments, while high-order moments only reflect the subtleties of a silhouette or boundary image. We find that complexity increases dramatically with increasing order and that the higher orders contain redundant information about shape.
We computed these invariants to build feature vectors with our Directional Motion History Image (DMHI) method (Ahad et al., 2008b; 2009). We employed from seven moments down to one moment, with and without the normalized 0th order moments, to create 12 different feature vectors for classification. The 0th order moment does not seem to be required in general; for energy images, however, it gives key information about the mass of the motion area. Hence, instead of using seven moments for energy images, we can consider only the 0th order moments, which provide the total object area; this seems sufficient and reduces the size of the feature vectors. These feature vectors can therefore significantly reduce the computational cost without sacrificing the recognition rates. The average recognition rates remain in a satisfactory range even with lower-dimensional feature vectors. We also present graphical demonstrations that make clear that the higher order moments are less useful for recognition. The higher the invariant, the larger its variation and the noisier and less stable its behavior. We note that the absolute error is much higher for higher order moments than for lower order moments. Therefore, for pattern recognition purposes, lower order moments are the key to sound recognition.
The idea of a reduced set of Hu moments is appealing. Since their inception, all 7 Hu moments have been employed by many researchers without asking why 7 and not fewer. We analyzed feature sets with different numbers of Hu moments and reasoned, based on the characteristics of central moments, that it is not necessary to employ all seven moments in every application, which reduces the computational cost. We therefore conclude that not all 7 invariants are needed; the first 2 or 3 invariants seem sufficient, as we are not reconstructing the image. Higher invariants are noisy and can be ignored. For energy images, the 0th order moment provides enough information about the mass area, so the seven invariants need not be calculated for them.
Background: As pointed out by Ahad et al. (2008a), Moeslund et al. (2006), Poppe (2007) and Aggarwal and Cai (1999), human activity and motion recognition has various paradigms, and various approaches are considered for pattern recognition. Hu moments are widely used for image representation or image reconstruction and serve as pattern recognition features in a number of applications, such as action recognition (Ahmad and Lee, 2006; Bradski and Davis, 2002; Bobick and Davis, 2001; Dudani et al., 1977), fingerprint verification (Yang and Park, 2007), texture classification (Campisi et al., 2004; Nacereddine and Tridi, 2005), rapid matching of video streams, data matching (Wong and Hall, 2002), character recognition (El-Khaly and Sid-Ahmed, 1990; Tsirikolias and Mertzios, 1993), image normalization (Gruber and Hsu, 1997) and estimation of the position and attitude of an object in 3-D space (Mukundan and Ramakrishnan, 1996).
In general, gross image shape is represented well by the lower-order moments, and high-order moments only reflect the subtleties of a silhouette or boundary image (Mukundan and Ramakrishnan, 1996). Hence, one may not even need all seven moment invariant functions to design a classifier (Mukundan and Ramakrishnan, 1996). Prokop and Reeves (1992) noted in their survey of moment-based techniques that most practical experiments have shown little improvement in identification performance when moment orders are increased beyond order 4 or 5 and that, in general, high order moments are very sensitive to noise and less stable (Celebi and Aslandogan, 2005; Prokop and Reeves, 1992; Shen and Ip, 1999; Teh and Chin, 1998).
In another study, on shape analysis and classification of weld defects in industrial radiography, only the first two Hu invariants are employed, along with some other geometric parameters (Nacereddine and Tridi, 2005). The first two invariants measure the spread of pixels relative to the center of mass (Nacereddine and Tridi, 2005). The first two moment invariants were also used by Hu (1962) to represent several known digitized patterns in a two-dimensional feature space. Rizon et al. (2006) used only the first invariant for object detection, claiming that from the 2nd invariant onward the invariants are insignificant. Bidoggia and Gentili (2002) pointed out that the 4th and the 5th moments have the larger standard deviations (especially for the R-G and B-Y channels in color images) and that these moments are less stable under rotation and scaling. They commented that this is not surprising, because the higher order moments are the less stable ones. To maintain good stability, they selected a smaller set of moments, composed of five moments for the grayscale image (and, for a color image, three moments in each of the R-G and B-Y images). The higher the order of the moment, the higher the fluctuations (Bidoggia and Gentili, 2002). As all the complex moments are approximate, the last moments are the least stable, because they depend on higher powers of uncertain numbers. A study by Teh and Chin (1998) showed that higher order moments are more vulnerable to white noise, which makes them undesirable for pattern recognition.
However, these invariants have several drawbacks. One is information redundancy: since the basis is not orthogonal, these moments suffer from a high degree of information redundancy (Celebi and Aslandogan, 2005). Moreover, in the presence of noise, the computed Hu invariants begin to degrade. A large variation in the dynamic range of values may also create instability: since the basis involves powers of p and q, the computed moments vary widely in dynamic range across orders, which may cause numerical instability when the image size is large (Celebi and Aslandogan, 2005). The concept of lower-dimensional feature vector sets and its relevance for action recognition with our directional motion history image method is presented below. We also employ other methods for further analysis.

MATERIALS AND METHODS
Since their inception, the seven Hu moments have been widely used. Similarly, we employ Hu moments to develop feature vectors for the Directional Motion History Image (DMHI) representations of each activity. The DMHI templates are computed based on a threshold ξ on pixel values, and the vector involved is split into four different channels according to the horizontal and vertical directions. From these DMHI templates, we obtain the corresponding binarized energy image templates, called DMEI. In this study, we employ these templates.
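The template construction described above can be illustrated with a minimal sketch. It rests on several assumptions that are not given in the text: the per-pixel motion components `flow_x` and `flow_y` (e.g., from optical flow), the maximum history value `tau` and the decay step `delta` are hypothetical names and parameters chosen for illustration. Only the threshold ξ (`xi`), the four directional channels and the binarized energy templates come from the description.

```python
import numpy as np

def update_dmhi(history, flow_x, flow_y, tau=255.0, delta=1.0, xi=1.0):
    """One illustrative update step of four directional history templates.

    history: dict with keys "x+", "x-", "y+", "y-" holding 2D float arrays.
    flow_x, flow_y: assumed per-pixel motion components (hypothetical input).
    """
    # Split the motion vector into four non-negative directional channels.
    channels = {
        "x+": np.maximum(flow_x, 0.0),
        "x-": np.maximum(-flow_x, 0.0),
        "y+": np.maximum(flow_y, 0.0),
        "y-": np.maximum(-flow_y, 0.0),
    }
    updated = {}
    for name, magnitude in channels.items():
        # Pixels without fresh motion fade by delta, but never below zero.
        decayed = np.maximum(history[name] - delta, 0.0)
        # Pixels whose channel value exceeds the threshold xi are reset to tau.
        updated[name] = np.where(magnitude > xi, tau, decayed)
    return updated

def energy_images(history):
    """Binarize each history template: 1 wherever any motion is recorded."""
    return {name: (h > 0).astype(np.uint8) for name, h in history.items()}
```

Keeping the four channels in a dictionary makes it easy to compute the same moment features for every history and energy template in turn.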
Let us define the 2D (p+q)th order Cartesian moment $m_{pq}$ of a density distribution function ρ(x,y):

$$m_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^{p}\, y^{q}\, \rho(x,y)\, dx\, dy$$

where p and q are the orders of the moment in the x and y axes, respectively. A complete moment set of order n consists of all moments $m_{pq}$ such that p+q≤n and contains $(n+1)(n+2)/2$ elements. Hu derived relative and absolute combinations of moment values that are invariant with respect to scale, position and orientation, based on the theory of algebraic invariants, which deals with the properties of certain classes of algebraic expressions that remain invariant under general linear transformations.

The Cartesian moments themselves are not invariant to geometrical transformations. To achieve invariance under translation, we need the central moment $\mu_{pq}$, which can be defined as:

$$\mu_{pq} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x-\bar{x})^{p}\, (y-\bar{y})^{q}\, \rho(x,y)\, dx\, dy$$

Here, $\bar{x} = m_{10}/m_{00}$ and $\bar{y} = m_{01}/m_{00}$. This is essentially a translated Cartesian moment, which means that the central moments are invariant under translation. The central moments of the first four orders (i.e., p+q from 0 to 3) are computed from this definition. From the second and third order moments, seven orthogonal invariants are obtained and used to calculate the feature vectors. The first six are rotation, scaling and translation invariants. The 7th is a skew (and bi-correlation) invariant, which enables it to distinguish mirror images of otherwise identical images. In the feature vectors, h denotes the motion history image components and e stands for the energy components of the DMHI method. However, we notice that for Hu moments, complexity increases dramatically with increasing order, and the higher orders contain redundant information about shape.

The normalized 0th order moment, $\frac{1}{NM}\sum_{x}\sum_{y} f(x,y)$ for an N×M template f(x,y), provides a ratio of the area. It is a measure of compactness.
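To make the moment computation concrete, the following sketch computes the seven invariants and the normalized 0th order moment for one template with NumPy. It assumes the standard formulation of Hu (1962) in terms of normalized central moments $\eta_{pq} = \mu_{pq}/\mu_{00}^{(p+q)/2+1}$; the function names `hu_invariants` and `normalized_m00` are illustrative and do not come from the study, and any scaling applied to the invariants before classification is not covered here.

```python
import numpy as np

def hu_invariants(img):
    """Seven Hu invariants of a 2D template (standard textbook formulas)."""
    img = np.asarray(img, dtype=np.float64)
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    xbar = (xs * img).sum() / m00   # centroid: m10 / m00
    ybar = (ys * img).sum() / m00   # centroid: m01 / m00

    def eta(p, q):
        # Central moment mu_pq, then scale normalization by m00.
        mu = (((xs - xbar) ** p) * ((ys - ybar) ** q) * img).sum()
        return mu / m00 ** ((p + q) / 2 + 1)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ])

def normalized_m00(img):
    """0th order moment divided by the template size (ratio of the area)."""
    img = np.asarray(img, dtype=np.float64)
    return img.sum() / img.size
```

Only the second order terms enter the first two invariants, while the third order terms enter the remaining five, which is why truncating the set to the first two or three invariants also avoids most of the higher order computation.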

RESULTS
We employ from seven moments down to one moment, with and without the normalized 0th order moments, to create 12 different feature vectors for classification (Table 1). We also consider moments only for history templates and only for energy templates, which yield different recognition results. The 0th order moment does not seem to be required in general; for energy images, however, it gives key information about the mass of the motion area. Hence, instead of using seven moments for energy images, we can consider only the 0th order moments, which provide the total object area; this seems sufficient and reduces the feature vectors from 64-D to 36-D.
For recognition, we employ our dataset of various aerobic exercises. Ten different actions (Body stretching; Waving arms, bending and straightening legs; Turning the arms; Bending the chest; Bending the body sideways (left); Bending the body front and back; Waving arms and twisting the body to the left; Bending and straightening arms up to shoulders, legs move; Bending the body diagonally to bottom, both sides; and Bending the body sideways (right)) performed by eight subjects were recorded with an uncalibrated frontal-view digital video camera under almost constant illumination in an indoor environment. No special markers, dress or arrangement were used in this experiment, and the dress, height, size and age differed from subject to subject. The frame resolution is 320×240 pixels. For classification, we use a leave-one-out cross-validation approach with the k-nearest neighbor classification method on our database. In Table 1, we present comparative recognition results for the DMHI method with the various feature vector sets, ranging from seven moments down to one moment, which make up the 12 feature vector sets (Ahad et al., 2008c). If we use 7, 6, 5 or 4 moments to calculate the feature vectors for each history and energy template, we achieve the same recognition rate. We evaluated the top five results for all recognitions and did not find any change in their distribution, as the distances among them are almost the same whether we use 7 or 4 moments. If we use only the first 3 Hu moments, however, we notice changes among the results, even though the recognition result does not vary significantly.
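The evaluation protocol can be sketched as below. This is a minimal illustration, assuming Euclidean distance and majority voting; the value of k, the distance measure and whether one sample or one subject is held out at a time are not stated in the text, and `loo_knn_accuracy` is a hypothetical helper name.

```python
import numpy as np

def loo_knn_accuracy(features, labels, k=1):
    """Leave-one-out accuracy of a k-nearest neighbor classifier."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)
    correct = 0
    for i in range(n):
        # Euclidean distance from the held-out sample to every sample.
        dists = np.linalg.norm(features - features[i], axis=1)
        dists[i] = np.inf  # exclude the held-out sample itself
        nearest = np.argsort(dists)[:k]
        # Majority vote among the k nearest neighbors.
        values, counts = np.unique(labels[nearest], return_counts=True)
        predicted = values[np.argmax(counts)]
        correct += int(predicted == labels[i])
    return correct / n
```

Running the same function on feature matrices of different dimensionality (for example, the full set and a truncated set of columns) gives a direct comparison of recognition rates across feature vector sets.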
Finally, we tried history images only and energy images only. Both give poor recognition, which shows that for better recognition we need to consider both motion history and energy templates. Reduced numbers of Hu moments are therefore suitable for the DMHI method, and in this way we can reduce the running time.
Some of these feature vector sets have 56, 40, 36, 32 and 24 dimensions. For example, FV 56D consists of the seven invariants for both history and energy templates, whereas FV 32D is composed of the seven invariants for the history images along with the normalized 0th order moments of the energy images. These feature vectors can therefore significantly reduce the computational cost without sacrificing much of the recognition rate.
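How such feature vector sets can be assembled is sketched below. The layout is an assumption made for illustration: each of the four history and four energy templates contributes a row of eight values (seven Hu invariants followed by the normalized 0th order moment), and `build_feature_vector` is a hypothetical helper. The 56D and 32D compositions follow the text, the 64D and 36D ones follow the description in the Results, and the 24D composition shown in the usage is only one possible reading.

```python
import numpy as np

def build_feature_vector(history, energy, n_hist, n_energy,
                         hist_m00=False, energy_m00=False):
    """Concatenate selected moments of 4 history and 4 energy templates.

    history, energy: arrays of shape (4, 8); columns 0..6 hold the Hu
    invariants, column 7 holds the normalized 0th order moment.
    n_hist, n_energy: number of leading Hu invariants kept per template.
    """
    history = np.asarray(history, dtype=float)
    energy = np.asarray(energy, dtype=float)
    parts = []
    for block, n_inv, use_m00 in ((history, n_hist, hist_m00),
                                  (energy, n_energy, energy_m00)):
        cols = list(range(n_inv)) + ([7] if use_m00 else [])
        parts.append(block[:, np.array(cols, dtype=int)].ravel())
    return np.concatenate(parts)
```

For instance, `build_feature_vector(h, e, 7, 7)` yields a 56D vector, `build_feature_vector(h, e, 7, 0, energy_m00=True)` a 32D vector, and `build_feature_vector(h, e, 3, 3)` a 24D vector, where `h` and `e` are the assumed (4, 8) moment tables.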
The average recognition rates remain in a satisfactory range even with lower-dimensional feature vectors. This is a valuable result for the DMHI method, because the other approaches that employ Hu invariants calculate all seven invariants, even though the higher invariants are more noise-prone. With the DMHI, lower-dimensional feature vectors also produce good recognition results, and we achieved average recognition rates of 91%. The DMHI approach therefore admits a faster recognition strategy based on lower-dimensional feature sets. In a similar fashion, we analyzed the Hierarchical Motion History Histogram (HMHH) method (Meng et al., 2006). We considered four patterns in this case and computed various feature sets. We found that the recognition rates remain almost unchanged when the set is reduced down to the first three invariants. Similarly, employing the basic Motion History Image (MHI) method of Bobick and Davis (2001), we reach similar conclusions. We therefore conclude that, for recognition, it is not necessary to employ all invariants; the first few invariants seem sufficient for template-based motion recognition methods.

DISCUSSION
Here we present various graphical demonstrations that make clear that the higher order moments are less useful for recognition purposes. The higher the invariant, the larger its variation. Figure 1 shows the variations of the 64 invariants (i.e., seven Hu invariants and the normalized 0th order moment for each of the four history images and four energy images). In Table 2 and Fig. 1-5, we use 'fv' to denote 'feature vector number'. From Fig. 2 (fv1~fv7), it is clear that the first invariant is stable compared with the higher invariants: the higher the order of the invariant, the noisier and more unstable its behavior. Figure 3 shows the normalized 0th order moments (i.e., fv57~fv64) and the 1st invariant for all eight directional-history and directional-energy templates (i.e., fv1, fv8, fv15 and fv22 are for the four DMHI templates, and fv29, fv36, fv43 and fv50 are the corresponding 1st invariants for the four DMEI templates). Figure 4 presents the graph for the 1st invariants. The y-axis range is only 0 to -0.9, and the values range from -0.77 to -0.47 (for the 1st invariants), so the variation is small. For the 6th and 7th invariants, we will see a much wider range of values.
In a similar fashion, we compared the 6th and 7th invariants for the same action performed by eight different persons. Figure 5 shows these for the 6th invariants (i.e., fv6, fv13, fv20, fv27, fv34, fv41, fv48 and fv55 for the four DMHIs and four DMEIs, respectively) and for the 7th invariants (i.e., fv7, fv14, fv21, fv28, fv35, fv42, fv49 and fv56 for the four DMHIs and four DMEIs, respectively), for both energy and history images. The y-axis range is clearly much larger (0 to -14.0) than that of Fig. 4 (0 to -0.9). The variations of the same invariant across different persons are also high, and hence these invariants are less stable. The values range from -4.4 to -13.15 for the DMEIs and from -3.55 to -10.0 for the DMHIs. These features show clearly that higher invariants are more unstable and noisy. We also analyze the absolute error for some invariants, measured as the variation of the invariants from the first person to the rest of the subjects, which demonstrates that the higher order moments are noisy and unstable. Therefore, for pattern recognition purposes, lower order moments are the key to sound recognition.
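The stability comparison described above can be sketched as follows. The exact error measure used in the study is not given, so this example assumes the mean absolute deviation from the first subject together with the per-feature range; `invariant_variation` is a hypothetical helper, and any numbers fed to it in an example are made up rather than taken from the figures.

```python
import numpy as np

def invariant_variation(values):
    """Per-feature variation across subjects for one action.

    values: array of shape (n_subjects, n_features), one row per subject.
    """
    values = np.asarray(values, dtype=float)
    # Absolute error of every other subject relative to the first subject.
    abs_error = np.abs(values[1:] - values[0])
    return {
        "mean_abs_error": abs_error.mean(axis=0),
        "range": values.max(axis=0) - values.min(axis=0),
    }
```

A feature whose mean absolute error and range are large relative to those of the first invariant is a candidate for removal from the feature vector.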

CONCLUSION
Since their inception, Hu's invariants have become classical and, despite their few drawbacks, numerous works have applied them in various application areas. The use of moments for image analysis and object representation was inspired by these invariants. We notice that in most cases, both for image reconstruction and for pattern recognition, researchers employ all seven invariants in shape analysis. Even though higher order moments are assumed to be less stable, detailed experimental analysis has been lacking, and hence all the invariants are computed.
We analyzed feature sets with different numbers of Hu moments and reasoned, based on the characteristics of central moments, that it is not necessary to employ all seven moments in every application, which reduces the computational cost and speeds up recognition. Based on the various FV sets, it is evident that lower-dimensional feature vectors can be used for the DMHI and other methods; the HMHH and MHI methods were also examined and gave similar results. We therefore conclude that not all seven invariants are needed; the first two or three invariants seem sufficient, as we are not reconstructing the image. Higher invariants are noisy and can be ignored. For energy images, the 0th order moment provides enough information about the mass area, so the seven invariants need not be calculated for them.

ACKNOWLEDGMENT
Researchers are thankful to the students who participated in creating the action database.