Study and Realization of the Basic Methods of the Calibration in Stereopsis For Augmented Reality

Augmented reality (AR) aims to enhance our perception of the real world by adding elements that are not a priori observable by the human eye. AR is defined as a system able to combine real and virtual images, in 3D and in real time. The virtual objects must be rendered consistently in the real images. In practice, objects are positioned in a reference frame attached to the scene, and the objective is to determine the viewpoint of the camera (position and orientation) in this frame, which is the purpose of the camera calibration step. Camera calibration is a fundamental problem in computer vision. In this paper, we develop a semi-automatic calibration method for our augmented reality system, which uses corner detection to extract the pixel coordinates of the projected points and uses homographies to build the correspondences. The method thus allows 3D reconstruction of the scene. To solve the problem we use projective rather than Euclidean methods, which gives good results: the average error between the re-projected points and the detected pixel points is around 2 pixels. Potential applications of this method include real-time applications in medicine, servicing of manufactured objects, and industry.


INTRODUCTION
AR is a discipline in which most of the advances have taken place over the last ten years. Image compositing systems for post-production are used today both by special-effects producers and by the general public, who can obtain several commercial software packages (the degree of interactivity and the complexity of the applications handled vary from one product to another). The camera is the essential tool around which vision in AR is built.
The virtual objects must be rendered consistently in the real images. In practice, objects are positioned in a reference frame attached to the scene, and the objective is to determine the viewpoint of the camera (position and orientation) in this frame. The most commonly used projection model is the perspective projection model, known as the pinhole model, which leads to a geometric model. It relates the formation of images on the retina of the camera to a central projection [4]. In the following we present the main steps that we have studied and implemented. To go from coordinates defined in the scene reference frame to image coordinates expressed in pixels (Fig. 2), three steps are necessary [5]: a three-dimensional displacement, a 3D-2D projection, and a change of coordinates.

MATERIALS AND METHODS
All the images are 256 × 256 pixels. The programs were developed in Matlab on a Pentium IV PC workstation.
The methods implemented in this paper are: calibration of both cameras, estimation of the projection matrix, and estimation of the fundamental matrix.
To this end, we introduce the following notions. Three-dimensional displacement: 3D points expressed in the scene reference frame undergo a change of frame to pass into the camera reference frame. This change of frame therefore involves six parameters: three for rotation and three for translation. These parameters are none other than the position and orientation of the camera; they are called "extrinsic parameters".

Change of coordinates:
To convert to pixel coordinates, the image coordinates undergo an affine transformation of the plane. This transformation involves four parameters, called "intrinsic parameters", which represent the internal characteristics of the camera, and it can be written in matrix form. The three transformations described above can be combined into a single 3 × 4 matrix M such that:

M = A I D    (4)
where D represents the three-dimensional displacement, I represents the 3D-2D projection, and A represents the change of coordinates. The matrix M is called the perspective projection matrix and represents the pinhole model of the camera. Since the entries of this matrix are unknown, the operation that consists in determining them is called camera calibration.
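As an illustration of relation (4), the following minimal sketch composes M from its three factors and projects one scene point. The zero-skew form of A, the parameter names (alpha_u, alpha_v, u0, v0) and the numerical values are assumptions made for this example; they are not taken from our Matlab programs.

```python
import numpy as np

def projection_matrix(alpha_u, alpha_v, u0, v0, R, t):
    """Compose M = A @ I @ D for a pinhole camera (zero skew assumed)."""
    # A: change of coordinates, with four intrinsic parameters (assumed form).
    A = np.array([[alpha_u, 0.0, u0],
                  [0.0, alpha_v, v0],
                  [0.0, 0.0, 1.0]])
    # I: 3D-2D projection, a 3x4 matrix that drops the homogeneous coordinate.
    I = np.hstack([np.eye(3), np.zeros((3, 1))])
    # D: three-dimensional displacement (rotation R, translation t), 4x4.
    D = np.eye(4)
    D[:3, :3] = R
    D[:3, 3] = t
    return A @ I @ D

def project(M, P):
    """Project a 3D point P = (X, Y, Z) to pixel coordinates (u, v)."""
    p = M @ np.append(P, 1.0)
    return p[:2] / p[2]  # divide by the homogeneous scale factor

# Hypothetical camera: focal scale 500 pixels, principal point at the
# centre of a 256 x 256 image, camera frame equal to the scene frame.
M = projection_matrix(500.0, 500.0, 128.0, 128.0, np.eye(3), np.zeros(3))
print(project(M, np.array([0.1, -0.2, 2.0])))  # close to (153, 78)
```

Calibration is the inverse problem: M is unknown and must be recovered from known scene points and their measured projections.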

Calibration:
Calibration is a technique for determining the parameters of the camera, which are of two types: intrinsic and extrinsic. Intrinsic parameters model the internal characteristics of the camera; extrinsic parameters represent the position and orientation of the camera relative to the real-world coordinate system. Together they define the relation that links the image reference frame to the camera reference frame [6].
The camera parameters are determined using two families of techniques, which can be combined: sensor-based techniques (active vision) and vision-based techniques (passive vision). Sensor-based techniques require dedicated equipment. They give the position and orientation of the camera directly at every instant. However, these techniques can only be used in environments of limited size, and they are also sensitive to the presence of materials that disturb the sensor (for instance, iron in the case of magnetic sensors) [7,8]. Vision-based techniques use the images captured by the acquisition system to find the camera parameters; their advantage is that they require no instrumentation of the scene. Some systems use a hybrid approach: for instance, they combine magnetic sensors with markers positioned in the scene. This gives them the precision associated with markers together with the robustness provided by sensors. Our interest lies in vision-based techniques, which include two methods. The first, called strong calibration, searches for a 3D test pattern (Fig. 3.a) or a 2D pattern (Fig. 3.b) with known characteristics in the real scene. This search can be automatic or manual. The second method estimates the perspective projection matrix by extracting natural features from two or more images using statistical methods. It is called self-calibration or weak calibration [9].
Strong calibration: Strong calibration (also called model-based calibration) relies on knowing the three-dimensional coordinates of n reference points Pi (at least six points in the case of a 3D test pattern and at least four points in the case of a 2D pattern) and their projections pi in the retinal plane, measured as pixel coordinates (for instance, using the Harris corner detector). From these 3D-2D correspondences, we shall see that it is possible to compute the entries of the projection matrix, from which the extrinsic and intrinsic parameters can later be deduced [7].
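To make the corner-detection step concrete, here is a simplified sketch of the Harris corner response. It is only an illustration: it uses a box window instead of the usual Gaussian weighting and omits thresholding and non-maximum suppression. The function name and the values of k and half are assumptions, not the settings used in our experiments.

```python
import numpy as np

def harris_response(img, k=0.04, half=1):
    """Harris corner response R = det(S) - k * trace(S)^2 at every pixel.

    S is the structure tensor summed over a (2*half+1)^2 box window.
    Corners give large positive R, edges negative R, flat areas R near 0.
    """
    Iy, Ix = np.gradient(img.astype(float))  # derivatives along rows, columns

    def box_sum(a):
        # Sum of `a` over the window centred on each pixel (edge padding).
        p = np.pad(a, half, mode="edge")
        out = np.zeros_like(a)
        for dy in range(2 * half + 1):
            for dx in range(2 * half + 1):
                out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
        return out

    Sxx = box_sum(Ix * Ix)
    Syy = box_sum(Iy * Iy)
    Sxy = box_sum(Ix * Iy)
    return Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2

# Synthetic test pattern: one bright square on a dark background.
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)
print(R[5, 5] > 0, R[10, 5] < 0)  # corner is positive, edge midpoint negative
```

In practice the pixels whose response exceeds a threshold and is a local maximum would be kept as the detected corners pi.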

3D-2D matching:
Matching the 3D coordinates with their 2D counterparts requires pairing the 3D coordinates of the corners of the squares of the test pattern with the corresponding 2D corners detected by the Harris detector. In [1], Zonglei Huang and Boubakeur Boufama propose a semi-automatic method for finding the 3D-2D matches. Two homographies are computed (Fig. 5), each of which represents the transformation from 3D points of the scene to image points.
Determining the entries of the 2 homographies requires 4 known matches for each homography. The resulting linear system is solved using SVD (Singular Value Decomposition). The image points computed with these two homographies are used as references (within an error δ of 2 pixels) for the points detected by an automatic corner detector (Harris). The error is computed as the distance between a point detected by Harris and the re-projection of the corresponding 3D coordinates.
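The matching procedure above can be sketched as follows, assuming each homography maps 2D coordinates on one plane of the test pattern to image pixels. The function names and the nearest-corner rule with a tolerance delta are illustrative assumptions; this is not the implementation of [1].

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct linear transform: find H (3x3) with dst ~ H @ src.

    src, dst: sequences of at least 4 matched (x, y) / (u, v) points.
    Each match contributes two linear equations in the entries of H.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the scale (assumes H[2, 2] is not zero)

def apply_homography(H, pts):
    """Map an (n, 2) array of points through H."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    proj = pts_h @ H.T
    return proj[:, :2] / proj[:, 2:3]

def match_corners(predicted, detected, delta=2.0):
    """Index of the nearest detected corner for each predicted point,
    or -1 when no detected corner lies within delta pixels."""
    d = np.linalg.norm(predicted[:, None, :] - detected[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    return np.where(d[np.arange(len(predicted)), nearest] <= delta, nearest, -1)

# Four manually selected matches define the homography ...
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
dst = np.array([[10.0, 20.0], [60.0, 22.0], [58.0, 70.0], [12.0, 68.0]])
H = estimate_homography(src, dst)
# ... which then predicts where the remaining pattern corners should appear.
print(apply_homography(H, np.array([[0.5, 0.5]])))
```

Each predicted point is then paired with the Harris corner found by match_corners, which gives the 3D-2D correspondences without further manual selection.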

The projection matrix:
The relation between a scene point P = (X, Y, Z, 1) and its projection p = (u, v, 1) is given by relation (8), in which p is obtained, up to a scale factor, by applying the projection matrix M to P. Equation (8) can be rewritten as a linear system (9) in the elements mij of the matrix M. To find the elements of M, at least six points are necessary to solve this system of equations, using singular value decomposition (SVD).
Stereopsis is an important method for extracting the 3D structure of a scene. It consists in deducing depth by viewing a scene with two cameras, arranged like human eyes, which provide two images (a stereoscopic system).
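Returning to the estimation of the projection matrix, the sketch below shows one way to solve for the elements mij by SVD from at least six 3D-2D correspondences, and to measure the mean re-projection error. The row layout of the linear system is the standard direct linear transform and is assumed here, since it is not reproduced in the text; the function names and test values are hypothetical.

```python
import numpy as np

def estimate_projection_matrix(P3d, p2d):
    """Estimate the 3x4 matrix M (up to scale) from n >= 6 correspondences.

    Each pair (X, Y, Z) <-> (u, v) yields two linear equations in the mij.
    """
    rows = []
    for (X, Y, Z), (u, v) in zip(P3d, p2d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    # Least-squares solution: singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return Vt[-1].reshape(3, 4)

def reproject(M, P3d):
    """Project an (n, 3) array of scene points with M."""
    Ph = np.hstack([P3d, np.ones((len(P3d), 1))])
    p = Ph @ M.T
    return p[:, :2] / p[:, 2:3]

def mean_reprojection_error(M, P3d, p2d):
    """Average pixel distance between re-projected and measured points."""
    return float(np.mean(np.linalg.norm(reproject(M, P3d) - p2d, axis=1)))

# Synthetic check with a hypothetical camera and points on two pattern planes.
A = np.array([[500.0, 0.0, 128.0], [0.0, 500.0, 128.0], [0.0, 0.0, 1.0]])
M_true = A @ np.hstack([np.eye(3), [[0.2], [-0.1], [5.0]]])
P3d = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0],
                [0, 0, 1], [0, 1, 1], [0, 2, 1], [0.5, 1.5, 0.7]], dtype=float)
p2d = reproject(M_true, P3d)
M_est = estimate_projection_matrix(P3d, p2d)
print(mean_reprojection_error(M_est, P3d, p2d))  # near zero on noise-free data
```

With real corner detections the error is not zero; it is this mean distance that is reported in pixels in the results.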
Because of the distance between the cameras, the two images of the pair are related. An object in the scene has its image shifted by a certain number of pixels from one image of the pair to the other. This shift is called the "disparity" [10]. If we can assign a disparity to every pixel of the image, we assign, by extension, a disparity to all the points of an object in the image. We are then able to place all the points of this object in space, and therefore to reconstruct the object in the scene. Indeed, the distance from the object to the cameras is inversely proportional to the disparity of the image pixels that make it up. The whole difficulty lies in matching the two images of the pair (called stereoscopic matching). One can appreciate the difficulty of the problem by considering that an everyday scene may contain a large number of objects placed at various distances. Perceiving such scenes is so natural for human beings that it is very difficult to understand how it works [10].
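The inverse proportionality between distance and disparity can be written explicitly in the simplest configuration. The relation below assumes two identical cameras with parallel optical axes; the symbols f (focal length in pixels), b (distance between the two optical centres) and u_1, u_2 (column of the same scene point in each image) are introduced here for illustration and do not appear elsewhere in the paper.

```latex
d = u_1 - u_2, \qquad Z = \frac{f\, b}{d}
```

For example, under these assumptions, with f = 500 pixels and b = 0.1 m, a disparity of 10 pixels corresponds to a depth Z = 5 m, and halving the disparity doubles the depth.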
Epipolar geometry: This technique is based on algebraic properties of projective geometry. It relies on the computation of the "fundamental matrix", which imposes geometric constraints between two views, called "epipolar constraints". Determining this matrix requires a minimum of eight point correspondences. In stereo vision, epipolar geometry, illustrated in Fig. 8, provides the following important result: given a point p1 in image 1, its stereo correspondent p2 necessarily lies on a line of image 2 that is entirely determined by p1. This line is called the epipolar line associated with p1.
Thanks to this important geometric property, the search for the stereo correspondent of a point of image 1 reduces to a 1D search in image 2 (along the associated epipolar line) rather than a 2D search (an exhaustive search over the whole of image 2).
The epipolar lines of image 2 form a pencil of lines with centre E2, the image of C1 in camera 2. E2 is called the epipole of image 2. The situation is symmetrical, and E1, the image of C2 in camera 1, is the epipole of image 1. E1 and E2 can also be regarded as the intersections of the line C1C2 with the image planes P1 and P2 respectively.
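As an illustration of these properties, the sketch below computes an epipolar line and the two epipoles from a given fundamental matrix F. It assumes the convention q^T F q' = 0, so that F q' is the line on which q lies. The function names are hypothetical, and the example F is built artificially for the demonstration.

```python
import numpy as np

def epipolar_line(F, q_prime):
    """Line (a, b, c), with a*u + b*v + c = 0, on which the match of the
    homogeneous point q_prime must lie in the other image."""
    return F @ q_prime

def null_vector(A):
    """Unit vector x minimising |A x|, taken from the SVD of A."""
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]

def epipoles(F):
    """Return (e, e_prime) with F.T @ e = 0 and F @ e_prime = 0.

    Both are scaled so that their last coordinate is 1, which assumes that
    neither epipole is at infinity (not true for rectified images).
    """
    e = null_vector(F.T)
    e_prime = null_vector(F)
    return e / e[2], e_prime / e_prime[2]

# Artificial rank-2 matrix whose epipole in the first image is (100, 50).
e_true = np.array([100.0, 50.0, 1.0])
skew = np.array([[0.0, -e_true[2], e_true[1]],
                 [e_true[2], 0.0, -e_true[0]],
                 [-e_true[1], e_true[0], 0.0]])
F = skew @ np.array([[1.0, 0.1, 5.0], [0.0, 1.2, -3.0], [0.001, 0.0, 1.0]])

e, e_prime = epipoles(F)
line = epipolar_line(F, np.array([30.0, 40.0, 1.0]))
print(e[:2], abs(e @ line))  # every epipolar line passes through the epipole
```

The last line checks numerically that the epipolar lines form a pencil centred on the epipole.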

The fundamental matrix:
The fundamental matrix is the algebraic representation of epipolar geometry. Its main property concerns a pair of images: a point in one image corresponds to a line in the other, and this duality corresponds to the projection of the projection ray of the point. The fundamental matrix F satisfies the condition that, for every pair (q, q') of points matched by the stereoscopic system, we have the relation q^T F q' = 0. This expresses the fact that the point q lies on the line F q', and reciprocally. Writing F = (fij) and q = (u, v, 1), the previous equation becomes a linear system (12) in the fij. This system is solved by applying SVD.
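A minimal sketch of this estimation is given below, assuming the convention q^T F q' = 0 and at least eight matches. Two points are additions for the example and are not described in the text: the rank-2 projection of the result, and the choice to leave out the coordinate normalisation that is usually recommended with noisy pixel data.

```python
import numpy as np

def estimate_fundamental(q, q_prime):
    """Eight-point estimate of F such that q_i^T F q'_i = 0.

    q, q_prime: (n, 2) arrays of matched points, n >= 8.
    """
    n = len(q)
    qh = np.hstack([q, np.ones((n, 1))])
    qph = np.hstack([q_prime, np.ones((n, 1))])
    # Row i holds the 9 products q_i[a] * q'_i[b], matching F in row-major order.
    A = np.einsum("ni,nj->nij", qh, qph).reshape(n, 9)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 so that all epipolar lines meet at the epipole.
    U, s, Vt2 = np.linalg.svd(F)
    s[2] = 0.0
    return U @ np.diag(s) @ Vt2

# Synthetic stereo pair: 12 random scene points seen by two hypothetical
# cameras expressed in normalised image coordinates.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, (12, 3)) + np.array([0.0, 0.0, 6.0])
a = 0.1
R2 = np.array([[np.cos(a), 0.0, np.sin(a)],
               [0.0, 1.0, 0.0],
               [-np.sin(a), 0.0, np.cos(a)]])
Y = X @ R2.T + np.array([-1.0, 0.05, 0.1])
q = X[:, :2] / X[:, 2:3]
q_prime = Y[:, :2] / Y[:, 2:3]

F = estimate_fundamental(q, q_prime)
qh = np.hstack([q, np.ones((12, 1))])
qph = np.hstack([q_prime, np.ones((12, 1))])
print(np.abs(np.einsum("ni,ij,nj->n", qh, F, qph)).max())  # residual near zero
```

With more than eight matches the same code returns the least-squares solution of the system.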
Rectification: There is an interesting special case (in epipolar geometry) in which the epipolar pencils are made of parallel lines in both images at the same time: in that case, the search for corresponding pixels is greatly simplified. This occurs only if the line C1C2 is parallel to the image planes of both cameras. In practice, this configuration is very difficult to achieve mechanically, but image rectification makes it possible to reduce to it by transforming the images a posteriori. Note that the rectification step introduces image distortions [11].
To determine the correspondent of a pixel of the first image in the second, it is of course not enough to compare pixel intensities pairwise: the similarity between two pixels is measured by a correlation score computed over their neighborhoods (in practice, the neighborhood is a rectangular window centered on the point under consideration). Given a pixel in the first image and its neighborhood, its correspondent in the second image is the pixel that maximizes (or minimizes, depending on the correlation criterion used) the correlation score along the line (Fig. 9). There are many correlation criteria; the choice of criterion is often empirical [11].
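The correlation search can be sketched as follows for a rectified pair, where the epipolar line of a pixel is simply the same row in the other image. The zero-mean normalised cross-correlation used here is only one possible criterion. The window size, the search range and the assumption that the match lies at the same column or to the left are choices made for the example.

```python
import numpy as np

def zncc(a, b):
    """Zero-mean normalised cross-correlation of two equal-size windows.
    The score lies in [-1, 1]; 1 means identical up to gain and offset."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def match_along_row(img1, img2, row, col, half=3, max_disp=20):
    """Find the column in img2, on the same row, whose window best matches
    the window centred on (row, col) in img1.

    Assumes rectified images and a pixel at least `half` pixels away
    from the image borders.
    """
    ref = img1[row - half:row + half + 1, col - half:col + half + 1]
    best_col, best_score = -1, -np.inf
    for c in range(max(half, col - max_disp), col + 1):
        cand = img2[row - half:row + half + 1, c - half:c + half + 1]
        score = zncc(ref, cand)
        if score > best_score:  # this criterion is maximised
            best_col, best_score = c, score
    return best_col, best_score

# Synthetic rectified pair: the second image is the first shifted by 5 pixels.
rng = np.random.default_rng(1)
left = rng.random((64, 64))
right = np.roll(left, -5, axis=1)
print(match_along_row(left, right, 20, 30))  # best column 25, score close to 1
```

The disparity of the pixel is then the difference between its column and the best matching column, here 5 pixels.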

RESULTS
The results obtained are the entries of the eight projection matrices computed from the images shown in Fig. 10.
Once the image coordinates of the corners are identified and the correspondences between image points and space points are established, the camera calibration routine is called to compute the projection matrix. When the user clicks on the calibration button in the interface, the program performs the calibration using the correspondence data. Table 1 contains the calibration result, the projection matrix (Fig. 11). The average error between the re-projected points and the pixel points is around 2 pixels.
Once the entries of both projection matrices have been determined, the fundamental matrix between the two images can be estimated. Fig. 12.b shows the epipolar line on which the correspondent of the point shown in Fig. 12.a lies.
Fig. 12: (a) the point of interest; (b) the epipolar line.
Using what was explained above, we were able to insert two objects, as shown in Fig. 13, while taking the occlusion problem into account.

CONCLUSION
Epipolar geometry makes it possible to match the images of a sequence in pairs. In an augmented reality system, the virtual objects must be rendered consistently in the real images. In practice, objects are positioned in a reference frame attached to the scene, and the objective is to determine the viewpoint of the camera (position and orientation) in this frame. We can also conclude that, besides these extrinsic parameters, we must know the intrinsic characteristics of the camera, such as the focal length, which define the projection of the virtual object onto the image plane of the camera. Moreover, determining the position of the features to be detected in the images of the sequence necessarily requires a specific tool. Our calibration method gives an error of about 2 pixels.