Description of 2D and 3D Coordinate Systems and
Derivation of their Rotation Matrices
Conventions:
In a 3D coordinate system, X_{s}, Y_{s}, Z_{s} will be used for object coordinates in the scanner coordinate system. This is the coordinate system from which the transformation is made.
X_{c}, Y_{c}, Z_{c} will be used for the object coordinates expressed in the camera coordinate system after they have been scaled by the camera lens. The Z_{c} value for the object will be the location of the image plane in the camera (the CCD sensor) and will be equal to the focal length of the camera, f (at the CCD sensor, Z_{c} = f). This is the coordinate system to which the scanner coordinates are transformed. X_{i}, Y_{i}, Z_{i} will be used to designate the object coordinates after they are transformed to a coordinate system which has as its origin the camera coordinate system origin, but before scaling by the lens occurs. X_{i}, Y_{i}, Z_{i} are unaffected by the camera lens optics. x, y, z will be used for the translation needed to move the origin of the scanner coordinate system to the origin of the camera coordinate system.
The derivation will first be explained using a 2D example. In a 2D planar coordinate system, X_{s} and Y_{s} will be used for the coordinate system that corresponds to the scanner coordinates, that is, the system from which the transformation is made. X_{c} and Y_{c} will be used for the coordinate system that corresponds to the camera coordinates, that is, the system to which the X and Y coordinates are transformed. The terms camera and scanner are used here only to maintain continuity between the 2D and the 3D derivations. A 2D transformation does not apply to an actual transformation between scanner and camera coordinates.
In a 3D coordinate system, Omega (ω) will describe rotation about the X-axis, Phi (Φ) will describe rotation about the Y-axis, and Kappa (κ) will describe rotation about the Z-axis. Theta (θ) will describe rotation in a 2D planar coordinate system.
Derivation of 2D transformation
In a 2D planar coordinate system, a counterclockwise rotation from the scanner coordinates to the camera coordinates can be accomplished with the following transformation matrix, assuming that the origins of the two coordinate systems are located at the same spot. In a counterclockwise system, positive rotation is in the counterclockwise direction; negative rotation is in the clockwise direction.

    [x]   [ cosθ  sinθ] [X]
    [y] = [-sinθ  cosθ] [Y]
Counterclockwise rotation will be used in all transformations. This appears to be the most commonly used convention, although the transformations can be performed equally well with a clockwise convention. Counterclockwise rotation may be the preferable convention because the cross product of two vectors that have been defined relative to each other in a counterclockwise convention points toward the viewer. That is, with the two vectors in the horizontal plane, the cross product is positive if it points up.
The black coordinate axes in Figure 1 are rotated counterclockwise to align them with the blue coordinate axes.
Figure 1. Counterclockwise rotation of coordinate systems,
transformation of X, Y to x, y. Point P is transformed from the
black coordinate system to the blue coordinate system.
Conversion from one coordinate system to the other is derived below.
Y = Xb + bP
x = oa + ab + bx
  = X*cosθ + Xb*sinθ + bP*sinθ
  = X*cosθ + (Xb + bP)*sinθ
  = X*cosθ + Y*sinθ
y = od - aX
  = Y*cosθ - X*sinθ
  = -X*sinθ + Y*cosθ
The above equations for x and y are expressed below in matrix notation.

    [x]   [ cosθ  sinθ] [X]
    [y] = [-sinθ  cosθ] [Y]
If the two coordinate systems have their origins separated by ΔX and ΔY distances, then a translation of the origin of the XY coordinate system must be made to the xy origin by adding the difference between their locations.

    [x]   [ cosθ  sinθ] [X + ΔX]
    [y] = [-sinθ  cosθ] [Y + ΔY]
The transformation matrix is a 2×2 matrix.
A clockwise rotation from the scanner coordinates to the camera coordinates will use the following transformation matrix. The only difference is that the signs for sinθ are reversed.

    [x]   [cosθ  -sinθ] [X]
    [y] = [sinθ   cosθ] [Y]
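The 2D counterclockwise transformation above can be sketched in code. A minimal Python example follows; the function name is illustrative, and the translation-by-addition-before-rotation convention is an assumption consistent with the 2D description in the text:

```python
import math

def scanner_to_camera_2d(X, Y, theta, dX=0.0, dY=0.0):
    """Transform a point from the (X, Y) system to the (x, y) system:
    translate the origin by (dX, dY), then apply the counterclockwise
    rotation matrix [[cos, sin], [-sin, cos]]."""
    Xt, Yt = X + dX, Y + dY
    x = Xt * math.cos(theta) + Yt * math.sin(theta)
    y = -Xt * math.sin(theta) + Yt * math.cos(theta)
    return x, y
```

For example, rotating the axes 90° counterclockwise carries a point on the positive X-axis to (0, -1) in the rotated system.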
Derivation of the 3D transformation matrix
In a 3D coordinate system the terms right- and left-hand coordinate systems are used. This method of defining a 3D coordinate system has the positive direction of the X-axis aligned with the thumb, the positive direction of the Y-axis aligned with the index finger, and the Z-axis aligned with the middle finger. The thumb and index finger are spread at 90° and aligned with the plane of the palm. The middle finger is bent at 90° to the palm.
If the coordinate system aligns with the appropriate fingers of the left hand it is called a left-hand coordinate system, and vice versa for the right-hand coordinate system. There is no rotation possible that will align a left-hand coordinate system with a right-hand coordinate system. However, the problem is solved by simply negating (multiplying by -1) the values of one of the axes of one of the coordinate systems. It makes no difference which system or which axis.
Note: The Riegl laser scanner uses a right-hand coordinate system. A camera with the positive direction of the Z-axis exiting the lens toward the object, X being the horizontal dimension with positive to the right, and Y being the vertical dimension with positive up, uses a left-hand coordinate system. The camera can be converted to a right-hand system by negating the values on the Z-axis.
Positive rotation about an axis is determined by aligning the thumb of the hand (right or left) in the positive direction of the axis and curling the fingers. The direction of the curled fingers is the direction of positive rotation. A right-hand rotation has the same matrix form as a counterclockwise rotation in a 2D coordinate system. A left-hand rotation has the same matrix form as a clockwise rotation in a 2D coordinate system.
Having three axes, a 3D coordinate system will require a 3×3 transformation matrix. The matrices to transform the scanner coordinates to the camera coordinates are described below. Right-hand rotation will be used. The equations could equally well be derived using a left-hand rotation.
Rotation about the camera X-axis

    M_ω = [1     0     0   ]
          [0   cosω  sinω ]
          [0  -sinω  cosω ]
Rotation about the camera Y-axis

    M_Φ = [cosΦ  0  -sinΦ]
          [ 0    1    0  ]
          [sinΦ  0   cosΦ]
Rotation about the camera Z-axis

    M_κ = [ cosκ  sinκ  0]
          [-sinκ  cosκ  0]
          [  0     0    1]
Note: Below, the negation of the Z-axis value is disregarded; the camera and the scanner coordinates are assumed to use the same coordinate system. This will be corrected at the end of the derivation to reflect the reality of the Riegl scanner and camera coordinate systems.
A single rotation about the X-axis would have the form shown below.

    [X_i]   [1     0     0  ] [X_s]
    [Y_i] = [0   cosω  sinω] [Y_s]
    [Z_i]   [0  -sinω  cosω] [Z_s]
Rotation about the three camera axes can be done in any order, but the chosen order must then be used consistently throughout the computations. In general, matrix multiplication is not commutative; that is, [M] * [N] does not equal [N] * [M]. Rotation about the X-axis followed by rotations about the Y- and Z-axes is the common sequence. Thus, a transformation from the scanner coordinate system to the camera coordinate system is accomplished with the following matrix operations.
The full transformation has the form shown below.

    [X_i]                     [X_s + x]
    [Y_i] = M_κ * M_Φ * M_ω * [Y_s + y]
    [Z_i]                     [Z_s + z]

The combined transformation matrix M can be expressed as M = M_κ * M_Φ * M_ω, and the transformation from scanner to camera coordinates can be expressed as

    [X_i]       [X_s + x]
    [Y_i] = M * [Y_s + y]
    [Z_i]       [Z_s + z]
Multiplying out the rotation matrices M_κ (Z-axis), M_Φ (Y-axis), and M_ω (X-axis) gives the combined matrix M shown below. Each element of the matrix is named according to its row/column position.

    m11 = cosΦ*cosκ
    m12 = sinω*sinΦ*cosκ + cosω*sinκ
    m13 = -cosω*sinΦ*cosκ + sinω*sinκ
    m21 = -cosΦ*sinκ
    m22 = -sinω*sinΦ*sinκ + cosω*cosκ
    m23 = cosω*sinΦ*sinκ + sinω*cosκ
    m31 = sinΦ
    m32 = -sinω*cosΦ
    m33 = cosω*cosΦ
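The three right-hand rotation matrices and their product M = M_κ * M_Φ * M_ω can be checked numerically. A small Python sketch (helper function names are illustrative):

```python
import math

def rot_x(w):   # right-hand rotation by omega about the X-axis
    c, s = math.cos(w), math.sin(w)
    return [[1, 0, 0], [0, c, s], [0, -s, c]]

def rot_y(p):   # right-hand rotation by phi about the Y-axis
    c, s = math.cos(p), math.sin(p)
    return [[c, 0, -s], [0, 1, 0], [s, 0, c]]

def rot_z(k):   # right-hand rotation by kappa about the Z-axis
    c, s = math.cos(k), math.sin(k)
    return [[c, s, 0], [-s, c, 0], [0, 0, 1]]

def matmul(A, B):
    # 3x3 matrix product, row i of A dotted with column j of B
    return [[sum(A[i][n] * B[n][j] for n in range(3)) for j in range(3)]
            for i in range(3)]

def combined(w, p, k):
    # M = Mz * My * Mx: rotate about X first, then Y, then Z
    return matmul(rot_z(k), matmul(rot_y(p), rot_x(w)))
```

For example, combined(ω, Φ, κ)[0][0] equals cosΦ*cosκ and combined(ω, Φ, κ)[2][0] equals sinΦ, matching the element formulas m11 and m31.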
In order to bring the camera and the scanner coordinate systems into the right-hand coordinate system of the scanner, it is necessary to multiply Z_{i} by -1.
In order to scale the coordinates of the camera to those of the image plane (X_{c}, Y_{c}) in the camera, X_{i} and Y_{i} must be scaled by the ratio of the focal length (f) to the Z_{i} value for each point. The first and second rows of the matrix equation are multiplied by the scaling factor f/Z_{i}; the third row of the matrix equation is left unchanged.
Refer to Derivation of Transformation Parameter Computation for a 2D/3D Coordinate System From Laser Scanner Coordinates to Camera Coordinates for a more complete explanation of the transformation to camera focal plane coordinates.
Note: The magnification of a camera lens system is the ratio of the focal length to the distance to the object. If a camera lens has a focal length of 25 mm and this is considered to be 1X, then replacing the 25 mm lens with a 50 mm lens will double the magnification and halve the width of the area that is imaged. Assume that the CCD sensor has a width of 25 mm (very close to the actual width in high-quality Digital Single Lens Reflex (DSLR) cameras), that a lens with a focal length of 25 mm is used, and that the distance to the outcrop is 100 m. The picture that is taken will cover 100 m of the width of the outcrop, assuming that the camera is aligned perpendicular to the outcrop and that the outcrop runs straight left to right. If a lens with a focal length of 50 mm is put on the camera, the magnification will be about 2X and the picture will cover only 50 m of the outcrop. Likewise, if a lens with a focal length of 100 mm is put on the camera, the image will appear magnified 4X relative to the first image and will cover only 25 m of the outcrop.
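The coverage arithmetic in the note reduces to a single similar-triangles ratio. A minimal Python sketch (the function name is illustrative):

```python
def ground_coverage(sensor_width_mm, focal_length_mm, distance_m):
    """Width of scene imaged at a given distance, from similar triangles:
    coverage / distance = sensor_width / focal_length."""
    return sensor_width_mm * distance_m / focal_length_mm

# 25 mm sensor at 100 m: 25 mm lens covers 100 m, 50 mm covers 50 m,
# 100 mm covers 25 m, as in the note above.
```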
Derivation of the Collinearity Equations (basis for scanner to camera transformation)
Below is the 3D transformation without scaling by the camera lens.

    [X_i]       [X_s + x]
    [Y_i] = M * [Y_s + y]
    [Z_i]       [Z_s + z]
The camera coordinate system has its origin located at the rear nodal point of the lens system. The Z-axis of the camera is parallel to the axis of the lens system and is perpendicular to the CCD sensor in the camera. The positive direction of the Z-axis is toward the object being photographed. Since the CCD sensor is located behind the focal point, it sits at a Z-coordinate of (-f), [f: focal length of the lens]. The CCD sensor represents the XY plane of the camera coordinate system and is orthogonal to the Z-axis. Note that with the CCD sensor (XY plane) located behind the focal point, inversion in the X and Y dimensions occurs. This is corrected by virtually moving the CCD sensor a distance (f) in front of the focal point. This places the image on the CCD sensor in the same orientation that the object has in space. It would be the same as taking a picture of the object and then holding the picture up in front of the camera: what is on the left side of the object appears on the left side of the picture, and what is on the upper side of the object appears on the upper side of the picture.
The above transformation matrix transforms the scanner coordinates to camera coordinates but does not scale the dimensions to the scale of the CCD sensor. That is, the dimensional reduction that occurs due to the camera lens is not reflected in the transformation matrix at this point.
An image dimension is related to the object dimension by the ratio of the focal length of the lens to the distance to the object. Using the relationship of similar triangles, X_{c} = (f/Z_{i})*X_{i}, where f is the focal length of the camera, Z_{i} is the distance from the rear nodal point of the camera lens to the object, and X_{i} is the corresponding dimension of the object.
The X_{i} and Y_{i} values will be scaled by the ratio f/Z_{i}, yielding X_{c} and Y_{c}, which are the coordinates on the CCD sensor. Thus, the matrix equation above, when translated to the plane of the CCD sensor, can be expressed as

    X_c = (f/Z_i)*X_i
    Y_c = (f/Z_i)*Y_i
    Z_c = f
The above complete transformation matrix is expressed in equation form as

    X_c = f * [m11*(X_s + x) + m12*(Y_s + y) + m13*(Z_s + z)] / Z_i
    Y_c = f * [m21*(X_s + x) + m22*(Y_s + y) + m23*(Z_s + z)] / Z_i
    Z_i = m31*(X_s + x) + m32*(Y_s + y) + m33*(Z_s + z)
What are called the collinearity equations are formed by substituting for Z_{i} in the equations for X_{c} and Y_{c}. The collinearity equations are

    X_c = f * [m11*(X_s + x) + m12*(Y_s + y) + m13*(Z_s + z)] / [m31*(X_s + x) + m32*(Y_s + y) + m33*(Z_s + z)]

    Y_c = f * [m21*(X_s + x) + m22*(Y_s + y) + m23*(Z_s + z)] / [m31*(X_s + x) + m32*(Y_s + y) + m33*(Z_s + z)]
The scanner and camera systems are not compatible; one is left-handed, the other right-handed. They are made compatible by negating the values of one of the axes. Negating the values of the Z-axis, the equations become

    X_c = -f * [m11*(X_s + x) + m12*(Y_s + y) + m13*(Z_s + z)] / [m31*(X_s + x) + m32*(Y_s + y) + m33*(Z_s + z)]

    Y_c = -f * [m21*(X_s + x) + m22*(Y_s + y) + m23*(Z_s + z)] / [m31*(X_s + x) + m32*(Y_s + y) + m33*(Z_s + z)]
The following assignments are made in order to simplify the notation later; refer to the numerators and denominators of the above equations.

    U = m11*(X_s + x) + m12*(Y_s + y) + m13*(Z_s + z)
    V = m21*(X_s + x) + m22*(Y_s + y) + m23*(Z_s + z)
    W = m31*(X_s + x) + m32*(Y_s + y) + m33*(Z_s + z)
X_{c} and Y_{c} are shown below expressed in terms of U, V, W.

    X_c = -f*U/W
    Y_c = -f*V/W
X_{c} and Y_{c} are the coordinates of a point on the camera CCD sensor with the origin of the coordinate system at the center of the CCD sensor. These are not UV coordinates. UV coordinates have the origin at the upper left corner of a photograph and have values between 0 and 1.0. X_{c} and Y_{c} have dimensions which relate to the size of the CCD sensor and range from -1/2 the sensor width to +1/2 the sensor width and from -1/2 the sensor height to +1/2 the sensor height. In solving for the transformation parameters (ω, Φ, κ, x, y, and z), four points are identified that are common between the photograph and the 3D model. Thus we have four sets of values for X_{c}, Y_{c}, X_{s}, Y_{s}, and Z_{s}. We do not have values for Z_{i} in the camera coordinate system. By substituting for Z_{i} in the equations for X_{c} and Y_{c}, the need to solve for Z_{i} is eliminated.
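The full scanner-to-sensor projection can be sketched end to end. A Python example, assuming (as in the 2D case) that the translation x, y, z is applied by addition before rotation, and with the Z-axis negation folded into the final division; the function name is illustrative:

```python
import math

def project_to_sensor(Xs, Ys, Zs, omega, phi, kappa, x, y, z, f):
    """Project a scanner-coordinate point onto the CCD sensor (Xc, Yc)
    via the collinearity equations Xc = -f*U/W, Yc = -f*V/W."""
    cw, sw = math.cos(omega), math.sin(omega)
    cp, sp = math.cos(phi), math.sin(phi)
    ck, sk = math.cos(kappa), math.sin(kappa)
    # elements of M = Mz * My * Mx, named by row/column position
    m11, m12, m13 = cp * ck, sw * sp * ck + cw * sk, -cw * sp * ck + sw * sk
    m21, m22, m23 = -cp * sk, -sw * sp * sk + cw * ck, cw * sp * sk + sw * ck
    m31, m32, m33 = sp, -sw * cp, cw * cp
    # translate the scanner origin to the camera origin
    Xt, Yt, Zt = Xs + x, Ys + y, Zs + z
    U = m11 * Xt + m12 * Yt + m13 * Zt
    V = m21 * Xt + m22 * Yt + m23 * Zt
    W = m31 * Xt + m32 * Yt + m33 * Zt
    # negate the Z axis to reconcile the left- and right-hand systems
    return -f * U / W, -f * V / W
```

With all angles and translations zero, a point at (1, 2, 10) m seen through a 25 mm (0.025 m) lens lands at (-0.0025, -0.005) m on the sensor.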
A different derivation of the collinearity equations is provided by Mikhail and is described below.
Starting with the 3D transformation below, the system is turned into a 2D/3D transformation by adding a uniform scaling factor (a) and setting the image Z value, Z_{i}, to -f, the focal length of the camera.

    [X_i]       [X_s + x]
    [Y_i] = M * [Y_s + y]
    [Z_i]       [Z_s + z]

becomes

      [X_c]       [X_s + x]
    a*[Y_c] = M * [Y_s + y]
      [-f ]       [Z_s + z]

Writing this in equation form,

    a*X_c  = m11*(X_s + x) + m12*(Y_s + y) + m13*(Z_s + z)
    a*Y_c  = m21*(X_s + x) + m22*(Y_s + y) + m23*(Z_s + z)
    a*(-f) = m31*(X_s + x) + m32*(Y_s + y) + m33*(Z_s + z)

Division of the first two equations by the third eliminates the scaling factor a and recovers the collinearity equations:

    X_c = -f*U/W
    Y_c = -f*V/W
The least squares solutions for these equations are provided in the other documents.
References:
Mikhail, E.M., Bethel, J.S., McGlone, J.C.; Introduction to Modern Photogrammetry, 479 pages, John Wiley and Sons, 2001.
Larson, R.E., Edwards, B.H.; Elementary Linear Algebra, 568 pages, D.C. Heath and Company, 3rd Edition, 1996.
Wolf, P.R., Dewitt, B.A.; Elements of Photogrammetry with Applications in GIS, 3rd Edition, McGraw-Hill, 2000.
Lionel White Original 1/2007 modified 8/27/2008
