Chapter 2

Basic Matrix Algebra

Linear Transformations and Matrix Manipulation

The geometrical laws, in terms of the first-order theory, may be stated in the form of linear transformations, which in turn are most easily treated by matrix algebra. In particular, we may represent the physical situation by two linear equations in two variables, which we shall call y and alpha. The physical meaning attached to these two variables will be stated later. We will be able to show that two other variables, y' and alpha', are related to y and alpha through two linear equations of the form:
$$y = b_{11} y'+ b_{12} \alpha',\eqno(1)$$

$$\alpha = b_{21} y'+ b_{22} \alpha',\eqno(2)$$

where b11, b12, b21, and b22 are constants, which we will later show to be characteristic of a given optical system. These constants may be written as the elements of a 2 x 2 matrix,
$$\bmatrix{b_{11}&  b_{12}\cr b_{21}&  b_{22}}.$$

The linear equations (1) and (2) can be written as the matrix equation
$$\bmatrix{y\cr\alpha}=
\bmatrix{b_{11}&  b_{12}\cr b_{21}&  b_{22}}
\bmatrix{y'\cr\alpha'}.\eqno(3)$$

In fact, equations (1) and (2) constitute a definition of equation (3).
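As a minimal sketch of this equivalence, the following Python fragment evaluates equations (1) and (2) term by term and then evaluates the matrix form (3), and the two results agree. The numerical coefficients and the helper names `apply_equations` and `apply_matrix` are assumptions made for illustration only; they are not taken from any optical system.

```python
# Hypothetical coefficients b11, b12, b21, b22, chosen only for illustration.
B = [[1.0, 2.0],
     [0.5, 3.0]]

def apply_equations(b, y_p, alpha_p):
    """Evaluate equations (1) and (2) term by term."""
    y = b[0][0] * y_p + b[0][1] * alpha_p
    alpha = b[1][0] * y_p + b[1][1] * alpha_p
    return (y, alpha)

def apply_matrix(m, column):
    """Multiply a 2 x 2 matrix by a 2-element column, as in equation (3)."""
    return tuple(sum(m[i][j] * column[j] for j in range(2)) for i in range(2))

y_prime, alpha_prime = 4.0, -1.0  # hypothetical values of y' and alpha'
print(apply_equations(B, y_prime, alpha_prime))  # (2.0, -1.0)
print(apply_matrix(B, (y_prime, alpha_prime)))   # (2.0, -1.0)
```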

Now suppose that a further linear transformation is known which connects the variables y'' and alpha'', say, to y' and alpha' such that
$$y' = c_{11} y''+ c_{12} \alpha'',\eqno(4)$$

$$\alpha' = c_{21} y''+ c_{22} \alpha'',\eqno(5)$$

or in matrix notation
$$\bmatrix{y'\cr\alpha'}=
\bmatrix{c_{11}&  c_{12}\cr
c_{21}&  c_{22}}
\bmatrix{y''\cr\alpha''}.\eqno(6)$$

We can discover the equations which relate y and alpha to y'' and alpha'' by substituting equations (4) and (5) into (1) and (2). Then we have

$$y = b_{11} (c_{11} y''+ c_{12} \alpha'')
+ b_{12}  (c_{21} y''+ c_{22} \alpha''),$$
$$\alpha = b_{21} (c_{11} y''+ c_{12}
\alpha'')+ b_{22}  (c_{21} y''+ c_{22} \alpha''),$$

or, collecting the terms in y'' and alpha'',

$$y = (b_{11} c_{11}  + b_{12} c_{21})y''
+ (b_{11} c_{12} + b_{12}c_{22})\alpha'', \eqno(7)$$

$$\alpha = (b_{21} c_{11}  + b_{22} c_{21})y''
+ (b_{21} c_{12} + b_{22}c_{22})\alpha'', \eqno(8)$$

which is a new transformation of the form
$$y = a_{11} y''+ a_{12} \alpha'',\eqno(9)$$

$$\alpha=a_{21}y''+a_{22}\alpha''.\eqno(10)$$

In matrix notation,
$$\bmatrix{y\cr\alpha}=
\bmatrix{a_{11}&  a_{12}\cr a_{21}&  a_{22}}
\bmatrix{y''\cr\alpha''},\eqno(11)$$

where the "a" coefficients are defined by comparing equations (9) and (10) with equations (7) and (8).

The matrix formulation involves the substitution of the expression for the column matrix
$$\bmatrix{y'\cr\alpha'}$$
of equation (6) into equation (3) to give
$$\bmatrix{y\cr\alpha}=
\bmatrix{b_{11}&  b_{12}\cr b_{21}&  b_{22}}\bmatrix{c_{11}&  c_{12}\cr
c_{21}&  c_{22}}\bmatrix{y''\cr\alpha''}.\eqno(12)$$

Thus comparing equations (11) and (12) we have
$$\bmatrix{a_{11}&  a_{12}\cr a_{21}&  a_{22}}
=\bmatrix{b_{11}&  b_{12}\cr b_{21}&  b_{22}}\bmatrix{c_{11}&
c_{12}\cr c_{21}&  c_{22}}.\eqno(13)$$

This essentially defines the usual rule of matrix multiplication, in which the elements of the product are given by the coefficients of equations (7) and (8):
$$a_{11}=b_{11} c_{11} + b_{12} c_{21}$$
$$a_{12}=b_{11} c_{12} + b_{12} c_{22}$$
$$a_{21}=b_{21} c_{11} + b_{22} c_{21}$$
$$a_{22}=b_{21} c_{12} + b_{22} c_{22}$$

or more compactly,
$$a_{ik}=\sum_{j=1}^{2} b_{ij} c_{jk}.\eqno(14)$$

The matrix equation may be written as
$$[{\bf A}]= [{\bf B}] [{\bf C}].$$
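The multiplication rule (14) can be checked numerically. The sketch below, in Python, forms [A] = [B] [C] from the sum over j and confirms that applying [C] and then [B] to a column gives the same result as applying [A] once, as equations (11) and (12) require. The matrix entries, the column values, and the helper names `matmul` and `apply_matrix` are assumptions chosen for illustration.

```python
def matmul(b, c):
    """Product of two 2 x 2 matrices: a[i][k] = sum over j of b[i][j] * c[j][k]."""
    return [[sum(b[i][j] * c[j][k] for j in range(2)) for k in range(2)]
            for i in range(2)]

def apply_matrix(m, column):
    """Multiply a 2 x 2 matrix by a 2-element column."""
    return [sum(m[i][j] * column[j] for j in range(2)) for i in range(2)]

# Hypothetical integer entries, used only to make the arithmetic easy to follow.
B = [[1, 2], [3, 4]]
C = [[0, 1], [5, 6]]
A = matmul(B, C)
print(A)  # [[10, 13], [20, 27]]

v = [2, -1]  # hypothetical values of y'' and alpha''
two_steps = apply_matrix(B, apply_matrix(C, v))  # equations (6), then (3)
one_step = apply_matrix(A, v)                    # equation (11)
print(two_steps, one_step)  # [7, 13] [7, 13]
```

Note the order of the factors: the transformation applied first, [C], stands on the right of the product.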


  1. Show that
  2. Show that
  3. Show that in general
    $[{\bf A}]  [{\bf B}]\ne [{\bf B}] [{\bf A}]$.
  4. For what particular matrix B does
    $ [{\bf A}] [{\bf B}] =  [{\bf B}] [{\bf A}]$?
  5. Show that
  6. Show that
  7. Show that the multiplication by the matrix

    and any matrix
    $\bmatrix{a_{11}&a_{12}\cr a_{21}& a_{22}}$

    always leaves the element a21 unchanged.
  8. Compute the product matrix

Determinant of a Matrix

We shall have occasion to make use of one important property of matrices which involves the determinant of a matrix.

The determinant of the matrix [A] is written as

$$\detr{{\bf A}}=
\detr{a_{11}&  a_{12}\cr a_{21}&  a_{22}}$$

and is defined as the number (a11 a22 - a21 a12).

The theorem that we are interested in states that the determinant of the product of a number of matrices is equal to the product of the determinants of each of the matrices forming the product. We leave the demonstration of this theorem to the exercises below.
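A numerical check is not a proof, but it may help to fix the statement of the theorem before the exercises are attempted. The Python sketch below computes the determinant from the definition above and compares the determinant of a product with the product of the determinants. The matrix entries and the helper names `det` and `matmul` are assumptions chosen for illustration.

```python
def det(m):
    """Determinant of a 2 x 2 matrix: a11 a22 - a21 a12."""
    return m[0][0] * m[1][1] - m[1][0] * m[0][1]

def matmul(b, c):
    """Product of two 2 x 2 matrices, following equation (14)."""
    return [[sum(b[i][j] * c[j][k] for j in range(2)) for k in range(2)]
            for i in range(2)]

# Hypothetical integer entries, used only for illustration.
B = [[1, 2], [3, 4]]
C = [[0, 1], [5, 6]]

print(det(B), det(C))     # -2 -5
print(det(matmul(B, C)))  # 10
print(det(B) * det(C))    # 10
```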


  9. Show that
  10. Show that

  11. Show that if [C] = [B] [A] then |B| |A| = |A| |B| = |C|.

  12. Extend the result of exercise 11 to show that for n matrices such that
    [P] = [A1] [A2] . . . [An],
    |P| = |A1| |A2| . . . |An|.

  13. Show that the result of exercise 12 holds for the product matrix of exercise 8 above.

We are not going to be concerned with other properties of matrices. The interested student may find these in any standard text on matrix algebra.