/ home / articles / matrices
Matrices for Dummies

Introduction

Matrices are very powerful mathematical tools applicable in a wide range of real world problems. Intuitively, the main advantage in using matrices is that they allow the logical manipulation of very large sets of numbers at once. More rigorously, they allow the manipulation (and solution) of linear equation systems which in turn reppresent real world problems.

Just as example, the voltages and currents in an electronic board with 1000 components can be described by a very huge system of linear equations. 1000 components imply 1000 unknown voltages and 1000 unknown currents. Attempting to find such a large number of unknown variables without a method is substantially a suicide. With the use of matrices such a system can be solved in a very elegant way.

Today, matrices are used not simply for solving systems of simultaneous linear equations, but also for describing the quantum mechanics of atomic structure, designing computer game graphics, analyzing relationships, and even plotting complicated dance steps :)

In this article we try to develop our mathematical tool step by step.

Definition

A matrix is an ordered, bidimensional collection of mathematical expressions usually rapresented as a rectangular table.

The horizontal lines in a matrix are called rows and the vertical lines are called columns. A matrix with m rows and n columns is called an m-by-n matrix (written mxn) and m and n are called its dimensions. The dimensions of a matrix are always given with the number of rows first, then the number of columns.

If the number of rows of a matrix equals the number of columns (m = n) then the matrix is said to be square otherwise it's just rectangular. Square matrixes have several interesting properties that we'll talk about later.

The entry of a matrix A that lies in the i-th row and the j-th column is called the i,j entry or (i,j)-th entry of A. This is written as ai,j, aij or A[i,j]. The row is always noted first, then the column.

A special matrix with one of the dimensions set to 1 is called vector. A 1xn matrix is called row vector

while a mx1 matrix is called column vector.

If the entries of a matrix are all real numbers then the matrix is said to be real. If the entries are complex numbers then the matrix is too said to be complex. If the entries are polynomials then (guess what?) the matrix is said to be polynomial too.

The entries of a matrix usually have some associated meaning but for now let's just say they are mathematical expressions (maybe numbers) and concentrate on matrix manipulation.

Let's play with it

We define the matrix sum as an operation that given two mxn matrices A,B returns a mxn matrix C with entries that are sums of the corresponding entries in A and B. Please note that the sum is defined only for matrices of exactly the same dimensions: we say that such matrices are sum-compatible.

For sum-compatible matrices it's obvious that

and

We define the scalar multiplication as an operation that given a mxn matrix A and a scalar expression K returns a mxn matrix B with each entry made of the corresponding entry of A multiplied by K.

It's again obvious that for sum-compatible matrices A,B and any scalar expression k

and for any matrix A and any couple of scalar expressions k1, k2

Food for thoughs: Multiplication by scalar is commutative if the underlying ring (of expressions) is commutative. This is true when the expressions are (real or complex) numbers or polynomials, that is most real-world cases in that matrices are applicable. However, the matrix algebra can be applied also to non commutative rings (for example quaternions) where the multiplication by scalar must be splitted in two different operations: left multiplication and right multiplication.

Not that obvious

We define the matrix multiplication as an operation that given a mxp matrix A and a pxn matrix B returns a mxn matrix C with element i,j computed as the scalar vector product of the i-th row of A and the j-th column of B.

Note that the matrix multiplication is well defined only for couples in that the left matrix has the number of columns equal to the number of rows of the right matrix. We say such two matrices to be multiplication-compatible.

Food for thoughs: the multiplication of two nxn matrices processes 2 n2 entries. However there is no known algorithm with computational cost of O(n2). Most algorithm are O(n3) and the most clever implementations are O(n2.8). An O(n2.376) algorithm has been proposed by Coppersmith and Winograd but its implicit factor hidden by the O() notation is so big that its implementation is worthwile only if we're going to multiply matrices with n that is out of our current computing possibilities.

It's very easy to show that (and here comes the non obvious) the matrix multiplication is generally not commutative, that is

except for very few special cases. The (square) matrices for that

are said to commute and must satisfy strict rules on their elements.

The non commutativity of the matrix multiplication makes the algebraic manipulation to become non trivial and causes infinite headcaches to engineering students.

However, we're lucky since the associative and distributive properties still apply and it can be proven that the following equations are all true (given that the matrices involved are multiplication-compatible and the underlying ring is commutative).

Transpose

We define the transpose of a mxn matrix A as a nxm matrix B obtained from A by swapping rows with columns. The transpose of a matrix A is often written as AT or as A'.

Note that swapping rows with means effectively swapping the order of indices of each element. The element aij of the matrix A becomes the element aji of the transpose.

Food for thoughs: This property is interesting in computer matrix processing. To apply an algorithm to the transpose of a matrix instead of the original one we can simply swap the parameters of all the matrix element access functions...

A matrix whose transpose is equal to itself is called a symmetric matrix; that is, A is symmetric if AT = A. Note that A must be square to be symmetric and internally the elements must satisfy the relation aij = aji.

It's easy to show that

for any matrix A, thus the transposition is a self-inverse operation.

Also for two matrices with the same dimensions

If the matrices A and B are multiplication-compatible then

Note that the order of multiplication is inverted.

And finally taking the transpose of a scalar (1x1 matrix) is a null operation

The identity

A particular square matrix that commutes with all other matrices of the same size is the identity matrix. The identity matrix has all unit elements on its main diagonal.

It's easy to prove that

and thus the identity matrix is the "unity" element of the matrix algebra and the multiplication by the identity matrix is an idempotent operation.

Obviously the transpose of an identity matrix is still an identity matrix.

The inverse

Given a square matrix A we define the inverse matrix of A as the matrix that when multiplied by A gives the identity matrix as result. The inverse matrix is usually written as A-1.

The inverse matrix does not necessairly exist. A matrix that has no inverse is said to be non invertible or singular.

Note that A and its inverse (when it exists) do commute.

Food for thoughs: for non square matrices we can define the left (A-1A=I) and the right inverse (AA-1)=I. Such inverses have few real world applications...

It can be shown that the inverse of a matrix is again invertible and that

for any invertible matrix A and that

for any invertible matrix A and any non null scalar k.

It can be also proven that

for invertible matrices A and B of the same size. Note that the order of factors is inverted and the formula is very similar to the one that involves transposition.

Finding the inverse of a matrix is a very common highly intensive computational task. There are several algorithms that implement this operation and many of them operate better on matrices with elements that satisfy certain properties or conformations.

Matrices and systems of linear equations

A very large application of matrices is in solving systems of linear equations.

The above system of m linear equations in n unknown variables can be rewritten by using matrices as follows

The rows of the A matrix on the left contain the coefficients of the linear equations of the system. The unknown variables and the known terms are collected in row vectors x and b.

Solving the system means finding the set of values x1,x2,...,xn that satisfy all the equations at the same time. In most cases (if the field of the matrix elements is infinite) exactly one of the following statements is true:

  • the system is undetermined (the set of solutions is empty)
  • the system is overdetermined (the set of solutions contains infinitely many elements)
  • the system is exactly determined (the set of solutions contains contain exactly one element)

The most interesting case is obviously the one with exactly one solution. The Rouche'-Capelli theorem states that for a system of linear equations with n unknown variables to be exactly determined there must be at least n linearly independent equations. Since more equations are unnecessary then the most interesting linear systems are described by a square coefficient matrix.

By using trivial manipulations we can now show that by inverting the matrix of coefficients we can solve the system. We multiply boths sides of the system equation by the inverse of A

Since by definition

so the formula above can be rewritten as

and since

then trivially

The last formula is very important since it provides the real motivation for studying the inverse matrices. If we're able to find the inverse of A, we can solve the system.

Determinants

The definition of the matrix determinant is quite scary. Given a square nxn matrix A, it's determinant is defined as

where E(n) is the set of permutations of the numbers {1,2,....n}, P is a single permutation taken out of that set and sgn(P) is a function that returns +1 if the permutation P is even and -1 if the permutation P is odd. P(i), then, is the i-th element of the permutation P.

The formula above is also known as Liebniz formula. and is better explained with an example. Consider the following 2x2 matrix.

In this case n is 2 and thus E(n) is the set of the possible permutations of numbers {1,2}. E(n) obviously contains only two elements: the trivial "null" permutation {1,2} (which is even) and the permutation {2,1} (which is odd).

The Liebniz formula expands then to a sum of two elements dictated by the two permutations just described. Each element of the sum is a product of n elements of the matrix taken from distinct columns. The row of each element is choosen by the permutation.

In the first product we move along the columns (indexed by i) from left to right and take the first and the second row (null permutation). This means a11 and a22. Since the permutation used is even, then the sign of this product is positive.

In the second product we move along the columns from left to right and take the second and then the first row (the odd permutation). This means a21 and a12. Since the permutation is odd, then the sign of this product is negative.

The determinant of the 2x2 matrix is then

This is rather easy to remember if you note that it's the sum of the products of the two diagonals of the 2x2 matrix. The product is positive if the diagonal goes "up" from left to right while it's negative if the diagonal goes "down" from left to right.

The formula for 3x3 matrices is more complex and it contains 6 elements.

If you look close you can still spot the same pattern. There are positive diagonal lines that go down from left to right and negative diagonal lines that go up from left to right.

The formula for 4x4 matrices becomes really complex and is rarely written explicitly. For larger n values the formula becomes very difficult to use because it's hard to enumerate all the permutations of n numbers. Other methods for finding the determinant exist and later we'll probably spot some of them but now let's look at some of the determinant properties.

It can be shown that for square matrices A and B, any scalar r and the square unit matrix I the following properties are true

  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.

Invertible matrices and their determinants

Since for an invertible matrix A

then by property 4 of the previous paragraph

and since

then

which implies that

and that

Conversely, it can be shown that a non-zero determinant for matrix A implies that the inverse of A exists. The proof requires the notion of the rank of a matrix which we haven't covered so I'm going to omit it here. However, the two deductions lead us to the following theorem:

"A matrix is singular if and only if its determinant is zero"

or alternatively

"A matrix is invertible if and only if its determinant is non zero"

Food for thoughs: As a rule of thumb, almost all square matrices are invertible. Over the field of real numbers, this can be made precise as follows: the set of singular n-by-n matrices, is a null set, i.e., has Lebesgue measure zero. Intuitively, this means that if you pick a random square matrix over the reals, the probability that it will be singular is zero. In practice however, one may encounter non-invertible matrices. And in numerical calculations, matrices which are invertible, but close to a non-invertible matrix, can still be problematic; such matrices are said to be ill conditioned.



here
we're working for you
there
blog
articles
software
pragmaware
pragma
photos
contact
login
past
h 0x1f771d
u 0x1870c8
p 0x002727
future
working on it...