Many workers in the field of image processing
borrow techniques from the field of signal
processing.
Signals (such as radio signals, sounds, etc.)
are functions of time, i.e. the amplitude (or
some other dependent variable) changes with time.
In images, we have a variable (a single value for
a monochrome image, a vector of three values for
a color image) which changes across the width and
height of the image. This means that whereas the
radio signal was a "one dimensional" function
(i.e. dependent upon one value - time), the image
is a "two dimensional" function (i.e. dependent
upon both the x and y locations of each pixel).
We can generalise this still further - a motion
picture represents a function in three dimensions,
as does a volumetric medical scan, and an animation
of a volumetric medical scan is a function in four
dimensions. If the medical scan animation contains
wavelength/frequency information, we can generalise
to five or more dimensions.
In any case, the mathematical techniques that we
have been using for a century or more to analyse
radio and electrical signals can also be used to
analyse these newly available higher-dimensional
signals.
The most traditional technique for signal processing
is the Fourier transform. Other techniques include
(but are not limited to) the cosine transform, the
Haar wavelet transform, the Daubechies wavelet
transform and the Gabor transform, also sometimes
called the Gabor wavelet transform.
A Fourier transform looks at our signal and tries to
decompose it as a sum of sine waves of varying
amplitudes, frequencies and phases. In two dimensions,
it is a sum of sine waves of varying amplitudes,
frequencies, phases and directions.
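As a rough illustration of this decomposition, here
is a minimal sketch of a naive one dimensional
discrete Fourier transform in plain Python. The
function name dft and the test signal are our own
assumptions for the example; a real application
would use a fast Fourier transform from a numerical
library instead.

```python
import cmath
import math

def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a list of samples."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

# Hypothetical test signal: a sine wave of amplitude 2 that
# completes exactly 3 cycles over 32 samples.
n = 32
signal = [2.0 * math.sin(2 * math.pi * 3 * t / n) for t in range(n)]

coeffs = dft(signal)

# Rescale so that, for 0 < k < n/2, the magnitude reads as the
# amplitude of the sine wave at k cycles per n samples.
magnitudes = [abs(c) * 2 / n for c in coeffs]

# The strongest component below the halfway point.
peak = max(range(1, n // 2), key=lambda k: magnitudes[k])
print(peak)  # 3
```

The transform finds a single strong component at
3 cycles with a rescaled magnitude very close to 2,
which is the sine wave we put in.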
A cosine transform attempts the same thing, but it
uses only cosine waves, each with a fixed phase, so
that only the amplitudes vary. JPEG compression uses
a discrete cosine transform.
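A minimal sketch of a cosine transform follows,
assuming the common unnormalised "DCT-II" form and
its inverse. The function names dct2 and idct2 and
the ramp of sample values are hypothetical, chosen
only for this example.

```python
import math

def dct2(signal):
    """Unnormalised DCT-II: X[k] = sum_t x[t] * cos(pi/n * (t + 0.5) * k)."""
    n = len(signal)
    return [
        sum(signal[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
        for k in range(n)
    ]

def idct2(coeffs):
    """Inverse of dct2 above (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        (coeffs[0] / 2
         + sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5) * k)
               for k in range(1, n))) * 2 / n
        for t in range(n)
    ]

# Hypothetical row of 8 pixel values forming a smooth ramp.
row = [10, 11, 12, 13, 14, 15, 16, 17]

coeffs = dct2(row)
restored = idct2(coeffs)

print([round(c, 3) for c in coeffs])
print([round(v, 6) for v in restored])
```

For a smooth signal like this ramp, the first two
coefficients are much larger than the rest. That
concentration is what makes the cosine transform
attractive for compression.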
Both sine waves and cosine waves are periodic
functions of infinite extent - they go on forever.
This has drawbacks. It means that each coefficient
of the transform is dependent upon every part of the
image. This is a bad thing, because we want to look
at each part of the image independently.
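To make this dependence concrete, here is a small
sketch which alters one sample of a signal and counts
how many Fourier coefficients move. The dft helper
and the test signal are assumptions made for the
example, not part of any particular library.

```python
import cmath
import math

def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a list of samples."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

# Hypothetical signal: one cycle of a sine wave over 16 samples.
n = 16
signal = [math.sin(2 * math.pi * t / n) for t in range(n)]

# Alter a single sample, leaving the other 15 untouched.
modified = list(signal)
modified[5] += 1.0

before = dft(signal)
after = dft(modified)

# Indices of the coefficients that changed noticeably.
changed = [k for k in range(n) if abs(after[k] - before[k]) > 1e-9]
print(len(changed), "of", n, "coefficients changed")
```

Every one of the 16 coefficients changes, even
though only one sample was touched.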
A wavelet transform attempts to use pulses instead
of waves - it looks at our signal and tries to
decompose it as a sum of pulses or wavelets of
varying amplitudes, dilations and translations
(i.e. pulses with different heights, widths and
locations).
The pulse that is used is called the mother wavelet,
and wavelet transforms are classified according to
their mother wavelet. Haar wavelets use a step
function, Daubechies wavelets use a weird-looking
spiky Daubechies function, and Gabor wavelets use
a Gaussian function (usually multiplied by a sine
wave) - the Gaussian being the normal distribution
from your schoolboy statistics textbooks.
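The simplest of these to write down is the Haar
transform. The sketch below assumes an unnormalised
averaging form (pairwise averages and half
differences) and a signal whose length is a power of
two. The name haar_transform and the sample values
are our own, for illustration only.

```python
def haar_transform(signal):
    """Full Haar decomposition of a list whose length is a power of two.

    Returns [overall average, coarsest detail, ..., finest details].
    """
    data = list(signal)
    details = []
    while len(data) > 1:
        pairs = range(0, len(data), 2)
        averages = [(data[i] + data[i + 1]) / 2 for i in pairs]
        diffs = [(data[i] - data[i + 1]) / 2 for i in pairs]
        details = diffs + details  # coarser details go in front
        data = averages
    return data + details

# A flat signal with one wide bump in it.
small = haar_transform([4, 4, 4, 4, 8, 8, 4, 4])
print(small)  # [5.0, -1.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0]

# Hypothetical 16 sample signal; alter one sample and see
# how many coefficients are affected.
signal = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
modified = list(signal)
modified[5] += 1

before = haar_transform(signal)
after = haar_transform(modified)
changed = [k for k in range(len(signal)) if before[k] != after[k]]
print(len(changed), "of", len(signal), "coefficients changed")
```

Only 5 of the 16 coefficients change here: the
overall average plus one detail coefficient at each
of the four scales. The rest of the transform is
untouched by what happens at that one location.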
Wavelet transforms do also, however, differ according
to how the mother wavelet is transformed. Traditional
wavelet transforms use large discrete translations,
which can cause problems analysing objects located
between these translation values. They also use large
discrete dilations, which cause similar problems.
Confusingly, amplitudes usually have plenty of
freedom to change - the exception to the rule.
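The translation problem can be seen with a very
small sketch. It assumes the finest scale of a Haar
transform, where the wavelet is only ever placed at
even sample positions. The function name
haar_details and the two test signals are
hypothetical.

```python
def haar_details(signal):
    """Finest-scale Haar details: half differences of neighbouring pairs.

    The wavelet is only placed at even positions (translation step of 2).
    """
    return [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]

# The same two-sample-wide object, first lined up with the
# translation grid, then moved along by a single sample.
aligned = [0, 0, 1, 1, 0, 0, 0, 0]
shifted = [0, 0, 0, 1, 1, 0, 0, 0]

print(haar_details(aligned))  # [0.0, 0.0, 0.0, 0.0]
print(haar_details(shifted))  # [0.0, -0.5, 0.5, 0.0]
```

The object is identical in both signals, but a shift
of one sample changes its coefficients completely,
because it now falls between two translation values.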
We have a classical engineering tradeoff between the
computational complexity (read: speed) of the
transform computation and the degree of freedom
offered to the transform to enable it to represent
the objects in the signal or image.
The Gabor transform attempts (and fails) to find an
optimal coverage of the transformation space of the
mother wavelet, which is usually, but not necessarily,
a Gaussian distribution.
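As a sketch of what covering the transformation
space means in practice, the code below builds
Gabor-style pulses (a sine wave under a Gaussian
envelope) at a coarse grid of translations and
measures how strongly each one matches a signal.
The names gabor_atom and response, the grid spacing
of 8 samples, and the values of sigma and freq are
all assumptions picked for the example.

```python
import cmath
import math

def gabor_atom(n, centre, sigma, freq):
    """Complex sinusoid (freq in cycles per sample) under a Gaussian envelope."""
    return [
        math.exp(-((t - centre) ** 2) / (2 * sigma ** 2))
        * cmath.exp(2j * math.pi * freq * (t - centre))
        for t in range(n)
    ]

def response(signal, atom):
    """Magnitude of the inner product between the signal and the atom."""
    return abs(sum(s * a.conjugate() for s, a in zip(signal, atom)))

n = 64

# Hypothetical signal: a short burst of oscillation around
# sample 40 and silence everywhere else.
signal = [
    math.cos(2 * math.pi * 0.25 * t) if 36 <= t < 44 else 0.0
    for t in range(n)
]

# A coarse grid of translations: one atom every 8 samples.
centres = range(0, n, 8)
responses = {
    c: response(signal, gabor_atom(n, c, sigma=3.0, freq=0.25))
    for c in centres
}

best = max(responses, key=responses.get)
print(best)  # 40
```

Here the burst happens to sit on a grid point, so
the atom at 40 responds strongly. A finer grid, or
more choices of sigma and freq, would represent
more kinds of object but cost more computation.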
In real applications we must choose our mother
wavelet and our coverage of the transformation
space after careful consideration of the data that
we intend to analyse. This is an engineering problem,
not a scientific one.
William Payne, 28 Sept 2003.