Colour-Science object detection technology
We have developed several applications which allow us
to detect objects or to segment images into their important parts. The
object detection implementation inside i2e can also be trained to detect
objects other than faces. For example, we have also trained it to recognise eyes.
Colour-Science has a 40’000 image database to train object recognition. In this
database the exact face, eye and mouth coordinates are known.
For image enhancement and image classification it is
very important to detect the important objects or parts of the image. Only
then is good image enhancement possible.
See also the following documents:
colour-science face detection
colour-science eye detection
The i2e object detection module consists of two main
parts:
1) The training tools, which are needed to train the
detection engine for specific objects (e.g.
faces, eyes, logos and others).
2) The object recognition engine, which attempts to
recognise the trained objects in bitmap images.
The training tools are mainly written in Microsoft C
or Visual Basic. The imaging and object recognition functions they use
come either from i2e routines or from optimized Intel routines. The Intel
routines come from the Intel Open Source Computer Vision Library and use
the speed-optimized Intel Performance Primitives when executed on Pentium 4 or
later Intel processors.
The training tools consist of a set of programs used
to mark objects in an image database and then to use those
manually marked objects to generate a training file with the exact coordinates of
the objects. In this example we show how the object recognition engine is
trained with faces.
To train the object recognition we need about 7000
different faces. We have a digital image database of about 50’000 high
resolution images from different camera brands (Sony, Canon, Nikon, HP ..). The first step is to go through this
database manually and mark all faces.
For this we use a software tool
called “Object coordinate retrieval software”.

With this software we select positive sample images
(the ones with faces in them) and manually mark the eyes and the centre of the
mouth. We can also classify negative sample images, which are the ones that do
not contain any face.
At the end we have a database which records, for each
of our images, whether it contains faces and the exact position of those faces.
In a second step we extract a selection of
face images. For this we have a second software tool, called “Create
sample training file software”. This tool displays the manually
selected eye and mouth coordinates (shown as a red rectangle) and also
creates a bigger box, shown in green, which covers the whole face. The green box
is positioned around the red box so that it best covers the whole face region.
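The relationship between the marked points and the two boxes can be sketched as follows. This is a minimal illustration only: the `FaceMarks` record, the function names and the enlargement factor `scale` are hypothetical, and the actual tool may place the green box differently.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


@dataclass
class FaceMarks:
    """Manually marked points for one face (hypothetical record layout)."""
    left_eye: Point
    right_eye: Point
    mouth: Point


def marks_box(m: FaceMarks) -> Tuple[float, float, float, float]:
    """Tight box (x0, y0, x1, y1) around the marked points: the 'red' rectangle."""
    xs = [m.left_eye[0], m.right_eye[0], m.mouth[0]]
    ys = [m.left_eye[1], m.right_eye[1], m.mouth[1]]
    return min(xs), min(ys), max(xs), max(ys)


def face_box(m: FaceMarks, scale: float = 2.0) -> Tuple[float, float, float, float]:
    """Square box centred on the marks box and enlarged by `scale`: the 'green' rectangle.

    The factor 2.0 is an assumed value, not one taken from the tool.
    """
    x0, y0, x1, y1 = marks_box(m)
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half = max(x1 - x0, y1 - y0) * scale / 2
    return cx - half, cy - half, cx + half, cy + half


marks = FaceMarks(left_eye=(100, 100), right_eye=(160, 100), mouth=(130, 160))
red = marks_box(marks)
green = face_box(marks)
```

A fixed enlargement factor is the simplest choice; a real tool would probably shift the box as well, since the forehead extends further above the eyes than the chin does below the mouth.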

If you now want to train the object recognition engine,
you have to export a selection of face images from the image database.
With the software you can export statistics about the ratio of eye_distance/eye_to_mouth_distance, eye tilt angle, face
contrast and face size.
By filtering on those values you can, for example,
export only big frontal faces, or you can export tilted faces …
An interesting statistic is the ratio of eye_distance/eye_to_mouth_distance. By sorting our face
image database we have seen that this ratio can be used to sort faces by age.
The mean ratio is 0.9 for the whole database. Children have a larger head, with
ratios up to 1.3, while adults normally have a slimmer, longer head, with ratios down
to 0.7. By extracting this ratio from images you could sort the images by the age
of the people in them, which is quite a nice idea.
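As a sketch of how these statistics could be computed from the marked coordinates: the text does not define how eye_to_mouth_distance is measured, so the code below assumes it is the distance from the midpoint between the eyes to the centre of the mouth. The function names and the sample coordinates are illustrative only.

```python
import math


def eye_distance(left_eye, right_eye):
    """Distance between the two eye centres, in pixels."""
    return math.dist(left_eye, right_eye)


def eye_to_mouth_distance(left_eye, right_eye, mouth):
    """Distance from the midpoint between the eyes to the mouth centre (assumed definition)."""
    mid = ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)
    return math.dist(mid, mouth)


def face_ratio(left_eye, right_eye, mouth):
    """Ratio eye_distance / eye_to_mouth_distance."""
    return eye_distance(left_eye, right_eye) / eye_to_mouth_distance(left_eye, right_eye, mouth)


def eye_tilt_degrees(left_eye, right_eye):
    """Angle of the line through the eyes relative to the horizontal image axis."""
    return math.degrees(math.atan2(right_eye[1] - left_eye[1], right_eye[0] - left_eye[0]))


# Two made-up faces: (left eye, right eye, mouth centre)
faces = {
    "a": ((100, 100), (160, 100), (130, 150)),
    "b": ((100, 100), (160, 100), (130, 180)),
}

# Sort from the highest ratio to the lowest
by_ratio = sorted(faces, key=lambda name: face_ratio(*faces[name]), reverse=True)

# Keep only near-frontal faces; the 10 degree tolerance is an assumed value
frontal = [n for n, (le, re, _) in faces.items() if abs(eye_tilt_degrees(le, re)) < 10]
```

Because the ratio divides one distance by another, it does not depend on the size of the face in the image, which is what makes it usable for sorting a whole database.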
You also have to extract negative sample images, which
contain no faces at all.
For a good training set about 6000 positive face
samples and 4000 negative samples are needed.
The positive samples can be exported into one database
file. This file will then contain about 24x24 pixel black and white subimages of the face regions.
Sample of the face region of the above image
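A minimal sketch of how a face region could be reduced to such a subimage is given below. It assumes that "black and white" means greyscale, uses the common luma weights 0.299/0.587/0.114, and resamples with plain nearest-neighbour sampling; the function names are hypothetical and the export tool may work differently.

```python
def to_gray(rgb):
    """Convert one (r, g, b) pixel to a grey value in 0..255."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def crop_to_sample(image, box, size=24):
    """Cut `box` out of `image` and resample it to a size x size greyscale patch.

    image: list of rows, each row a list of (r, g, b) tuples
    box:   (x0, y0, x1, y1) in integer pixels, x1 and y1 exclusive
    """
    x0, y0, x1, y1 = box
    w, h = x1 - x0, y1 - y0
    sample = []
    for j in range(size):
        src_y = y0 + (j * h) // size      # nearest source row
        row = []
        for i in range(size):
            src_x = x0 + (i * w) // size  # nearest source column
            row.append(to_gray(image[src_y][src_x]))
        sample.append(row)
    return sample


# A made-up 100 x 100 image: white everywhere, with a black 48 x 48 square
image = [[(0, 0, 0) if 20 <= x < 68 and 20 <= y < 68 else (255, 255, 255)
          for x in range(100)] for y in range(100)]
patch = crop_to_sample(image, (20, 20, 68, 68))
```

Nearest-neighbour sampling keeps the example short; averaging the source pixels that fall into each target pixel would give smoother samples when large faces are reduced.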
Once you have built your positive and negative image
data sets you can begin to train the classifier.
For this a program called “Train object recognition
engine” is used.

To test the i2e image enhancement functions and object
recognition functions we have a program called “i2e
Static Library Test”.
With this software you can load a selection of images
and perform the enhancement and object recognition functions.

The software can also export all the
intermediate maps which are used during the processing and validation cycles.
Original image:

Enhanced image with detected face rectangles

Skin validation maps of the four face rectangles

edge map

skin color map

shadows map

vegetation map

|
i2e image enhancement |
For optimal detection we need good
image quality. Therefore the images are pre-corrected: contrast is stretched,
underexposure and overexposure are corrected, and color casts are removed. |
|
Pre-rotation of the image |
Because many images are rotated by ±90°, the image has
to be presented to the object recognition algorithm in an upright position. For
this we need to pre-rotate the image. The interesting thing about this function is
that we can detect rotated images, so it would be possible to rotate them
automatically into an upright position for the customer. |
|
Feature scaling and scanning of the image |
We need to be able to detect faces of different
sizes. To make this possible, the features themselves are scaled. |
|
Pre-processing of the scan window |
In order to attain a high recognition rate, the
scanned image regions must themselves be enhanced. This makes it possible, for
example, to locally correct the density of faces on a dark
background. |
|
Object detection |
The scan window is slid over the image at different scales and the object
detection probabilities are summed in a probability map. |
|
Combination of overlapping detections |
The probability map will normally show several
nearby probability peaks, which would result in several overlapping windows.
To avoid this, overlapping windows are approximated by one big rectangle. |
|
Validation of detections |
As the last step it is important to remove as many
“false positive” detections as possible. Because the detection algorithm only
looks at density changes, one validation is, for example, to check that a
minimum number of skin-colored pixels is present in the
rectangle. |
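The last two steps, combining overlapping detections and validating them by skin color, can be sketched as follows. This is an illustration under stated assumptions, not the i2e implementation: overlapping windows are simply replaced by their bounding rectangle, the skin test is a widely used RGB rule of thumb rather than the i2e skin color map, and the threshold `min_skin` is an invented value.

```python
def overlaps(a, b):
    """True if rectangles (x0, y0, x1, y1) share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def merge_overlapping(rects):
    """Replace overlapping rectangles by their bounding rectangle until none overlap."""
    merged = list(rects)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                a, b = merged[i], merged[j]
                if overlaps(a, b):
                    merged[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                 max(a[2], b[2]), max(a[3], b[3]))
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan with the updated list
    return merged


def is_skin(r, g, b):
    """Simple RGB rule of thumb for skin tones (assumed, not the i2e skin map)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_fraction(image, rect):
    """Fraction of pixels inside `rect` that pass the skin test."""
    x0, y0, x1, y1 = rect
    pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(is_skin(*p) for p in pixels) / len(pixels)


def validate(image, rects, min_skin=0.3):
    """Merge overlapping detections, then drop those with too few skin pixels."""
    return [r for r in merge_overlapping(rects)
            if skin_fraction(image, r) >= min_skin]


# Made-up 20 x 20 image: skin tone on the left half, green on the right half
image = [[(200, 140, 120) if x < 10 else (40, 160, 40) for x in range(20)]
         for y in range(20)]
detections = [(0, 0, 6, 6), (4, 4, 10, 10), (12, 0, 18, 6)]
faces = validate(image, detections)
```

In this example the two overlapping windows on the left are merged into one rectangle that passes the skin check, while the window over the green area is rejected as a false positive.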