

Training images - Parameter estimation
A total of 30 training images were used (20% of the image set) with 10
training images for each of the three classes. The remaining 120 images
(80%) were used for testing. The feature vector ${\bf X}_{i}$ (designated
here with the subscript $i$ to denote its class) is extracted from these
training images. The parameters ${\bf \mu}_{i}$ and $\Sigma_{i}$ are
estimated using maximum likelihood estimation [25], i.e.,
$$ {\bf \mu}_{i} = E[{\bf X}_{i}] \qquad (30) $$
$$ \Sigma_{i} = E[({\bf X}_{i} - {\bf \mu}_{i})({\bf X}_{i} - {\bf \mu}_{i})^t] \qquad (31) $$
where $E$ is the expectation operator.
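In practice, equations 30 and 31 can be approximated from the training feature vectors of one class by replacing each expectation with a sample average. The sketch below assumes that substitution (dividing by N, as in the maximum likelihood form); the function name and the toy vectors are hypothetical and are not taken from the paper.

```python
def estimate_gaussian_parameters(samples):
    """Return (mean, covariance) for a list of equal-length feature vectors.

    Assumption: each expectation in equations 30 and 31 is replaced by a
    sample average over the training vectors of one class (divide by N).
    """
    n = len(samples)
    d = len(samples[0])
    # Sample mean: mean[k] is the average of component k over all samples.
    mean = [sum(x[k] for x in samples) / n for k in range(d)]
    # Sample covariance: average outer product of the centred vectors.
    cov = [[sum((x[r] - mean[r]) * (x[c] - mean[c]) for x in samples) / n
            for c in range(d)]
           for r in range(d)]
    return mean, cov


# Hypothetical 2-D feature vectors for a single class (not data from the paper).
toy_class_samples = [[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]]
mu, sigma = estimate_gaussian_parameters(toy_class_samples)
```

Running the function once per class on that class's training vectors would give one mean vector and one covariance matrix per class.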

Results obtained
This section describes the results of
the two experiments performed on the database. In generating these results
we have assumed that the a priori probabilities of the three classes
are equal. The images present in the database were specifically selected
to represent an approximately equal a priori distribution. The database
contains 150 images, of which 55 are building images, 51 are non-building
images, and 44 are intermediate images. Images used in the training phase
were not used in either experiment.
The first experiment measured the recall and the precision.
Recall is defined as the fraction of the total number of building images
that are retrieved correctly by the system. Precision is defined
as the fraction of images retrieved that are actually building images.
In this experiment, one of the terms in equation 29 may be ignored.
Some of the images retrieved by the system that are classified as building
images are shown in figure 3. Recall
and precision are shown in table 2.
The first column shows the three classes. The second, third and fourth
columns show the number of images (T) in each of the three classes, the
number of images retrieved (R) in the respective classes, and the number
of correct images (C) in the set of images retrieved, respectively.
Figure 3: Some of the building images retrieved.
The system retrieved a set of 43 images as building images. Of these,
36 images were actually building images. Since the test set contains 45
building images (55 in the database minus the 10 used for training), the
system has a recall of 80% (36/45) and a precision of 83.72% (36/43) for
the building class. Recall and precision for the other two classes are
also shown in table 2.
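The arithmetic for the building class can be checked with a short sketch. The function and argument names are hypothetical; T, R and C follow the column meanings given for table 2, and T = 45 is taken from the 36/45 ratio quoted in the text.

```python
def recall_and_precision(total_in_class, retrieved, correct):
    """Compute recall (C / T) and precision (C / R) for one class."""
    recall = correct / total_in_class
    precision = correct / retrieved
    return recall, precision


# Building class figures quoted in the text: T = 45, R = 43, C = 36.
recall, precision = recall_and_precision(45, 43, 36)
print(f"recall = {recall:.2%}, precision = {precision:.2%}")
# prints "recall = 80.00%, precision = 83.72%"
```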
In the second experiment the "best matches" for the three classes were
retrieved. Images were sorted in descending order of the corresponding
value for each class; the term in equation 29 that was ignored in the
first experiment cannot be ignored here.
The best matches are analyzed in table 3.
The first column shows the three classes. The number of images that actually
belong to a particular class within the best matches is shown, in ranges
of 20 images, in the corresponding columns of the table.
Efficiency is defined as the number of images
(M) that actually belong to a particular class that are obtained in the
first T best matches for that class, expressed as a fraction. These
values are shown in the efficiency columns of the table.
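The efficiency measure can be sketched as follows, under the assumption that "expressed as a fraction" means M divided by T. The function name, the scores and the labels are hypothetical and serve only to illustrate the ranking and counting steps.

```python
def retrieval_efficiency(scored_images, target_class, total_in_class):
    """Fraction M / T of the first T best matches that belong to target_class.

    scored_images: list of (score, true_class) pairs, where the score is the
    value used to rank images for target_class (higher means a better match).
    """
    # Sort in descending order of score, best match first.
    ranked = sorted(scored_images, key=lambda pair: pair[0], reverse=True)
    top_matches = ranked[:total_in_class]
    # M: how many of the first T matches truly belong to the class.
    matches = sum(1 for _, true_class in top_matches
                  if true_class == target_class)
    return matches / total_in_class


# Hypothetical scores and labels (not data from the paper); T = 3 buildings.
toy_scores = [(0.9, "building"), (0.8, "non-building"), (0.7, "building"),
              (0.4, "building"), (0.2, "intermediate")]
efficiency = retrieval_efficiency(toy_scores, "building", 3)
```

In this toy case two of the first three matches are building images, so the efficiency is 2/3.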

Conclusions
This paper has presented perceptual grouping as an
effective tool in the content-based image retrieval framework for the retrieval
of images based on the semantic interrelationships of different primitive
image features. As an example, a methodology is illustrated for applying
perceptual grouping rules to retrieve building images from a database of
still, monocular, grayscale outdoor images. The images
were taken from a ground-level camera.
The system analyzed each of the images to extract features that were
strong evidence of the presence of buildings. These features are generated
by the strong boundaries typical of the different structures that comprise
the building. The features, which are specific shapes of corners, junctions
and parallels, are obtained by perceptual grouping of primitive image features,
by bottom-up processing. A Bayesian framework analyzed these features and
retrieved images which it perceived to be building images. Results obtained
are encouraging for pursuing future work in applying higher-level semantic
knowledge for image retrieval.


Qasim Iqbal 2001-03-01