

Anisotropic mapping

In most quantitative channel energy models of texture analysis, an image is processed by channel-selective filters along certain fundamental stimulus dimensions such as spatial frequency and orientation. These channels generally contain a non-linearity, such as full-wave rectification, so that they signal the local contrast energy within the passband of the channel.

Texture analysis via a channel energy model employing a Gabor filter bank is treated here as an anisotropic mapping. The representation is obtained by extracting the feature vector ${\bf X_{\cal T}} \in \Re^{48}$, which measures the fractional energy in various spatial-frequency channels after the input image is filtered with the Gabor filter bank. The anisotropy can be verified from the behavior of the Fourier transform under the action of the Euclidean group $E(2)$. A translation of the image, $I({\bf r}) \to I(\tau_{\bf b}({\bf r}))$ with ${\bf b}\in \Re^2$, transforms the Fourier transform of the image as ${\cal I}({\bf\nu}) \to {\cal I}({\bf\nu})\:e^{j2\pi <{\bf b},{\bf\nu}>}$, and a rotation, $I({\bf r}) \to I(R_{\theta}({\bf r}))$, transforms it as ${\cal I}({\bf\nu}) \to {\cal I}(R_{\theta}{\bf\nu})$, where ${\bf r} = \{x,y\}$ are the space-domain coordinates and ${\bf\nu} =\{u,v\}$ are the Fourier-domain coordinates. A similar result holds for reflection. Hence, texture analysis is not invariant under the action of $E(2)$ on an image.
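The two Fourier properties quoted above can be checked numerically. The following is a minimal sketch, assuming NumPy, a synthetic cosine grating as the test image, a circular shift as the translation, and a 90-degree rotation (the only rotation that needs no interpolation on a pixel grid); none of these choices come from the original work.

```python
import numpy as np

def spectrum_magnitude(image):
    """Magnitude of the 2-D discrete Fourier transform of `image`."""
    return np.abs(np.fft.fft2(image))

def axis_energies(image):
    """Spectral energy on the u-axis (v = 0) and on the v-axis (u = 0), DC excluded."""
    power = spectrum_magnitude(image) ** 2
    return power[0, 1:].sum(), power[1:, 0].sum()

# Assumed test image: a grating that varies along x only (rows are y, columns are x).
size, cycles = 64, 8
x = np.arange(size)
image = np.tile(np.cos(2 * np.pi * cycles * x / size), (size, 1))

# Translation: a circular shift only multiplies the DFT by a unit-modulus phase.
shifted = np.roll(image, shift=(5, 11), axis=(0, 1))
translation_keeps_magnitude = np.allclose(
    spectrum_magnitude(image), spectrum_magnitude(shifted)
)

# Rotation by 90 degrees: the spectral energy moves from the u-axis to the v-axis.
rotated = np.rot90(image)
orig_u, orig_v = axis_energies(image)
rot_u, rot_v = axis_energies(rotated)

print("translation keeps |FFT|:", translation_keeps_magnitude)
print("u/v energy before rotation:", orig_u, orig_v)
print("u/v energy after rotation: ", rot_u, rot_v)
```

The rotation moves the energy into a different orientation channel, which is why features built from oriented channel energies change when the image is rotated.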

The channel energy model employed is based on multiresolution analysis that is characterized by both orientation and scale. The $LAB$ space is used for multiresolution texture analysis by measuring the fractional energies in the lightness channel and the two chrominance channels mentioned in the last section. Given an image $I$, the sequence of convolved images $\{I \ast f_{mn}\}$ defines the multiresolution texture characteristics of the image, where $f_{mn}$ denotes a base texture extraction function $f$ at scale $m$ and orientation $n$, and the filter energy $\vert\vert f_{mn}\vert\vert^2$ is held constant.

Gabor filters have been used to represent $f_{mn}$. The impulse response of an even-symmetric 2-dimensional Gabor filter is expressed as:

\begin{displaymath}
f(x,y) = \frac {1} {2 \pi \sigma_x \sigma_y} e^{- {\frac {1} {2}} ({\frac {x^2} {\sigma_x^2}} + {\frac {y^2} {\sigma_y^2}})} \cos(2 \pi u_0 x)
\end{displaymath} (12)

where $f(x,y)$ represents the response at spatial location $(x,y)$, $u_0$ is the frequency of a sinusoidal plane wave along the x-axis (i.e., the $0^{\circ}$ orientation), and $\sigma_x$ and $\sigma_y$ are the spreads of the Gaussian envelope along the x- and y-axis, respectively.
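As an illustration, the even-symmetric Gabor impulse response (a Gaussian envelope with spreads $\sigma_x$, $\sigma_y$ modulated by a cosine of frequency $u_0$ along x) can be sampled on a pixel grid. This is a minimal sketch assuming NumPy; the function name `gabor_even`, the 31 x 31 window, and the parameter values are illustrative assumptions, not the values used in this work.

```python
import numpy as np

def gabor_even(x, y, sigma_x, sigma_y, u0):
    """Even-symmetric 2-D Gabor: Gaussian envelope times a cosine along x."""
    envelope = np.exp(-0.5 * (x**2 / sigma_x**2 + y**2 / sigma_y**2))
    carrier = np.cos(2 * np.pi * u0 * x)
    return envelope * carrier / (2 * np.pi * sigma_x * sigma_y)

# Assumed sampling grid: integer pixel offsets centered on the origin.
half = 15
y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)

# Assumed parameter values, chosen only for illustration.
kernel = gabor_even(x, y, sigma_x=4.0, sigma_y=4.0, u0=0.125)

print(kernel.shape)
print(kernel[half, half])  # value at the origin, 1 / (2 * pi * sigma_x * sigma_y)
```

Because the carrier is a cosine, the sampled kernel is symmetric about both axes, which is what "even-symmetric" refers to.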

A set of self-similar Gabor filters is obtained by appropriate rotations and scalings of $f(x,y)$ through the generating function:

$\displaystyle \acute{f}_{mn}(x,y) = k^{-m} f(k^{-m}\acute{x},k^{-m}\acute{y}), \:\:\: k \geq 1$     (13)

where $m$ and $n$ are integers, $\acute{f}_{mn}(x,y)$ is the rotated and scaled version of the original filter, $k$ is the scale factor, $n = 0, 1, \cdots, N - 1$ is the orientation index, $N$ is the total number of orientations, $m = 0, 1, \cdots, M-1$ is the scale index, and $M$ is the total number of scales. The rotated coordinates $\acute{x}$ and $\acute{y}$ are given by $\acute{x} = x \cos \theta + y \sin \theta,\:\:\: \acute{y} = -x \sin \theta + y \cos \theta$, where $\theta = \frac {n \pi} {N}$ is the orientation. The leading factor $k^{-m}$ ensures that the filter energy is independent of $m$.

To eliminate the sensitivity of the filters to absolute intensity values, we set $F_{mn}(0,0) = 0$, where $F_{mn}$ denotes the Fourier transform of $\acute{f}_{mn}$, so that each filter has zero response at zero frequency. A total of 16 Gabor filters are used: 4 equi-angular orientations, starting at $0^{\circ}$, at each of 4 scales, i.e., $N = 4$ and $M = 4$. The parameters $\sigma_u$ and $\sigma_v$ (the frequency-domain counterparts of $\sigma_x$ and $\sigma_y$) and $k$ are calculated as described in [20].
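The construction of the self-similar filter bank can be sketched as follows, assuming NumPy. The helper names `gabor_even` and `gabor_bank`, the window size, and the values of $k$, $\sigma_x$, $\sigma_y$ and $u_0$ are illustrative assumptions; the actual parameters follow [20] and are not reproduced here. Subtracting the kernel mean is one simple way to obtain a zero response at zero frequency and is likewise an assumption of this sketch.

```python
import numpy as np

def gabor_even(x, y, sigma_x, sigma_y, u0):
    """Even-symmetric 2-D Gabor: Gaussian envelope times a cosine along x."""
    envelope = np.exp(-0.5 * (x**2 / sigma_x**2 + y**2 / sigma_y**2))
    return envelope * np.cos(2 * np.pi * u0 * x) / (2 * np.pi * sigma_x * sigma_y)

def gabor_bank(size=49, n_scales=4, n_orients=4, k=1.6,
               sigma_x=2.0, sigma_y=2.0, u0=0.25):
    """Return {(m, n): kernel} for scales m and orientations n (assumed parameters)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    bank = {}
    for m in range(n_scales):
        scale = k ** (-m)
        for n in range(n_orients):
            theta = n * np.pi / n_orients
            # Rotated coordinates x', y'.
            xr = x * np.cos(theta) + y * np.sin(theta)
            yr = -x * np.sin(theta) + y * np.cos(theta)
            # Generating function: k^{-m} f(k^{-m} x', k^{-m} y').
            kernel = scale * gabor_even(scale * xr, scale * yr, sigma_x, sigma_y, u0)
            # Zero-mean kernel, so the response at zero frequency vanishes.
            kernel -= kernel.mean()
            bank[(m, n)] = kernel
    return bank

bank = gabor_bank()
print(len(bank))  # number of filters, M * N
for m in range(4):
    # Filter energy per scale at orientation 0.
    print(m, np.sum(bank[(m, 0)] ** 2))
```

With these settings the printed energies should be roughly equal across scales, up to discretization and window truncation effects, which reflects the role of the factor $k^{-m}$.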

Channels $L$, $A$ and $B$ are treated with the Gabor filter bank described by equation 13. The 48-dimensional feature vector ${\bf X_{\cal T}}$ is constructed using the fractional energies in each of the 16 spatial-frequency channels in the $L$, $A$ and $B$ channels, i.e.,

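\begin{displaymath}
\begin{array}{l}
{\bf X_{\cal T}} =
(\tilde{\bf x}_{{{\cal T}L}_{00}},\cdots, \tilde{\bf x}_{{{\cal T}L}_{33}}, \tilde{\bf x}_{{{\cal T}A}_{00}},\cdots, \tilde{\bf x}_{{{\cal T}A}_{33}}, \tilde{\bf x}_{{{\cal T}B}_{00}},\cdots, \tilde{\bf x}_{{{\cal T}B}_{33}})^t
\end{array}\end{displaymath} (14)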

where $\tilde{\bf x}_{{{\cal T}L}_{mn}}$, $\tilde{\bf x}_{{{\cal T}A}_{mn}}$ and $\tilde{\bf x}_{{{\cal T}B}_{mn}}$ represent the fractional energy at the output of the filter in the $n^{th}$ orientation and the $m^{th}$ scale, for $L$, $A$ and $B$ channels, respectively. The fractional energy $\tilde{\bf x}_{{{\cal T}L}_{mn}}$ is given as:
\begin{displaymath}
\tilde{\bf x}_{{\cal T}L_{mn}} \: = \: \frac {\sum^{W_y-1}_{y=0} \sum^{W_x-1}_{x=0} \hat{L}^{\:2}_{mn}(x,y)} {\sum^{M-1}_{m=0} \sum^{N-1}_{n=0} \sum^{W_y-1}_{y=0} \sum^{W_x-1}_{x=0} \hat{L}^{\:2}_{mn}(x,y)}
\end{displaymath} (15)

where $\hat{L}_{mn}$ is the $L$ channel filtered with $\acute{f}_{mn}$, $W_x$ is the width of the image, $W_y$ is its height, and $\sum^{M-1}_{m=0} \sum^{N-1}_{n=0} \tilde{\bf x}_{{\cal T}L_{mn}} = 1$. Because the Fourier transform is a linear isometry between the space and spatial-frequency domains, the energy that equation 15 computes in the space domain equals the energy in the corresponding spatial-frequency channel. Similar expressions hold for $\tilde{\bf x}_{{\cal T}A_{mn}}$ and $\tilde{\bf x}_{{\cal T}B_{mn}}$. Since every component lies in $[0,1]$, this feature space is also contained in a unit hypercube.
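The fractional-energy computation and the space/frequency energy equality can be sketched as follows, assuming NumPy. To keep the example short, the bank consists of four small stand-in difference kernels rather than the 16 Gabor filters, the three channels are random arrays rather than a real $LAB$ image, and filtering uses circular convolution via the FFT; all of these, and the function names, are assumptions of the sketch.

```python
import numpy as np

def filter_response(channel, kernel):
    """Circularly convolve `channel` with `kernel` via the FFT (assumed boundary handling)."""
    spectrum = np.fft.fft2(channel) * np.fft.fft2(kernel, s=channel.shape)
    return np.real(np.fft.ifft2(spectrum))

def fractional_energies(channel, bank):
    """Energy of each filtered output divided by the total energy over the bank."""
    energies = np.array([np.sum(filter_response(channel, f) ** 2) for f in bank])
    return energies / energies.sum()

# Stand-in zero-mean kernels (NOT Gabor filters), used only for illustration.
bank = [
    np.array([[1.0, -1.0]]),              # horizontal difference
    np.array([[1.0], [-1.0]]),            # vertical difference
    np.array([[1.0, 0.0], [0.0, -1.0]]),  # diagonal difference
    np.array([[0.0, 1.0], [-1.0, 0.0]]),  # anti-diagonal difference
]

# Stand-in channels; a real input would be the L, A and B planes of an image.
rng = np.random.default_rng(0)
channels = {name: rng.random((32, 32)) for name in ("L", "A", "B")}

# Concatenate the per-channel fractional energies in the order L, A, B.
features = np.concatenate(
    [fractional_energies(channels[name], bank) for name in ("L", "A", "B")]
)

# Energy in the space domain versus energy in the frequency domain (Parseval).
response = filter_response(channels["L"], bank[0])
energy_space = np.sum(response ** 2)
energy_freq = np.sum(np.abs(np.fft.fft2(response)) ** 2) / response.size

print(features.shape)
print(features.reshape(3, -1).sum(axis=1))  # per-channel sums
print(energy_space, energy_freq)
```

With a bank of 16 filters in place of the four stand-ins, the same concatenation yields a 48-dimensional vector whose components all lie in $[0,1]$.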


Qasim Iqbal 2001-05-06