nextupprevious
Next:Feature extractionUp:Isotropic mappingPrevious:Perceptual grouping

Structure extraction - Feature selection

We extract the following features hierarchically in an unconstrained environment, i.e., with no constraints on the viewing angle and depth, using the approach detailed in [8]: line segments, longer linear lines, coterminations, ``L'' junctions, ``U'' junctions, parallel lines, parallel groups and ``significant'' parallel groups (figure 1(a) - (f)). As an enhancement to that approach, we also extract closed figures comprised of polygons (figure 1(g)). Perceptual grouping rules of similarity, continuity, parallelism and closure are used to extract these features.
 
(a) Longer linear line
(b) Coterminations
(c) ``L'' junctions

 
 
 
(d) ``U'' junction 
(e)  ``U'' junction 

 
 
(f) Parallel groups 
(g) Polygons

  Figure:Visualization of the groupings. (a) Longer linear line (b)  Coterminations (c) ``L'' junctions (d) ``U'' junction (e) ``U'' junction (f) Parallel groups (g) Polygons

Burns edge detector [15] is used to detect straight line segments in an image. Longer linear lines are obtained by the extension of approximately collinear fragmented line segments that either overlap or are close to each other. The lines obtained are further pruned to eliminate lines that are very small or have low edge strength. All other features are extracted using the longer linear lines. A set of non-parallel lines terminating at a common point is called a cotermination. In practice, a small neighborhood is constructed around a point to allow the lines to terminate in a small common region for cotermination extraction.

The cotermination is an important relation. According to the proximity rule of perceptual grouping, the human visual system easily groups coterminous lines. In fact, it has been suggested that the major function of eye movements is to determine coterminous edges [16]. Cotermination is a non-accidental relationship and, hence, reflects significant structural information. Coterminations are grouped to extract ``L'' junctions, and ``L'' junctions are grouped to get ``U'' junctions.

Parallel groups are obtained by constraining the amount of the overlap of the orthogonal projections of parallel lines onto each other and their projections along the x- and y-axis, while incorporating differences in the local and intrinsic orientation of the lines. In other words, we group the parallel lines that significantly overlap each other. ``Significant'' parallel groups are extracted by further constraining the search to only those parallel groups in which at least one member line is enclosed by an ``L'' or ``U'' junction, while accommodating the obliqueness of the viewing angle.

Polygons are closed figures formed by non-parallel lines. A polygon is a significant image relation. According to the closure rule of perceptual grouping, human vision tends to complete curves to form enclosed regions [10]. Extracting closed figures corresponds to this feature of human vision. Polygons are non-accidental image relationships, since the coterminations forming them are non-accidental. Hence, polygons represent significant structure in an image.

Elements of graph theory [17] are employed to extract polygons from an image using the cotermination graph. The underlying idea is to take advantage of the one-to-one correspondence between the closed figures comprised of line segments and the circuits in the graph. A set of fundamental circuits is searched and extracted.

Let $G = (V, E)$ be a cotermination graph, where $V$ and $E$ are the set of vertices and the set of edges of $G$, respectively. Let ${\tilde{\bf e}}_{ij} \in E$ be an edge connecting vertices ${\tilde{\bf v}}_i$${\tilde{\bf v}}_j$$\in V$. The weight of ${\tilde{\bf e}}_{ij}$ is defined as $w({\tilde{\bf e}}_{ij}) = deg({\tilde{\bf v}}_{i}) + deg({\tilde{\bf v}}_j)$, where $deg(\cdot)$ is the degree of a vertex, that is, the number of edges incident with the vertex. The edge weights are collected by extracting the adjacency matrix of the graph. The connected components of the graph are found, and each sub-graph corresponding to each component is processed separately. The weight of a spanning tree is the sum of the weights of all the branches in the tree. We search for the maximal spanning tree, which may be found by slightly altering the minimal spanning tree algorithm to incorporate the vertices resulting in maximal-weight spanning tree [17]. The maximal spanning tree is employed to extract the fundamental circuits. Each fundamental circuit represents a closed figure in the image, where edges on this circuit correspond to line segments on the closed figure.

A polygon is defined to be that fundamental circuit extracted that meets the following requirements: (a) the polygon is simple, i.e., the edges of the polygon do not intersect among themselves, (b) the polygon is relatively compact, (c) the polygon does not have many cavities, and (d) the number of edges on the polygon does not exceed a given threshold.

The importance of perceptual grouping for typical instances of recognition can not be overemphasized. In the absence of the necessary information for perceptual grouping, it is difficult for humans to make an intelligent decision regarding the structure or recognition of an object. Experiments conducted with line drawings, in which most of the elements of significant collinearity, end point proximity, parallelism and symmetry were removed, demonstrated the difficulty perceived by humans subjects in recognizing the objects [10]. With the addition of a few elements at key locations, the human subjects were able to perceive the line drawings with remarkable ease. When the elements were added at locations that did not lend themselves to meaningful perceptual groupings, then the response time for the perception of the line drawings was unaltered. The ability to influence recognition times by controlling the formation of perceptual grouping illustrates the search-based nature of this process, and it has been hypothesized that perceptual grouping can be a key element in search space and recognition time reduction.


nextupprevious
Next:Feature extractionUp:Isotropic mappingPrevious:Perceptual grouping
Qasim Iqbal 2001-05-06