Building Image Ontologies

May 4th, 2013

During our bilateral project with the Technical Institute in Cluj (Understanding Human Behavior for Video Surveillance Applications), we needed a way to quickly browse and examine large image collections in order to discover the best feature representations for pedestrian recognition in traffic scenes.

This was before I made Image Hub Explorer, and we had a slightly different approach in mind. We decided to extend OntoGen, a tool for semi-automatic ontology construction developed at the AILab group at the Jožef Stefan Institute. This would allow us to build not just a single flat simplification of the data, but a hierarchical cluster structure for analyzing it at different levels of granularity, which was definitely useful for the project.

Working with OntoGen revolves around a few simple interpretative steps. Clustering is used to split the nodes and build a concept tree; the standard K-means algorithm is used to cluster the data. Each node can be quickly examined by inspecting a number of its top 'medoids', the most central points. For a deeper understanding of the node structure, multi-dimensional scaling (MDS) is performed and all the data is shown in a single viewing panel, where the user can view the details by clicking near the desired points. Introducing images into the framework was not very difficult.
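
To make the node-splitting step concrete, here is a minimal sketch in Python with scikit-learn. This is purely illustrative: OntoGen is built on its own text-mining libraries, and the feature vectors here are random placeholders. K-means splits a node, and the points nearest each centroid play the role of the displayed medoids.

```python
# Minimal sketch of OntoGen-style node splitting. scikit-learn and the
# random feature vectors are illustrative assumptions, not OntoGen's internals.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances

X = np.random.rand(200, 64)  # hypothetical: one feature vector per image

# Split the current node into k sub-concepts with standard K-means.
k = 4
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)

# The 'medoids' shown for each concept: the images closest to its centroid.
D = pairwise_distances(X, km.cluster_centers_)
for c in range(k):
    central = np.argsort(D[:, c])[:5]  # five most central images of concept c
    print(f"concept {c}: most central images {central}")
```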

Once a feature representation for the images is obtained, the dataset can be loaded into OntoGen. This is the first screen that appears, showing a set of central medoids:

Visualizing the data via MDS in Document Atlas (now Image Atlas, obviously) leads to the following view:

We see that the different image classes are reasonably well separated in the projected space, though a closer analysis reveals that the separation is far from perfect. The data used in these examples comes from several subsets of the public ImageNet collection, as it is a bit more colorful than the pedestrian data we had in the project.
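
For readers who want to reproduce this kind of view outside the tool, a rough equivalent of the Atlas projection can be sketched with scikit-learn's MDS (again an assumption made for illustration; Document/Image Atlas has its own implementation):

```python
# Rough sketch of an Atlas-style view: metric MDS down to 2-D, points
# coloured by class. Features and labels here are random placeholders.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import MDS

X = np.random.rand(200, 64)            # hypothetical image feature vectors
labels = np.random.randint(0, 4, 200)  # hypothetical class labels

xy = MDS(n_components=2, random_state=0).fit_transform(X)
plt.scatter(xy[:, 0], xy[:, 1], c=labels, cmap="tab10", s=12)
plt.title("MDS projection of the image features")
plt.show()
```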

The user can then proceed by creating some initial higher-level concepts via clustering. This is what happens when the first-level concepts are created:

Each node in the graph view also shows the set of most typical features in the given concept. These features characterize the images in the cluster: they hint at which textures are most common in the group and can be used to further interpret the results.
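
One simple way to approximate such concept descriptions, assuming non-negative feature weights (as with texture histogram features), is to list the highest-weight dimensions of each cluster centroid. This is a sketch of the idea, not OntoGen's exact procedure, and the feature names are hypothetical:

```python
# Sketch: describe each concept by the top dimensions of its centroid.
# Assumes non-negative features; names and data are hypothetical.
import numpy as np
from sklearn.cluster import KMeans

X = np.random.rand(200, 64)                          # hypothetical features
feature_names = [f"texture_{i}" for i in range(64)]  # hypothetical names

centers = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X).cluster_centers_
for c, center in enumerate(centers):
    top = np.argsort(center)[::-1][:5]               # five highest-weight dimensions
    print(f"concept {c}:", [feature_names[i] for i in top])
```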

However, not all of the first-level concepts are usually good enough: some contain a mixture of different classes and should be further split into lower-level sub-concepts in the ontology tree. This is possible and very easy to do in OntoGen. The process is the same, K-means clustering on the node's data, and the user can select the desired number of sub-concepts. Here is an example of one such node, visualized by MDS:

Of course, users should always try several different split strategies, as the ideal number of sub-concepts is not known a priori. Such a careful approach usually results in better separation and a higher-quality concept split. The resulting concept tree can then be used to interpret the data. The process is generic, so it can be applied to many different domains, and it is intuitive and easy to use, so we hope that domain experts will benefit from this tool. It has certainly proven useful in our own bilateral project.
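
As one way to compare candidate splits outside the tool, the silhouette score can be computed for several numbers of sub-concepts. This heuristic is my own suggestion for automating the "try several splits" advice, not a built-in OntoGen feature:

```python
# Sketch: score several candidate split sizes with the silhouette score.
# A personal heuristic on hypothetical data, not part of OntoGen itself.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X = np.random.rand(200, 64)  # hypothetical feature vectors of one node's images

for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(f"k={k}: silhouette = {silhouette_score(X, labels):.3f}")
```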
