The Hub Word-Cloud

October 23rd, 2012

My research has recently been going in several different directions.
To help summarize them, I have generated topic word clouds from my papers.

First of all, there is clustering: exploiting hubness, which is a consequence of the curse of dimensionality, to improve clustering performance in high-dimensional data. This extends the work first presented in my award-winning PAKDD paper, The Role of Hubness in Clustering High-dimensional Data.

Hubs in Clustering
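To make the notion of hubness concrete, here is a minimal sketch that counts how often each point appears in the k-nearest-neighbor lists of other points and reports the skewness of those counts. The synthetic Gaussian data, the function names, and the parameter values are assumptions chosen for illustration; they do not come from the paper.

```python
import math
import random
from collections import Counter


def knn_indices(points, k):
    """For each point, return the indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append([j for _, j in ranked[:k]])
    return neighbors


def k_occurrences(points, k):
    """Count how often each point occurs in the k-neighbor lists of the others."""
    counts = Counter()
    for nbrs in knn_indices(points, k):
        counts.update(nbrs)
    return [counts[i] for i in range(len(points))]


def skewness(values):
    """Third standardized moment; a positive value means a long right tail."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return 0.0
    return sum((v - mean) ** 3 for v in values) / n / var ** 1.5


# Illustrative data only: i.i.d. Gaussian points in a low and a high dimension.
random.seed(0)
n, k = 200, 5
for dim in (2, 100):
    data = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    occ = k_occurrences(data, k)
    print(f"dim={dim} max occurrences={max(occ)} skewness={skewness(occ):.2f}")
```

Points with unusually high occurrence counts are the hubs. Comparing the two printed lines gives a rough feel for how the occurrence distribution changes with dimensionality on this toy data.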

Then, there is the work I did on metric learning under the assumption of hubness, which yielded some surprisingly good results. I will probably post some updates on that soon. The approach I proposed can essentially be viewed as an extension of a cosine similarity used in collaborative filtering, in which the influence of hub points is taken into account and properly modeled.

Shared neighbor distances for high-dimensional data
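As a rough illustration of the idea, the sketch below computes a plain shared-neighbor similarity (the fraction of k-nearest neighbors that two points have in common) next to a hypothetical hub-aware variant in which frequently occurring neighbors count for less. The logarithmic weight is an assumption made for this example and should not be read as the exact formulation from the paper.

```python
import math
from collections import Counter


def knn_sets(points, k):
    """For each point, return the set of indices of its k nearest neighbors."""
    sets = []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        sets.append({j for _, j in ranked[:k]})
    return sets


def shared_neighbor_similarity(sets, i, j, k):
    """Plain overlap: fraction of the k neighbors that points i and j share."""
    return len(sets[i] & sets[j]) / k


def hub_weighted_similarity(sets, i, j, k):
    """Hypothetical variant: shared neighbors that occur in many lists count less."""
    n = len(sets)
    occurrences = Counter(idx for s in sets for idx in s)

    def weight(idx):
        # Assumed weighting: log(n / occurrences), scaled into [0, 1] by log(n).
        return math.log(n / occurrences[idx]) / math.log(n)

    return sum(weight(idx) for idx in sets[i] & sets[j]) / k


# Tiny one-dimensional example, for illustration only.
points = [[0.0], [1.0], [2.0], [10.0]]
k = 2
sets = knn_sets(points, k)
print(shared_neighbor_similarity(sets, 0, 3, k))
print(hub_weighted_similarity(sets, 0, 3, k))
```

In this toy example the two shared neighbors appear in most of the neighbor lists, so the weighted score comes out lower than the plain overlap. That is the intended effect: agreement on a hub is weaker evidence of similarity than agreement on a rarely occurring neighbor.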

We have also worked on supervised cross-lingual document retrieval, where hubs play an important role.

Cross-lingual hubness-aware document retrieval

Last, but not least, there is my work on improving k-nearest neighbor classification in high-dimensional data. I have explored several ways of doing so, and I am in the process of revising some of the proposed approaches in order to better model the neighbor co-occurrences and provide a more robust alternative to other nearest neighbor methods. The initial tests, however, show that hubness-aware kNN classification performs very well even in its basic (proof of concept) form and outperforms many standard methods on high-dimensional data.

Hubness-aware classification
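One simple way to picture hubness-aware voting is to down-weight neighbors that often show up in the neighbor lists of points with a different label. The sketch below is a generic illustration of that idea under my own assumptions (the exponential weight, the function names, and the toy data are all hypothetical); it is not the set of methods referred to above.

```python
import math
from collections import Counter, defaultdict


def nearest(train, query, k, skip=None):
    """Indices of the k training points closest to query, optionally skipping one."""
    ranked = sorted(
        (math.dist(x, query), i) for i, (x, _) in enumerate(train) if i != skip
    )
    return [i for _, i in ranked[:k]]


def bad_occurrences(train, k):
    """Count how often each point is a neighbor of a point with a different label."""
    bad = Counter()
    for i, (x, label) in enumerate(train):
        for j in nearest(train, x, k, skip=i):
            if train[j][1] != label:
                bad[j] += 1
    return [bad[i] for i in range(len(train))]


def hubness_weighted_knn(train, query, k):
    """kNN vote in which neighbors with many bad occurrences get smaller weights."""
    # Recomputed on every call for simplicity; cache it when classifying many queries.
    bad = bad_occurrences(train, k)
    mean = sum(bad) / len(bad)
    std = math.sqrt(sum((b - mean) ** 2 for b in bad) / len(bad)) or 1.0
    votes = defaultdict(float)
    for j in nearest(train, query, k):
        # Assumed weighting: exponential decay in the standardized bad-occurrence count.
        votes[train[j][1]] += math.exp(-(bad[j] - mean) / std)
    return max(votes, key=votes.get)


# Toy training set of (features, label) pairs, for illustration only.
train = [((0.0,), "a"), ((0.1,), "a"), ((0.2,), "a"),
         ((5.0,), "b"), ((5.1,), "b"), ((5.2,), "b")]
print(hubness_weighted_knn(train, (0.05,), k=2))
```

On well-separated data like this, no neighbor has bad occurrences, so the vote reduces to ordinary kNN. The weighting only changes the outcome when some neighbors are known to mislead points of other classes.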

Categories: Hubness, Visualization