Archive for the ‘Data Mining’ Category

A publication in the IEEE Transactions on Knowledge and Data Engineering

April 3rd, 2013

Our clustering paper was recently accepted for publication in the journal IEEE Transactions on Knowledge and Data Engineering (TKDE; impact factor 1.7). It is titled "The Role of Hubness in Clustering High-Dimensional Data" and is an extended version of the paper that received the best research paper runner-up award at the PAKDD 2011 conference in Shenzhen, China.

We have shown that hubs in the kNN topology can be successfully exploited for data clustering. Furthermore, we have shown that point hubness (the in-degree of a point in the kNN graph, i.e. the number of k-nearest-neighbor lists in which it appears) is a much better measure of local cluster centrality than density, under the assumption of high intrinsic data dimensionality.
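As a rough illustration of what the hubness score measures (this is not the code used in the paper), the sketch below counts how many k-nearest-neighbor lists each point appears in. The function name `hubness_scores`, the use of Euclidean distance and the brute-force distance computation are all assumptions made for the example.

```python
import numpy as np

def hubness_scores(X, k):
    """Count, for each row of X, how many other points list it among their k nearest neighbors.

    Illustrative sketch only: brute-force Euclidean distances, O(n^2 * d) memory.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all points.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    # A point is never counted as its own neighbor.
    np.fill_diagonal(dist, np.inf)
    # Row i holds the indices of the k nearest neighbors of point i.
    knn = np.argsort(dist, axis=1)[:, :k]
    # In-degree in the kNN graph: how often each index occurs in any neighbor list.
    return np.bincount(knn.ravel(), minlength=n)

# Example on synthetic high-dimensional data (assumed setup, not from the paper).
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 50))
scores = hubness_scores(data, k=5)
hub_candidates = np.argsort(scores)[::-1][:10]  # the ten points with the highest hubness
```

Every point contributes exactly k outgoing neighbor links, so the scores always sum to n times k; what varies with dimensionality is how unevenly those links are distributed across the points.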

We have analyzed three proof-of-concept hubness-based clustering algorithms: K-hubs, global hubness-proportional clustering (GHPC) and global hubness-proportional K-means. The experimental evaluation confirms the robustness of hubness-based clustering and suggests that this might indeed be a promising research direction.

GHPC clustering process


The change in cluster entropy (non-homogeneity) with increasing levels of noise. The proposed methods greatly outperform the standard K-means++ algorithm.
