An efficient preprocessing stage for the relationship-based clustering framework

BİLGİN, TURGAY; Camurcu, Ali

doi:10.3233/ida-2010-0449

BİLGİN T. T., Camurcu A. Y.

INTELLIGENT DATA ANALYSIS, vol.14, no.6, pp.731-748, 2010 (SCI-Expanded)

Publication Type: Article / Full Article
Volume Number: 14 Issue: 6
Publication Date: 2010
DOI: 10.3233/ida-2010-0449
Journal Name: INTELLIGENT DATA ANALYSIS
Indexes Covering the Journal: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Page Numbers: pp.731-748
Marmara University Affiliated: Yes

Abstract

The goal of this study was to develop an efficient clustering framework for processing high-dimensional datasets with reasonable memory and computing power requirements. Strehl and Ghosh proposed a novel clustering approach and developed a framework called the "relationship-based clustering framework" [1]. In this study, a preprocessing system was implemented on top of their approach and integrated into the relationship-based clustering framework. Three benchmark datasets were used to evaluate its efficiency. The results are presented in tables and charts, and CLUSION graphs are plotted to enable visual evaluation of cluster quality. CPU and memory usage are shown to decrease substantially compared with Strehl and Ghosh's framework [1], without any noticeable decrease in clustering quality. This enables the use of the relationship-based clustering framework on much larger datasets than was previously possible, and also improves its scalability with respect to the number of dimensions.