Comparison of Dimension Reduction Techniques on High Dimensional Datasets


YILDIZ K., Camurcu Y., DOĞAN B.

INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, vol.15, no.2, pp.256-262, 2018 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 15 Issue: 2
  • Publication Date: 2018
  • Journal Name: INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Page Numbers: pp.256-262
  • Keywords: High dimensional data, clustering, dimensionality reduction, data mining, FUZZY C-MEANS, CLASSIFICATION
  • Marmara University Affiliated: Yes

Abstract

High dimensional data has become very common with the rapid growth of data stored in databases and other information stores, which makes clustering such data a pressing problem. Well-known clustering algorithms are not adequate in high dimensional space because of the problem known as the curse of dimensionality. Dimensionality reduction techniques have therefore been used to obtain more accurate clustering results and to reduce clustering time in high dimensional space. In this work, different dimensionality reduction techniques were combined with the Fuzzy C-Means clustering algorithm, with the aim of reducing the complexity of high dimensional datasets and generating more accurate clustering results. The clustering results were compared in terms of cluster purity, cluster entropy, and mutual information. The dimension reduction techniques were compared in terms of current Central Processing Unit (CPU) usage, current memory usage, and elapsed CPU time. The experiments showed that the proposed approach produces promising results in high dimensional space.
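
The three clustering quality measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so the code below assumes the common textbook definitions: purity as the fraction of points that belong to the majority class of their cluster, entropy as the size-weighted class entropy within each cluster, and mutual information computed from the cluster/class contingency counts. The function names and the toy labels are hypothetical, and the sketch assumes hard cluster labels (for Fuzzy C-Means, for example, each point assigned to the cluster with its highest membership).

```python
from collections import Counter
from math import log2


def contingency(clusters, classes):
    """Count how many points of each true class fall into each cluster."""
    table = {}
    for k, c in zip(clusters, classes):
        table.setdefault(k, Counter())[c] += 1
    return table


def purity(clusters, classes):
    """Fraction of points that belong to the majority class of their cluster."""
    table = contingency(clusters, classes)
    return sum(max(counts.values()) for counts in table.values()) / len(clusters)


def entropy(clusters, classes):
    """Size-weighted average of the class entropy (in bits) inside each cluster."""
    n = len(clusters)
    total = 0.0
    for counts in contingency(clusters, classes).values():
        size = sum(counts.values())
        h = -sum((v / size) * log2(v / size) for v in counts.values())
        total += (size / n) * h
    return total


def mutual_information(clusters, classes):
    """Mutual information (in bits) between cluster labels and class labels."""
    n = len(clusters)
    table = contingency(clusters, classes)
    class_sizes = Counter(classes)
    mi = 0.0
    for counts in table.values():
        size = sum(counts.values())
        for c, v in counts.items():
            mi += (v / n) * log2(v * n / (size * class_sizes[c]))
    return mi


# Hypothetical toy data: six points, two clusters, two true classes.
clusters = [0, 0, 0, 1, 1, 1]
classes = ["a", "a", "b", "b", "b", "b"]
print(purity(clusters, classes))
print(entropy(clusters, classes))
print(mutual_information(clusters, classes))
```

Under these definitions, higher purity and mutual information and lower entropy indicate clusters that agree more closely with the known class labels.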