A Novel Hybrid House Price Prediction Model


AKYÜZ S., EYGİ ERDOĞAN B., Yildiz O., Atas P. K.

COMPUTATIONAL ECONOMICS, vol.62, no.3, pp.1215-1232, 2023 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 62 Issue: 3
  • Publication Date: 2023
  • DOI Number: 10.1007/s10614-022-10298-8
  • Journal Name: COMPUTATIONAL ECONOMICS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), Scopus, IBZ Online, International Bibliography of Social Sciences, ABI/INFORM, EconLit, INSPEC, zbMATH
  • Pages: pp.1215-1232
  • Keywords: Housing pricing, Support vector regression, K-means clustering, K-NN classification, DETERMINANTS, REGRESSION
  • Marmara University Affiliated: Yes

Abstract

The real estate sector is changing rapidly as housing demand increases, and new luxury housing projects appear every day. The reliability of housing market investments depends largely on accurate pricing. The aim of this study is to introduce a dynamic pricing procedure that estimates house prices from the most important characteristics of a house. For this purpose, a hybrid algorithm combining linear regression, clustering analysis, nearest neighbor classification and Support Vector Regression (SVR) is proposed. The hybrid algorithm uses the output of one method as the input of another in order to deal with the heteroscedastic nature of housing data. In other words, it creates housing clusters from the available data set, classifies houses whose cluster is unknown, and makes price predictions with a separate prediction model for each class. Housing data collected through manual web scraping for the Kadikoy district of Istanbul were used to train and validate the proposed algorithm. In addition, we validated the algorithm on the KAGGLE house dataset, which covers a wide range of features. The results of the hybrid algorithm were compared with those of multiple linear regression, Lasso, ridge regression, SVR, AdaBoost, decision tree, random forest and XGBoost regression. Experimental results show that the proposed hybrid model is superior in terms of Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and adjusted R-squared for both the Kadikoy and KAGGLE housing datasets.
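The cluster, classify, then predict-per-class idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, clusters on standardized features only, leaves out the linear regression step (the abstract does not say how it is used), and the function names `fit_hybrid` and `predict_hybrid`, the hyperparameters, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def fit_hybrid(X, y, n_clusters=3, n_neighbors=5, seed=0):
    """Fit K-means clusters, a K-NN cluster classifier, and one SVR per cluster.

    Hypothetical helper for illustration; hyperparameters are arbitrary.
    """
    scaler = StandardScaler().fit(X)
    Xs = scaler.transform(X)
    # Step 1: group the training houses into clusters.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(Xs)
    # Step 2: train a classifier that assigns unseen houses to a cluster.
    knn = KNeighborsClassifier(n_neighbors=n_neighbors).fit(Xs, labels)
    # Step 3: fit a separate price model on the houses of each cluster.
    models = {}
    for c in np.unique(labels):
        mask = labels == c
        models[int(c)] = SVR(kernel="rbf", C=100.0).fit(Xs[mask], y[mask])
    return scaler, knn, models


def predict_hybrid(fitted, X_new):
    """Route each new house to its predicted cluster's SVR model."""
    scaler, knn, models = fitted
    Xs = scaler.transform(X_new)
    clusters = knn.predict(Xs)
    preds = np.empty(len(Xs))
    for c in np.unique(clusters):
        mask = clusters == c
        preds[mask] = models[int(c)].predict(Xs[mask])
    return preds


if __name__ == "__main__":
    # Synthetic data (assumption): area in m^2, rooms, building age in years.
    rng = np.random.default_rng(0)
    n = 300
    X = np.column_stack([
        rng.uniform(40, 250, n),
        rng.integers(1, 6, n),
        rng.uniform(0, 50, n),
    ])
    # Price in thousands; noise grows with area to mimic heteroscedasticity.
    y = 5.0 * X[:, 0] + 40.0 * X[:, 1] - 3.0 * X[:, 2]
    y = y + rng.normal(0, 0.2 * X[:, 0])

    fitted = fit_hybrid(X[:240], y[:240])
    preds = predict_hybrid(fitted, X[240:])
    rmse = np.sqrt(np.mean((y[240:] - preds) ** 2))
    print(f"hold-out RMSE: {rmse:.1f}")
```

Fitting one model per cluster lets each model adapt to the price level and variance of its own segment, which is one way to read the abstract's remark about heteroscedastic housing data.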