Predicting the Soft Error Vulnerability of Parallel Applications Using Machine Learning

Oz, Isil; Arslan, Sanem

doi:10.1007/s10766-021-00707-0

Oz I., Arslan S.

INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, vol.49, no.3, pp.410-439, 2021 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 49 Issue: 3
Publication Date: 2021
DOI Number: 10.1007/s10766-021-00707-0
Journal Name: INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, ABI/INFORM, Aerospace Database, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Communication Abstracts, Compendex, Computer & Applied Sciences, INSPEC, Metadex, MLA - Modern Language Association Database, zbMATH, Civil Engineering Abstracts
Page Numbers: pp.410-439
Keywords: Fault injection, Machine Learning, Parallel programming, Soft error analysis
Affiliated with Marmara University: Yes

Abstract

With the widespread use of multicore systems built from smaller transistors, soft errors have become an important issue for parallel program execution. Fault injection is a prevalent method for quantifying the soft error rates of applications. However, detailed fault injection experiments are very time-consuming. Therefore, prediction-based techniques have been proposed to evaluate soft error vulnerability more quickly. In this work, we present a soft error vulnerability prediction approach for parallel applications using machine learning algorithms. We define a set of features covering thread communication, data sharing, parallel programming, and performance characteristics, and we train our models with three ML algorithms. This study is the first to use parallel programming features, as well as the combination of all these features, in the vulnerability prediction of parallel programs. We propose two models for soft error vulnerability prediction: (1) a regression model with rigorous feature selection analysis that estimates correct execution rates, and (2) a novel classification model that predicts the vulnerability level of the target programs. We obtain a maximum prediction accuracy of 73.2% for the regression-based model and achieve an F-score of 89% for our classification model.
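
The two quantities named in the abstract, the correct execution rate and the F-score, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the outcome labels ("correct", "sdc", "crash", "hang"), the vulnerability levels ("low", "medium", "high"), the sample data, and the choice of macro averaging for the F-score. The function names are hypothetical.

```python
from collections import Counter


def correct_execution_rate(outcomes):
    """Fraction of fault-injection runs labeled "correct".

    The label set is an assumption; the abstract does not list outcome classes.
    """
    counts = Counter(outcomes)
    return counts["correct"] / len(outcomes)


def macro_f_score(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Macro averaging is an assumption; the abstract does not state
    how its F-score is averaged across classes.
    """
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Hypothetical outcomes of eight fault-injection runs of one program.
runs = ["correct", "correct", "sdc", "crash",
        "correct", "correct", "hang", "correct"]
rate = correct_execution_rate(runs)

# Hypothetical true and predicted vulnerability levels for six programs.
y_true = ["low", "low", "high", "high", "medium", "medium"]
y_pred = ["low", "low", "high", "medium", "medium", "medium"]
score = macro_f_score(y_true, y_pred)

print(f"correct execution rate: {rate:.3f}")
print(f"macro F-score: {score:.3f}")
```

In this sketch the regression target would be a value such as `rate`, measured per program by fault injection, while the classification target would be a discrete level derived from it; how the paper maps rates to levels is not stated in the abstract.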