IEEE Access, vol. 11, pp. 127302-127316, 2023 (SCI-Expanded)
High-dimensional data have become increasingly common in many scientific fields as technology for collecting and storing large datasets has advanced rapidly. As in any modeling process with high-dimensional data, accurately identifying a subset of the features and reducing the dimensionality is very important when fitting the Cox model. Numerous penalized techniques for the Cox model with high-dimensional data have been developed to handle the multicollinearity problem and decrease variability. Adaptive Elastic-net is one of the penalized methods used for feature selection; it both handles the grouping effect and has the oracle property. However, whether Adaptive Elastic-net delivers these advantages for variable selection in the Cox model depends on the optimal selection of its hyperparameters, α and λ, so choosing them appropriately is important. The hyperparameters are generally selected by maximizing the k-fold cross-validated log partial likelihood through a grid search over (α, λ). However, this method does not guarantee optimal α and λ values, because in a grid search the hyperparameters can typically take only the values specified in a limited sequence on the grid. The purpose of this study is to propose a novel method, based on modified particle swarm optimization (MPSO), for determining the optimal hyperparameter pair (α, λ) of Adaptive Elastic-net for variable selection in the Cox model with high-dimensional data. The proposed metaheuristic-based method has been evaluated in extensive simulation studies by comparing it with different traditional penalized methods using various evaluation criteria under different scenarios.
According to the comprehensive simulation study, the proposed method outperforms the other penalized methods in variable selection, prediction, and estimation accuracy for the Cox model with high-dimensional data.
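
The contrast between a grid search and a swarm-based search over (α, λ) can be illustrated with a minimal sketch. It rests on several assumptions: the code uses a textbook particle swarm optimizer, not the modified variant (MPSO) proposed in the study; `toy_cv_score` is a hypothetical smooth surface standing in for the k-fold cross-validated log partial likelihood; and all function names, bounds, and swarm settings are illustrative choices rather than values taken from the paper.

```python
import random


def toy_cv_score(alpha, lam):
    """Hypothetical stand-in for a cross-validated log partial likelihood.

    Higher is better. The maximum is placed at alpha=0.37, lam=0.083,
    which deliberately does not lie on the coarse grid used below.
    """
    return -((alpha - 0.37) ** 2) - 10.0 * (lam - 0.083) ** 2


def grid_search(score, alphas, lams):
    """Return the (alpha, lam) pair on the grid with the highest score."""
    return max(((a, l) for a in alphas for l in lams),
               key=lambda pair: score(*pair))


def pso(score, bounds, n_particles=20, n_iter=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Textbook PSO (an assumption, not the study's MPSO) maximizing `score`.

    bounds: list of (low, high) per dimension; positions are clamped to them.
    w, c1, c2: inertia, cognitive, and social coefficients.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # best position per particle
    pbest_val = [score(*p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position of swarm

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = score(*pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return tuple(gbest), gbest_val


if __name__ == "__main__":
    grid_best = grid_search(toy_cv_score,
                            [0.0, 0.25, 0.5, 0.75, 1.0],
                            [0.01, 0.1, 1.0])
    swarm_best, swarm_val = pso(toy_cv_score, [(0.0, 1.0), (0.001, 1.0)])
    print("grid :", grid_best, toy_cv_score(*grid_best))
    print("swarm:", swarm_best, swarm_val)
```

On this toy surface the grid can only return one of its fifteen prespecified pairs, while the swarm searches the continuous rectangle and can approach the maximum between grid points. In a real analysis the stand-in objective would be replaced by the cross-validated log partial likelihood of an Adaptive Elastic-net Cox fit, which is far more expensive to evaluate, so the number of particles and iterations would need to be chosen with that cost in mind.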