IEEE Access, vol. 13, pp. 190354-190370, 2025 (SCI-Expanded, Scopus)
Face presentation attack detection (FPAD) is crucial for ensuring the accuracy and reliability of face recognition and verification systems. However, existing datasets in the literature are limited in the diversity of presentation attack types they encompass. Deep learning-based solutions that perform well on one dataset may not generalize to others due to domain shift. A potential approach to improving the robustness of deep learning models is to expand dataset diversity through multi-source data collection. However, this process is hindered by challenges such as extensive data labeling and the privacy concerns of centralized data collection. To overcome these challenges, we introduce Fed-StackFPAD, a federated learning (FL) framework for FPAD that integrates a pretrained self-supervised transformer-based masked autoencoder with stacking-based ensemble learning. By leveraging model aggregation across multiple clients, the proposed framework enhances performance without requiring centralized data sharing. The meta-model in the stacking phase combines the complementary strengths of the federated model and the data center-specific models in the ensemble. Experimental evaluations on four different datasets demonstrate that Fed-StackFPAD outperforms existing state-of-the-art methods, highlighting its effectiveness in addressing domain adaptation challenges through FL, self-supervised masked image modeling with vision transformers, and stacking-based model aggregation. Fed-StackFPAD achieves a significant improvement, lowering the average half total error rate (HTER) by 13.7% compared with the current best FL methods.
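The abstract describes two aggregation steps: federated averaging of client models without sharing data, and a stacking meta-model that fuses the federated model's score with the client-specific models' scores. The sketch below illustrates both ideas in minimal form; it is an assumption-laden illustration (FedAvg-style weighted averaging, a logistic-regression meta-model trained by gradient descent), not the paper's actual implementation, and all function names are invented for this example.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """FedAvg-style aggregation: average client parameter vectors,
    weighted by each client's local dataset size."""
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()
    return sum(c * w for c, w in zip(coeffs, client_weights))

def fit_meta_model(base_scores, labels, lr=0.5, epochs=500):
    """Stacking: train a logistic-regression meta-model whose input
    features are the base models' scores (federated model plus each
    data-center-specific model) on held-out samples."""
    X = np.asarray(base_scores, dtype=float)   # (n_samples, n_base_models)
    y = np.asarray(labels, dtype=float)        # 1 = attack, 0 = bona fide
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid of linear combination
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

def meta_predict(w, b, base_scores):
    """Fused attack-likelihood score from the stacked base-model scores."""
    X = np.asarray(base_scores, dtype=float)
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))
```

In this toy setup, each column of `base_scores` is one base model's output; the meta-model learns per-model weights, so a base model that is reliable on a given domain contributes more to the fused decision, which is the intuition behind combining the federated and client-specific models.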