Preprocessing Framework for Twitter Bot Detection

Kantepe M., GANİZ M. C.

2017 International Conference on Computer Science and Engineering (UBMK), Antalya, Türkiye, 5-8 October 2017, pp. 630-634

  • Publication Type: Conference Paper / Full-Text Paper
  • City of Publication: Antalya
  • Country of Publication: Türkiye
  • Page Numbers: pp. 630-634


One of the major problems on social media platforms such as Twitter is the large number of social bots, or sybil accounts, which are controlled by automated agents and are generally used for malicious activities. These activities include directing visitors to certain websites (which can be considered spam), influencing a community on a specific topic, spreading misinformation, recruiting people to illegal organizations, manipulating people into stock market actions, and blackmailing people with the threat of spreading their private information through these accounts. Consequently, social bot detection is of great importance for keeping people safe from these harmful effects. In this study, we approach social bot detection on Twitter as a supervised classification problem and apply machine learning algorithms after extensive data preprocessing and feature extraction. A large number of features are extracted by analyzing Twitter user accounts in terms of posted tweets, profile information, and temporal behavior. To obtain labeled data, we use accounts that have been suspended by Twitter, under the assumption that the majority of these are social bot accounts. Our results demonstrate that our framework can distinguish between bot and normal accounts with reasonable accuracy.
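
The three feature groups named in the abstract (posted tweets, profile information, temporal behavior) and the suspension-based labeling can be illustrated with a minimal Python sketch. This is not the authors' feature set or code: the account schema (`tweets`, `followers`, `friends`, `description`, `suspended`) and every individual feature below are hypothetical choices made only for illustration.

```python
from statistics import mean, pstdev


def extract_features(account):
    """Map one account record (hypothetical schema) to a flat feature dict."""
    tweets = account["tweets"]
    texts = [t["text"] for t in tweets]
    # Timestamps are assumed to be seconds since some reference point.
    times = sorted(t["timestamp"] for t in tweets)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    n = len(tweets)
    return {
        # Tweet content features (illustrative, not from the paper)
        "url_ratio": sum("http" in s for s in texts) / n if n else 0.0,
        "mean_tweet_len": mean(len(s) for s in texts) if n else 0.0,
        # Profile features (illustrative)
        "follower_friend_ratio": account["followers"] / max(account["friends"], 1),
        "has_description": int(bool(account["description"])),
        # Temporal features (illustrative): spacing between consecutive tweets
        "mean_gap_s": mean(gaps) if gaps else 0.0,
        "gap_std_s": pstdev(gaps) if gaps else 0.0,
    }


def weak_label(account):
    """Return 1 for a suspended account (assumed to be a bot), else 0."""
    return int(account["suspended"])


# A made-up account record used only to exercise the functions above.
sample = {
    "tweets": [
        {"text": "Buy now http://example.com", "timestamp": 0},
        {"text": "Buy now http://example.com", "timestamp": 60},
        {"text": "hello", "timestamp": 120},
    ],
    "followers": 10,
    "friends": 200,
    "description": "",
    "suspended": True,
}

features = extract_features(sample)
label = weak_label(sample)
```

Feature dictionaries of this kind, paired with the weak labels, could then be passed to any supervised classifier. Because suspension is only a proxy for bot status, the resulting labels should be expected to contain some noise.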