Preprocessing Framework for Twitter Bot Detection

Kantepe M., GANİZ M. C.

2017 International Conference on Computer Science and Engineering (UBMK), Antalya, Türkiye, 5-8 October 2017, pp. 630-634 (Full Text Paper)

Publication Type: Conference Paper / Full Text Paper
City of Publication: Antalya
Country of Publication: Türkiye
Page Numbers: pp. 630-634
Keywords: component, sybil account, social bot, bot detection, feature engineering, model construction, machine learning
Affiliated with Marmara University: Yes

Abstract

One of the important problems on social media platforms such as Twitter is the large number of social bots, or sybil accounts, which are controlled by automated agents and generally used for malicious activities. These activities include directing more visitors to certain websites (which can be considered spam), influencing a community on a specific topic, spreading misinformation, recruiting people to illegal organizations, manipulating people into stock market actions, and blackmailing people with the threat of spreading their private information through these accounts. Consequently, social bot detection is of great importance for keeping people safe from these harmful effects. In this study, we approach social bot detection on Twitter as a supervised classification problem and apply machine learning algorithms after extensive data preprocessing and feature extraction. A large number of features are extracted by analyzing the posted tweets, profile information, and temporal behavior of Twitter user accounts. To obtain labeled data, we use accounts that have been suspended by Twitter, on the assumption that the majority of these are social bot accounts. Our results demonstrate that our framework can distinguish between bot and normal accounts with reasonable accuracy.
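The pipeline outlined in the abstract (features drawn from tweets, profile information, and temporal behavior, with suspension status as a proxy label, fed to a supervised classifier) might be sketched roughly as follows. This is an illustration only, not the authors' implementation: the account schema, the six features, the toy data, and the nearest-centroid classifier are all assumptions chosen to keep the example short and dependency-free. The abstract does not say which features or algorithms the paper uses.

```python
from datetime import datetime
from statistics import pstdev


def extract_features(account):
    """Map one account record (hypothetical schema) to a numeric feature vector."""
    tweets = account["tweets"]
    texts = [t["text"] for t in tweets]
    times = sorted(datetime.fromisoformat(t["created_at"]) for t in tweets)
    # Gaps between consecutive tweets, in seconds.
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    n = len(texts)
    return [
        # Tweet content (assumed features)
        sum("http" in s for s in texts) / n if n else 0.0,    # share of tweets with a link
        sum(s.count("#") for s in texts) / n if n else 0.0,   # '#' characters per tweet
        # Profile information (assumed features)
        account["followers"] / (account["friends"] + 1),      # follower-to-friend ratio
        1.0 if account["description"] else 0.0,               # has a profile description
        # Temporal behavior (assumed features)
        sum(gaps) / len(gaps) if gaps else 0.0,               # mean gap between tweets
        pstdev(gaps) if len(gaps) > 1 else 0.0,               # spread of the gaps
    ]


def label_of(account):
    """Proxy label: suspended accounts are assumed to be bots, as in the abstract."""
    return "bot" if account["suspended"] else "normal"


def train_centroids(X, y):
    """Nearest-centroid model: one mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


# Toy, made-up accounts for illustration only.
bot = {
    "tweets": [
        {"text": "Win now http://spam.example #deal #free", "created_at": "2017-01-01T00:00:00"},
        {"text": "Click http://spam.example #deal", "created_at": "2017-01-01T00:01:00"},
        {"text": "Last chance http://spam.example #free", "created_at": "2017-01-01T00:02:00"},
    ],
    "followers": 3, "friends": 900, "description": "", "suspended": True,
}
human = {
    "tweets": [
        {"text": "Lovely morning in Antalya", "created_at": "2017-01-01T08:15:00"},
        {"text": "Reading a paper on #ml", "created_at": "2017-01-02T21:40:00"},
        {"text": "Coffee time", "created_at": "2017-01-05T10:05:00"},
    ],
    "followers": 150, "friends": 180, "description": "Student", "suspended": False,
}

accounts = [bot, human]
X = [extract_features(a) for a in accounts]
y = [label_of(a) for a in accounts]
model = train_centroids(X, y)
```

In practice the features would need to be standardized before any distance-based classifier is applied, since the raw time gaps (in seconds) would otherwise dominate the other features, and the model would be evaluated on held-out accounts rather than on its training data.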