Social media analysis by innovative hybrid algorithms with label propagation

ALTINEL GİRGİN, AYŞE

doi:10.1016/j.eswa.2022.118606



ALTINEL GİRGİN A. B.

EXPERT SYSTEMS WITH APPLICATIONS, vol.210, 2022 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 210
Publication Date: 2022
DOI: 10.1016/j.eswa.2022.118606
Journal Name: EXPERT SYSTEMS WITH APPLICATIONS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Computer & Applied Sciences, INSPEC, Metadex, Public Affairs Index, Civil Engineering Abstracts
Keywords: Label propagation algorithm, Social media analysis, Topic-based tweet classification, Sentiment polarity detection
Affiliated with Marmara University: Yes

Abstract

Due to the huge volume of data accumulated on microblogging sites, two fundamental questions have recently become very popular: 1) What percentage of this accumulated data has positive or negative sentiment polarity? 2) How is this accumulated data distributed across different topics? Motivated by these questions, this paper presents several algorithms based on the Label Propagation Algorithm (LPA) to handle the two corresponding tasks: sentiment polarity detection and topic-based text classification. These algorithms are the Label Propagated-Relevance Frequency Classifier (LP-RFC) and the LP-Abstract Frequency Classifier (LP-AFC). They can be described as new semantic smoothing classifiers that take advantage of the semantic connections among terms in the label propagation phase of the LPA. Additionally, another classifier, LP-ComRFC+AFC, was built; it is a weighted summation classifier of the individual LP-RFC and LP-AFC. Furthermore, considering the shortage of labeled data in real-world scenarios, a semi-supervised version of LP-RFC and LP-AFC, namely "Merging Unlabeled and Labeled Instances with Semantic Values of Terms" (MULIS), was designed and implemented. Three different datasets were used for the sentiment polarity detection experiments, and a self-collected tweet dataset was used for the topic-based text classification experiments. According to the experimental results, the suggested algorithms and their composite form, LP-ComRFC+AFC, achieved higher F1 scores than all of the baseline algorithms at nearly all of the training splits on the datasets.
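
The abstract describes LP-ComRFC+AFC only as a weighted summation of the two individual classifiers. As an illustration of that general idea, and not the paper's actual formulation, the sketch below combines two per-class score dictionaries with a single mixing weight. The function names, the `weight` parameter, the score values, and the assumption that both classifiers emit comparable per-class scores are all hypothetical.

```python
def combine_scores(rfc_scores, afc_scores, weight=0.5):
    """Weighted sum of two per-class score dicts.

    Hypothetical helper: `weight` is applied to the first classifier's
    scores and (1 - weight) to the second's. The paper may weight or
    normalize differently.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    classes = set(rfc_scores) | set(afc_scores)
    return {
        c: weight * rfc_scores.get(c, 0.0)
        + (1.0 - weight) * afc_scores.get(c, 0.0)
        for c in classes
    }


def predict(rfc_scores, afc_scores, weight=0.5):
    """Return the class with the highest combined score."""
    combined = combine_scores(rfc_scores, afc_scores, weight)
    # Sort class names first so ties are broken deterministically.
    return max(sorted(combined), key=combined.get)


# Made-up scores for a single tweet, for illustration only.
rfc = {"positive": 0.6, "negative": 0.4}
afc = {"positive": 0.2, "negative": 0.8}
label_equal = predict(rfc, afc, weight=0.5)
label_rfc_heavy = predict(rfc, afc, weight=0.9)
```

With equal weighting the second classifier's stronger preference decides the label; shifting the weight toward the first classifier flips it, which is why such a weight would typically be tuned on held-out data.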