Publication:
Spam detection on social networks using deep contextualized word representation

dc.contributor.affiliationKirikkale University; Turkish Aeronautical Association; Turk Hava Kurumu University
dc.contributor.authorGhanem, Razan; Erbay, Hasan
dc.date.accessioned2024-06-25T11:44:49Z
dc.date.available2024-06-25T11:44:49Z
dc.date.issued2023
dc.description.abstractSpam detection on social networks, treated as a short-text classification problem, is a challenging task in natural language processing because the text is sparse and ambiguous. A key step in addressing this problem is a powerful text representation. Traditional word embedding models solve the data sparsity problem by representing words with dense vectors, but they have limitations that prevent them from handling some problems effectively. The most common limitation is the out-of-vocabulary problem, in which a model fails to provide any vector representation for words that are not present in its dictionary. Another limitation is context independence: the model outputs a single vector for each word regardless of the word's position in the sentence. To overcome these problems, we propose a new model based on deep contextualized word representation. In this study, we develop CBLSTM (Contextualized Bi-directional Long Short Term Memory neural network), a novel deep learning architecture based on a bidirectional long short-term memory neural network with embeddings from language models, to address the problem of spam text on social networks. Experimental results on three benchmark datasets show that the proposed method achieves high accuracy and outperforms existing state-of-the-art methods for detecting spam on social networks.
dc.description.doi10.1007/s11042-022-13397-8
dc.description.endpage3712
dc.description.issue3
dc.description.pages16
dc.description.researchareasComputer Science; Engineering
dc.description.startpage3697
dc.description.urihttp://dx.doi.org/10.1007/s11042-022-13397-8
dc.description.volume82
dc.description.woscategoryComputer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods; Engineering, Electrical & Electronic
dc.identifier.issn1380-7501
dc.identifier.urihttps://acikarsiv.thk.edu.tr/handle/123456789/1159
dc.language.isoEnglish
dc.publisherSPRINGER
dc.relation.journalMULTIMEDIA TOOLS AND APPLICATIONS
dc.subjectSpam detection; Deep learning; Word embedding; Recurrent neural network; Embedding from language model
dc.subjectACCOUNTS
dc.titleSpam detection on social networks using deep contextualized word representation
dc.typeArticle
dspace.entity.typePublication
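
Illustrative note: the abstract names two limitations of traditional word embeddings, the out-of-vocabulary problem and independence from context. The sketch below is a toy illustration of those two ideas only. It is not the paper's CBLSTM model and not a real language-model embedding: the table values, the character-based vectors, the mixing weights, and all function names are hypothetical and chosen just to make the contrast visible.

```python
from typing import Dict, List, Optional

# Toy static embedding table. The words and values are made up for
# illustration; they do not come from the paper or any trained model.
STATIC_TABLE: Dict[str, List[float]] = {
    "free": [0.9, 0.1],
    "bank": [0.5, 0.5],
    "river": [0.1, 0.9],
    "account": [0.8, 0.3],
}


def static_lookup(word: str) -> Optional[List[float]]:
    """Static embedding: one fixed vector per known word, None if unknown."""
    return STATIC_TABLE.get(word)


def char_vector(word: str) -> List[float]:
    """Build a vector from characters, so any non-empty string gets one.

    This stands in (very loosely) for character-level input, which is one
    way a model can avoid the out-of-vocabulary problem.
    """
    if not word:
        return [0.0, 0.0]
    codes = [ord(c) for c in word]
    return [sum(codes) / (len(codes) * 128.0), len(codes) / 10.0]


def contextual_vectors(tokens: List[str]) -> List[List[float]]:
    """Toy contextual encoder: mix each token with its left and right neighbour.

    The weights (0.5 self, 0.3 left, 0.2 right) are arbitrary assumptions.
    A real bidirectional recurrent network learns how to combine context.
    """
    base = [char_vector(t) for t in tokens]
    zero = [0.0, 0.0]
    mixed = []
    for i, vec in enumerate(base):
        left = base[i - 1] if i > 0 else zero
        right = base[i + 1] if i < len(base) - 1 else zero
        mixed.append([0.5 * v + 0.3 * l + 0.2 * r
                      for v, l, r in zip(vec, left, right)])
    return mixed


if __name__ == "__main__":
    # Out-of-vocabulary: the obfuscated spelling has no entry in the table.
    print("static 'fr33':", static_lookup("fr33"))
    # Context independence: the static vector for "bank" never changes.
    print("static 'bank':", static_lookup("bank"))
    # The toy contextual encoder gives "bank" a different vector per sentence.
    print("contextual 'bank' (river bank):",
          contextual_vectors(["river", "bank"])[1])
    print("contextual 'bank' (bank account):",
          contextual_vectors(["bank", "account"])[0])
```

In this sketch the static table returns nothing for the obfuscated token "fr33" and the same vector for "bank" in every sentence, while the toy contextual function returns a vector for any token and a different vector for "bank" depending on its neighbours.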
