Spam detection on social networks using deep contextualized word representation

Ghanem, Razan; Erbay, Hasan

dc.contributor.affiliation	Kirikkale University; Turkish Aeronautical Association; Turk Hava Kurumu University
dc.contributor.author	Ghanem, Razan; Erbay, Hasan
dc.date.accessioned	2024-06-25T11:44:49Z
dc.date.available	2024-06-25T11:44:49Z
dc.date.issued	2023
dc.description.abstract	Spam detection on social networks, treated as a short text classification problem, is a challenging task in natural language processing because the text is sparse and ambiguous. A key step in addressing this problem is building a powerful text representation. Traditional word embedding models solve the data sparsity problem by representing words with dense vectors, but they have limitations that prevent them from handling some cases effectively. The most common limitation is the out-of-vocabulary problem, in which the models fail to provide any vector representation for words that are not present in the model's dictionary. Another limitation is context independence: the models output just one vector for each word, regardless of the word's position in the sentence. To overcome these problems, we propose a new model based on deep contextualized word representation. In this study, we develop CBLSTM (Contextualized Bi-directional Long Short Term Memory neural network), a novel deep learning architecture based on a bidirectional long short-term memory neural network with embeddings from language models, to address the problem of spam texts on social networks. Experimental results on three benchmark datasets show that the proposed method achieves high accuracy and outperforms existing state-of-the-art methods for detecting spam on social networks.
dc.description.doi	10.1007/s11042-022-13397-8
dc.description.endpage	3712
dc.description.issue	3
dc.description.pages	16
dc.description.researchareas	Computer Science; Engineering
dc.description.startpage	3697
dc.description.uri	http://dx.doi.org/10.1007/s11042-022-13397-8
dc.description.volume	82
dc.description.woscategory	Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods; Engineering, Electrical & Electronic
dc.identifier.issn	1380-7501
dc.identifier.uri	https://acikarsiv.thk.edu.tr/handle/123456789/1159
dc.language.iso	English
dc.publisher	SPRINGER
dc.relation.journal	MULTIMEDIA TOOLS AND APPLICATIONS
dc.subject	Spam detection; Deep learning; Word embedding; Recurrent neural network; Embedding from language model
dc.subject	ACCOUNTS
dc.title	Spam detection on social networks using deep contextualized word representation
dc.type	Article
dspace.entity.type	Publication

Collections

WOS - Web of Science
