A Model for Detecting of Persian Rumors based on the Analysis of Contextual Features in the Content of Social Networks

Authors

  • Sharifi, Arash Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran.
Abstract:

A rumor is a collective attempt to interpret a vague but attractive situation by using the power of words, so identifying the language of rumors can help to detect them. Previous research on rumor detection has focused more on the contextual information in reply tweets and less on the content features of the original rumor. Most studies have addressed the English language, and only limited work has been done on detecting rumors in Persian. This study analyzes the content of the original rumor and introduces informative content features for the early identification of Persian rumors (i.e., when a rumor has been published on news media but has not yet spread on social media) on Twitter and Telegram. The proposed model is based on physical and non-physical content features in three categories: lexical, syntactic, and pragmatic. These features combine common content features with newly proposed content-based features. Since no social context information is available at the time a rumor is posted, the proposed model is independent of propagation-based features and relies on the content-based information of the original rumor. Although the proposed model ignores much information (including user information, the user's reaction to the rumor, and propagation structures), content analysis of the original rumor can still yield information that is helpful for classification. Several experiments were performed on various combinations of the feature sets (i.e., common and proposed content features) to explore how well the features distinguish rumors from non-rumors, separately and jointly. To this end, three machine learning algorithms, Random Forest (RF), AdaBoost, and Support Vector Machine (SVM), were used as strong classifiers to evaluate the accuracy of the proposed model.
To achieve the best performance of the classification algorithms on the training dataset, feature selection techniques are necessary. In this study, the Sequential Forward Floating Search (SFFS) approach was used to select valuable features. In addition, t-test results (p-value <= 0.05) demonstrate that most of the new features proposed in this study differ significantly between rumor and non-rumor documents. The experimental results show that the newly proposed features improve the accuracy of rumor detection. The F-measure of the proposed model for detecting Persian rumors was 0.848 on the Twitter dataset, 0.952 on the Kermanshah earthquake dataset, and 0.867 on the Telegram dataset, which indicates that the proposed method can identify rumors by focusing only on the content features of the original rumor text. The evaluation on Twitter rumors shows that, despite the short length of tweets and the limited content information that can be extracted from them, the proposed model can detect Twitter rumors with acceptable accuracy. These results support the ability of content features to distinguish rumors from non-rumors.
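
The abstract names Sequential Forward Floating Search (SFFS) but does not describe the implementation used in the study. The following is only a minimal sketch of the general SFFS idea: greedily add the feature that most improves a criterion, then "float" backward by dropping features whenever that beats the best subset already seen at the smaller size. The function `sffs`, the criterion `toy_score`, and the four integer feature indices are hypothetical and chosen for illustration; in practice the criterion would more plausibly be a cross-validated classifier score such as the F-measure.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Subset = FrozenSet[int]


def sffs(n_features: int,
         score: Callable[[Subset], float],
         k: int) -> Dict[int, Tuple[float, Subset]]:
    """Return the best (score, subset) found for each subset size up to k."""
    if not 1 <= k <= n_features:
        raise ValueError("k must be between 1 and n_features")
    selected: Subset = frozenset()
    best: Dict[int, Tuple[float, Subset]] = {}
    while len(selected) < k:
        # Forward step: add the single feature that maximizes the criterion.
        candidates = [selected | {f} for f in range(n_features)
                      if f not in selected]
        selected = max(candidates, key=score)
        s = score(selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, selected)
        # Floating step: drop features while doing so strictly beats the
        # best subset recorded so far at the smaller size.
        while len(selected) > 2:
            reduced = max((selected - {f} for f in sorted(selected)),
                          key=score)
            r = score(reduced)
            if r > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (r, reduced)
            else:
                break
    return best


def toy_score(subset: Subset) -> float:
    """Hypothetical criterion: features 1 and 2 are only strong together."""
    if not subset:
        return 0.0
    base = {0: 0.6, 1: 0.5, 2: 0.5, 3: 0.1}
    value = max(base[f] for f in subset)
    if {1, 2} <= subset:
        value = 0.9
    return value + 0.01 * (len(subset) - 1)


best = sffs(n_features=4, score=toy_score, k=3)
for size in sorted(best):
    print(size, sorted(best[size][1]))
```

In this toy setting, plain forward selection would settle on a pair containing feature 0, because feature 0 is the best single feature. The floating step revisits that choice after the third feature is added and records the pair of features 1 and 2 as the best subset of size two.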

Journal title

volume 18, issue 1

pages 29-50

publication date 2021-05
