Clustering Pornography-Related Keywords on Twitter Social Media Using Latent Dirichlet Allocation

Abstract

Social media is an online communication channel used for community-based sharing. It allows people to connect with one another without meeting face to face. Twitter is one such platform: it gives its users the freedom to create, post, and read posts called tweets, and its user base in Indonesia reached 18.45 million in 2022. The platform has also turned out to be a busy channel for spreading indecent content such as pornography. In this study, the authors identify keywords frequently used in the spread of pornography, using Latent Dirichlet Allocation (LDA) to find the dominant topics in the data and group them automatically. In applying LDA, a stopword feature is used to eliminate unneeded words. The optimal number of topics is determined from the perplexity and coherence values. A total of 15,135 data items obtained through crawling were then mapped into five topics, with the most frequent keywords being sange, nonton, and bokepindo.
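The pipeline outlined in the abstract (stopword removal, LDA, and choosing the number of topics by perplexity) can be illustrated with a minimal sketch. The abstract does not say which tools the authors used, so everything here is an assumption: the toy corpus, the stopword list, the hyperparameters `alpha` and `beta`, and the choice of a small collapsed Gibbs sampler written with the standard library only. Coherence scoring is left out to keep the sketch short.

```python
import math
import random

# Hypothetical stopword list and toy corpus, invented purely for illustration.
STOPWORDS = {"the", "a", "is", "and", "of", "to", "in"}
RAW_DOCS = [
    "the team scored a goal in the match",
    "a late goal won the match",
    "the coach praised the team",
    "bake the bread in a hot oven",
    "the recipe needs flour and butter",
    "knead the flour and bake the bread",
]


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def fit_lda(docs, k, iters=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return count tables and vocabulary."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    v = len(vocab)
    topic_word = [[0] * v for _ in range(k)]  # n(topic, word)
    doc_topic = [[0] * k for _ in docs]       # n(doc, topic)
    topic_total = [0] * k                     # n(topic)

    def update(d, t, w, delta):
        topic_word[t][w] += delta
        doc_topic[d][t] += delta
        topic_total[t] += delta

    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(k) for _ in doc])
        for word, t in zip(doc, assign[d]):
            update(d, t, index[word], +1)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                w = index[word]
                update(d, assign[d][i], w, -1)  # remove the token's current topic
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta) / (topic_total[t] + v * beta)
                    for t in range(k)
                ]
                assign[d][i] = rng.choices(range(k), weights=weights)[0]
                update(d, assign[d][i], w, +1)
    return topic_word, doc_topic, vocab


def perplexity(docs, topic_word, doc_topic, vocab, alpha=0.1, beta=0.01):
    """Training-set perplexity from smoothed point estimates of theta and phi."""
    k, v = len(topic_word), len(vocab)
    index = {w: i for i, w in enumerate(vocab)}
    totals = [sum(row) for row in topic_word]
    log_lik, n_tokens = 0.0, 0
    for d, doc in enumerate(docs):
        for word in doc:
            w = index[word]
            p = sum(
                (doc_topic[d][t] + alpha) / (len(doc) + k * alpha)
                * (topic_word[t][w] + beta) / (totals[t] + v * beta)
                for t in range(k)
            )
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


def top_words(topic_word, vocab, n=3):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


docs = [tokenize(text) for text in RAW_DOCS]
for k in (2, 3, 4):
    topic_word, doc_topic, vocab = fit_lda(docs, k)
    print(k, round(perplexity(docs, topic_word, doc_topic, vocab), 2),
          top_words(topic_word, vocab))
```

Perplexity measured on the training data tends to keep falling as the number of topics grows, which is one reason to read it alongside a coherence score, as the abstract does, rather than on its own. On a real corpus, a held-out split and an established topic-modeling library would be the more usual choice.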

Similar resources

Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation

Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...

Latent Dirichlet Allocation based Recommender System for Twitter Followee Recommendation

A recommender system is a very important tool in today’s social networking and e-commerce systems. Many existing recommender-system technologies depend on bag-of-words based similarity methods to cluster users, and then recommend new followees on the basis of those clusters. These methods are not always correct for many reasons, one of which is incorrect topic identification. Latent Dirichlet Allocation (L...

Assignment 2: Twitter Topic Modeling with Latent Dirichlet Allocation Background

In this assignment we are going to implement a parallel MapReduce version of a popular topic modeling algorithm called Latent Dirichlet Allocation (LDA). Because it allows for exploring vast document collections, we are going to use this algorithm to see if we can automatically identify topics from a series of tweets. For the purpose of this assignment, we are going to treat every tweet as a docu...

Spatial Latent Dirichlet Allocation

In recent years, the language model Latent Dirichlet Allocation (LDA), which clusters co-occurring words into topics, has been widely applied in the computer vision field. However, many of these applications have difficulty with modeling the spatial and temporal structure among visual words, since LDA assumes that a document is a “bag-of-words”. It is also critical to properly design “words” an...

Latent Dirichlet Allocation

We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where t...

Journal

Journal title: JIEET (Journal of Information Engineering and Educational Technology)

Year: 2022

ISSN: 2549-869X

DOI: https://doi.org/10.26740/jieet.v6n2.p66-72