Komparasi Algoritma Topic Modelling LDA VS LSA Pada Berita Detikcom (Comparison of the LDA and LSA Topic Modelling Algorithms on Detikcom News)

Authors

  • Ahmad Kemal Al Izzi Universitas Islam Negeri Sunan Ampel Surabaya
  • Rakadian Audiga Pratama Universitas Islam Negeri Sunan Ampel Surabaya

DOI:

https://doi.org/10.22441/format.2024.v13.i1.005

Keywords:

Topic Modelling, Latent Dirichlet Allocation (LDA), Latent Semantic Analysis (LSA), Detikcom, Topic Coherence

Abstract

This research applies topic modeling to news tweets from the Detikcom account and compares two models, Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). The process begins by crawling data over a one-year period, from December 9, 2022 to December 9, 2023, which yields 958 rows of data. Pre-processing consists of case folding, tokenization, stopword removal, and stemming. After pre-processing, a bag-of-words step counts how often each word occurs in each document, and these word frequencies are the input for building the LSA and LDA models. Each model is configured with 8 topics, 10 iterations, and a random state of 42. Topics are derived from the keywords that appear in the modeling results. The two models are evaluated by measuring topic coherence with the c_v metric. The LSA model reaches a coherence value of 0.5, while the LDA model reaches 0.45. By this measure, LSA performs better than LDA in this case. Further research could apply topic modeling to other cases and explore other topic modeling tools such as OCTIS, which would broaden understanding of how topic modeling algorithms perform on news data from X (formerly Twitter).
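The pre-processing and bag-of-words steps described in the abstract can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the stopword list, the function names `preprocess` and `bag_of_words`, and the sample headlines are all assumptions made for the example. Stemming is left out because it would need a dedicated Indonesian stemmer, and fitting the LSA and LDA models on the resulting counts is outside the scope of the sketch.

```python
import re
from collections import Counter

# Hypothetical, very small stopword list for illustration only; the article
# does not say which Indonesian stopword list was used.
STOPWORDS = {"di", "ke", "dan", "yang", "dari", "untuk", "pada", "ini", "itu"}


def preprocess(text):
    """Apply case folding, tokenization, and stopword removal (no stemming)."""
    lowered = text.lower()                    # case folding
    tokens = re.findall(r"[a-z]+", lowered)   # tokenization: keep alphabetic runs
    return [t for t in tokens if t not in STOPWORDS]


def bag_of_words(docs):
    """Return (vocabulary, rows); rows[i][j] counts vocabulary[j] in docs[i]."""
    tokenized = [preprocess(d) for d in docs]
    vocabulary = sorted({t for tokens in tokenized for t in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)              # term frequencies for one document
        rows.append([counts[term] for term in vocabulary])
    return vocabulary, rows


# Made-up example headlines, not taken from the study's data set.
docs = [
    "Harga beras naik di pasar",
    "Harga BBM naik, harga beras ikut naik",
]
vocabulary, rows = bag_of_words(docs)
print(vocabulary)
print(rows)
```

Each row of `rows` is the word-frequency vector for one document. A document-term matrix of this kind is the usual input for LSA and LDA implementations, which is where settings such as the number of topics, the iteration count, and the random state would be supplied.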

Published

2024-11-07

How to Cite

[1]
A. K. Al Izzi and R. A. Pratama, “Komparasi Algoritma Topic Modelling LDA VS LSA Pada Berita Detikcom”, FORMAT, vol. 13, no. 1, pp. 44–54, Nov. 2024.

Section

Articles
