Optimize Searching Using Latent Dirichlet Allocation

Authors

  • Mary Ann E. Ignaco, Graduate Programs, Technological Institute of the Philippines, Manila, Philippines
  • Melvin A. Ballera, Graduate Programs, Technological Institute of the Philippines, Manila, Philippines

Keywords:

Bible verse, latent dirichlet allocation, recommendation system, topic modeling

Abstract

This research examines topic modeling as a method for exploring large document collections and uncovering the latent topics within them. It focuses on the Latent Dirichlet Allocation (LDA) algorithm from a natural language processing perspective, with the main objective of gaining an in-depth understanding of LDA and its implementation approaches. The study covers dataset preparation, the practical implementation of LDA, and the potential benefits of integrating LDA into recommendation systems. The analysis of LDA includes its graphical representation, fundamental equations, optimization methods, and practical implementation. LDA is applied to verses extracted from the King James Version of the Bible, with each verse treated as a document, and the results show varying levels of topic association across documents: some documents align strongly with a single topic, while others are assigned to multiple topics. The findings reveal patterns and themes in the data and provide insight into the thematic composition of the analyzed documents. The study also contributes to existing knowledge by examining the functionality, practical applications, and relevance of topic modeling and LDA in recommendation systems and natural language processing.
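
The idea that each verse carries its own mixture of topics can be illustrated with a small, self-contained sketch. The abstract does not state which inference method or library the authors used, so the code below is an assumption throughout: it uses collapsed Gibbs sampling (one common way to fit LDA) in plain Python, and the function names `tokenize` and `lda_gibbs`, the hyperparameters `alpha` and `beta`, the choice of two topics, and the four sample verses are illustrative only and are not taken from the paper or its dataset preparation.

```python
import random

def tokenize(text):
    """Lowercase a verse and strip punctuation (illustrative preprocessing only)."""
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    return cleaned.split()

def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables: topic counts per document, word counts per topic, totals per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every word token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][n]
                # Remove the current token from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                assignments[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed topic proportions for each document; each row sums to 1.
    return [
        [(count + alpha) / (len(doc) + num_topics * alpha) for count in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]

# Four King James Version verses used purely as sample input.
verses = [
    "In the beginning God created the heaven and the earth.",
    "The LORD is my shepherd; I shall not want.",
    "Jesus wept.",
    "And God said, Let there be light: and there was light.",
]
docs = [tokenize(v) for v in verses]
theta = lda_gibbs(docs, num_topics=2)

for verse, proportions in zip(verses, theta):
    print([round(p, 2) for p in proportions], verse)
```

Each printed row is one verse's estimated topic proportions. A row dominated by one value corresponds to a document aligned with a single topic, while a flatter row corresponds to a document spread over several topics. With so few, very short verses the numbers themselves are not meaningful; the sketch only shows the shape of the output.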

Published

04.11.2023

How to Cite

Ignaco, M. A. E., & Ballera, M. A. (2023). Optimize Searching Using Latent Dirichlet Allocation. International Journal of Intelligent Systems and Applications in Engineering, 12(3s), 161–166. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/3694

Section

Research Article