Optimize Searching Using Latent Dirichlet Allocation
Keywords:
Bible verse, latent Dirichlet allocation, recommendation system, topic modeling
Abstract
This research focuses on topic modeling as a method for exploring large document collections and uncovering latent topics in the data. Specifically, it examines the Latent Dirichlet Allocation (LDA) algorithm from the perspective of natural language processing. The main objective is to gain an in-depth understanding of LDA and its implementation approaches. The study covers dataset preparation, a practical implementation of LDA, and the potential benefits of integrating LDA into recommendation systems. The analysis of LDA encompasses its graphical representation, fundamental equations, optimization methods, and practical implementation. LDA is applied to verses extracted from the King James Version of the Bible, treated as documents, and reveals varying levels of topic association across documents: some documents align strongly with a single topic, while others are assigned to multiple topics. The findings reveal patterns and themes in the data and provide insight into the thematic composition of the analyzed documents. Additionally, the study contributes to existing knowledge by exploring the functionality, practical applications, and relevance of topic modeling and LDA in recommendation systems and natural language processing.
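To make the idea of per-document topic associations concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling. The abstract does not state which inference method or library the study used, so the sampler, the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy token lists are all assumptions chosen for illustration; the toy documents are not taken from the Bible dataset.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA (hypothetical helper).

    docs is a list of token lists. Returns (theta, topic_word), where theta
    holds smoothed topic proportions per document and topic_word holds word
    counts per topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Assign every token to a random topic to start.
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Weight of topic k is proportional to
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed topic proportions for each document (each row sums to 1).
    theta = [
        [(n + alpha) / (sum(row) + num_topics * alpha) for n in row]
        for row in doc_topic
    ]
    return theta, topic_word


# Toy corpus (made up for this sketch, not drawn from the study's data).
docs = [
    "shepherd sheep flock pasture shepherd".split(),
    "sheep pasture flock water".split(),
    "king throne kingdom war king".split(),
    "kingdom war throne army".split(),
    "king shepherd flock kingdom".split(),
]
theta, topic_word = lda_gibbs(docs, num_topics=2)
for d, row in enumerate(theta):
    print(d, [round(p, 2) for p in row])
```

Each printed row is one document's estimated topic mixture. A row dominated by one value corresponds to a document aligned with a single topic, while a more even row corresponds to a document with multiple topic assignments. In practice a library implementation would normally replace this hand-written sampler.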
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license allows others to remix, transform, and build upon the material, provided they give appropriate credit, provide a link to the license, indicate if changes were made, and distribute their contributions under the same license as the original.