K-RMS Based Text Summarization Technique
Keywords:
Text summarization, K-means algorithm, K-RMS clustering algorithm
Abstract
Clustering is an unsupervised process of grouping similar data and plays a significant role in machine learning and data science. K-means is one such clustering algorithm, and it has many application areas. One of these is text mining, where many researchers have successfully used K-means for text summarization. However, K-means has some noticeable drawbacks, namely low efficiency and a high number of iterations when dealing with large datasets. To overcome these issues, Garain et al. [1] proposed a modified K-means called the K-RMS clustering algorithm. In the present work, the K-RMS algorithm is applied to text summarization. It is tested on the OpinosisDataset1.0 dataset and compared with the K-means clustering algorithm, and noticeable results are obtained.
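The abstract does not state the K-RMS update rule, so the following is only a minimal sketch under a loud assumption: that K-RMS differs from K-means by replacing the arithmetic mean in the centroid update with a per-coordinate root mean square. The term-frequency vectors, the deterministic initialization, the choice of one representative sentence per cluster, and all function names are illustrative choices, not the authors' method.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tf_vectors(sentences):
    """Build term-frequency vectors over a shared vocabulary (all values >= 0)."""
    vocab = sorted({w for s in sentences for w in tokenize(s)})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for s in sentences:
        v = [0.0] * len(vocab)
        for w, count in Counter(tokenize(s)).items():
            v[index[w]] = float(count)
        vectors.append(v)
    return vectors


def mean_center(points):
    """Standard K-means centroid: arithmetic mean of each coordinate."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def rms_center(points):
    """ASSUMED K-RMS centroid: root mean square of each coordinate."""
    n = len(points)
    return [math.sqrt(sum(x * x for x in col) / n) for col in zip(*points)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster(vectors, k, center_fn, max_iter=100):
    """Lloyd-style loop; center_fn selects the mean or the assumed RMS update."""
    centers = [list(v) for v in vectors[:k]]  # illustrative init: first k sentences
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centers[j])) for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = center_fn(members)
    return labels, centers


def summarize(sentences, k, center_fn=rms_center):
    """Pick the sentence nearest each cluster center, in original order."""
    vectors = tf_vectors(sentences)
    labels, centers = cluster(vectors, k, center_fn)
    chosen = set()
    for j in range(k):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            chosen.add(min(members, key=lambda i: sq_dist(vectors[i], centers[j])))
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    # Hypothetical review sentences, not taken from the Opinosis dataset.
    reviews = [
        "The battery life is great and lasts all day.",
        "Battery life is great.",
        "The screen is too dim outdoors.",
        "Screen brightness is dim in sunlight.",
    ]
    print(summarize(reviews, k=2, center_fn=mean_center))
    print(summarize(reviews, k=2, center_fn=rms_center))
```

Passing `mean_center` or `rms_center` as `center_fn` keeps everything else identical, so the two update rules can be compared on the same input, for example by counting iterations or scoring the selected sentences against reference summaries.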
References
Avishek Garain and Dipankar Das, "K-RMS Algorithm", International Conference on Computational Intelligence and Data Science (ICCIDS 2019).
Wan, X. 2008. Using only cross-document relationships for both generic and topic-focused multi-document summarizations. Information Retrieval.
Hamzah Noori Fejer and Nazlia Omar, "Automatic Multi-Document Arabic Text Summarization Using Clustering and Keyphrase Extraction", Journal of Artificial Intelligence, 2015, ISSN 1994-5450, DOI: 10.3923/jai.2015.
Steinhaus, Hugo (1957). "Sur la division des corps matériels en parties". Bull. Acad. Polon. Sci. (in French). 4 (12): 801–804. MR 0090073. Zbl 0079.16403.
MacQueen, J. B. (1967). Some Methods for classification and Analysis of Multivariate Observations. Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability. Vol. 1. University of California Press. pp. 281–297. MR 0214227. Zbl 0214.46201. Retrieved 2009-04-07.
Lloyd, Stuart P. (1957). "Least square quantization in PCM". Bell Telephone Laboratories Paper. Published in journal much later: Lloyd, Stuart P. (1982). "Least squares quantization in PCM" (PDF). IEEE Transactions on Information Theory. 28 (2): 129–137. CiteSeerX 10.1.1.131.1338. doi:10.1109/TIT.1982.1056489. S2CID 10833328. Retrieved 2009-04-15.
Forgy, Edward W. (1965). "Cluster analysis of multivariate data: efficiency versus interpretability of classifications". Biometrics. 21 (3): 768–769. JSTOR 2528559.
Kanungo, T., D.M. Mount, N.S. Netanyahu, C.D. Piatko, R. Silverman and A.Y. Wu, 2002. An efficient k-means clustering algorithm: Analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell., 24: 881-892.
Jain, A.K. and R.C. Dubes, 1988. Algorithms for Clustering Data. Prentice Hall Inc., Englewood Cliffs, USA., ISBN: 0-13-022278-X, Pages: 320.
Cimiano, P., A. Hotho and S. Staab, 2005. Learning concept hierarchies from text corpora using formal concept analysis. J. Artif. Intell. Res., 24: 305-339.
Hartigan, J.A., 1975. Clustering Algorithms. Books on Demand, New York, USA., ISBN-13: 9780608300498, Pages: 365.
MacQueen, J., 1967. Some methods for classification and analysis of multivariate observations, in: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics, University of California Press, Berkeley, Calif.. pp. 281–297. URL: https://projecteuclid.org/euclid.bsmsp/1200512992.
Vora, P. and B. Oza, 2013. A survey on k-mean clustering and particle swarm optimization. Int. J. Sci. Mod. Eng., 1: 24-26.
Shrabanti Mandal and Anita Pal, "Website Search Technique Using K-Means Algorithm", GESJ (International Journal): Computer Science and Telecommunications (ISSN 1512-1232), Vol. 3, No. 39, pp. 112-117, 2013, USA.
Onajite, E. (Ed.), Seismic Data Analysis Techniques in Hydrocarbon Exploration. Elsevier, Oxford, URL: http://www.sciencedirect.com/science/article/pii/B9780124200234099949, doi:https://doi.org/10.1016/B978-0-12-420023-4.09994-9, 2014.
Purcaru, D., Purcaru, I., Niculescu, E., 2006. Some methods for computing RMS values and phase differences of currents and voltages, in: Proceedings of the 9th WSEAS International Conference on Applied Mathematics (MATH06), Turkey, pp. 587–591.
Shareghi E and Hassanabadi L S, Text summarization with harmony search algorithm-based sentence extraction. In: Proceedings of the 5th International Conference on Soft Computing as Transdisciplinary Science and Technology, ACM, 2008; 226–231.
Parveen D, Mesgar M and Strube M, Generating coherent summaries of scientific articles using coherence patterns. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016; 772–783.
Sankar K and Sobha L, An approach to text summarization. In: Proceedings of the Third International Workshop on Cross Lingual Information Access: Addressing the Information Need of Multilingual Societies, ACL. 2009; 53–60.
Verma P and Om H, MCRMR: Maximum coverage and relevancy with minimal redundancy based multi-document summarization. Expert Systems with Applications. 2019;120: 43–56.
Ansamma J, Premjith P S and Wilscy M, Extractive multi-document summarization using population-based multicriteria optimization. Expert Systems with Applications. 2017; 86: 385–397.
Lin CY, ROUGE: a package for automatic evaluation of summaries. In: Text Summarization Branches Out. 2004.
https://archive.ics.uci.edu/ml/datasets/Opinosis+Opinion+%26frasl%3B+Review.
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. Under this license, users must give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.