An Optimized Integer Representation through a Novel Numeric Encoding for Textual Data Compression
Keywords:
Burrows-Wheeler Transform, Elias Delta Code, Elias Gamma Code, Golomb Code, Numeric Encoding
Abstract
This paper introduces a new variable-length integer encoding technique for file compression and compares its performance with established codes: Elias Gamma, Elias Delta, and Golomb. It also examines how varying the log base affects compression ratio and runtime. The proposed method combines radix conversion with the Burrows-Wheeler Transform. Performance is compared on the Calgary corpus, which includes both text and binary files; the existing codes are run on each file first, followed by the proposed code. The effect of the log base on compression ratio is analyzed graphically, and runtime is measured for each file. At radix r=4, the proposed code achieves compression ratios between 1.67 and 1.87, indicating its effectiveness relative to the existing algorithms. The relationship between log base and compression ratio is non-linear, with the ratio plateauing as the log base increases. Runtime varies among files, from 0.09 seconds for 'obj1' to 6.41 seconds for 'bib1'. Runtime is positively correlated with the number of data points (n), while compression ratio is negatively correlated with n, so files with larger n yield lower ratios. The comparison with established codes provides a benchmark for evaluating the proposed method, and the analysis of compression-ratio trends and runtime offers insight into its effectiveness.
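For readers unfamiliar with the baseline codes named in the abstract, the following is a minimal Python sketch of Elias Gamma, Elias Delta, and Golomb encoding using their standard textbook definitions. It is not the authors' implementation and does not reproduce the proposed encoding. The function names are hypothetical, the codes are returned as bit strings for readability, and the Golomb sketch assumes the common convention of a unary quotient (ones terminated by a zero) followed by a truncated-binary remainder.

```python
def elias_gamma(n: int) -> str:
    """Elias Gamma code of a positive integer, as a bit string."""
    if n < 1:
        raise ValueError("Elias Gamma is defined for n >= 1")
    binary = bin(n)[2:]                      # binary form, leading bit is 1
    return "0" * (len(binary) - 1) + binary  # length prefix in unary zeros


def elias_delta(n: int) -> str:
    """Elias Delta code: Gamma-code the bit length, then the remaining bits."""
    if n < 1:
        raise ValueError("Elias Delta is defined for n >= 1")
    binary = bin(n)[2:]
    return elias_gamma(len(binary)) + binary[1:]  # drop the implicit leading 1


def _fixed_width(value: int, width: int) -> str:
    """Binary form of value padded to width bits (empty when width is 0)."""
    return format(value, "b").zfill(width) if width > 0 else ""


def golomb(n: int, m: int) -> str:
    """Golomb code of a non-negative integer n with parameter m >= 1."""
    if n < 0 or m < 1:
        raise ValueError("Golomb requires n >= 0 and m >= 1")
    q, r = divmod(n, m)
    unary = "1" * q + "0"          # assumed convention: q ones, then a zero
    b = (m - 1).bit_length()       # ceil(log2(m))
    cutoff = (1 << b) - m          # remainders below this use b - 1 bits
    if r < cutoff:
        return unary + _fixed_width(r, b - 1)
    return unary + _fixed_width(r + cutoff, b)


if __name__ == "__main__":
    for value in (1, 2, 9, 10):
        print(value, elias_gamma(value), elias_delta(value), golomb(value, 4))
```

For example, `elias_gamma(9)` yields `0001001` and `elias_delta(10)` yields `00100010`; small integers get short codewords, which is why these codes suit skewed integer distributions.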
Copyright (c) 2024 Kanak Pandit, Harshali Patil, Poonam Joshi, Tarunima Mukherjee
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license requires users to give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.