Text Compression Based on Letter’s Prefix in the Word

Majed AbuSafiya

doi:10.32604/cmc.2020.09282

Open Access

ARTICLE

Majed AbuSafiya^{1, *}

1 Al-Ahliyya Amman University, Amman, 19328, Jordan.

* Corresponding Author: Majed AbuSafiya. Email: email .

Computers, Materials & Continua 2020, 64(1), 17-30. https://doi.org/10.32604/cmc.2020.09282

Received 19 November 2019; Accepted 19 February 2020; Issue published 20 May 2020

Abstract

Huffman encoding [Huffman (1952)] is one of the best-known compression algorithms. In its basic use, each letter in the text to be compressed is given a single encoding. In this paper, a text compression algorithm based on Huffman encoding is proposed. Huffman encoding is used to give different encodings to the same letter depending on the prefix that precedes it in the word. A deterministic finite automaton (DFA) that recognizes the words of the text is constructed. This DFA records the frequencies of the letters that label its transitions. Every state corresponds to one of the prefixes of the words of the text. For every state, a separate Huffman encoding is defined for the letters that label the transitions leaving that state. These Huffman encodings are then used to encode the letters of the words in the text. The algorithm was implemented, and an experimental study showed a significant reduction in compression ratio compared with basic Huffman encoding. However, more time is needed to construct these codes.
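The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: it assumes the text is already split into words, represents each DFA state simply by its prefix string, and adds a hypothetical end-of-word marker `END` so that word boundaries can be recovered. The function names (`huffman_code`, `build_prefix_codes`, `encode`, `decode`) are chosen here for illustration only.

```python
import heapq
from collections import Counter, defaultdict
from itertools import count

END = "\0"  # hypothetical end-of-word marker (an assumption of this sketch)


def huffman_code(freqs):
    """Return {symbol: bitstring} for one frequency table."""
    if len(freqs) == 1:
        # A state with a single outgoing transition needs no bits at all.
        return {next(iter(freqs)): ""}
    tie = count()  # tie-breaker so the heap never compares dicts
    heap = [(f, next(tie), {s: ""}) for s, f in sorted(freqs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


def build_prefix_codes(words):
    """Count next-letter frequencies per prefix (state), then build one
    Huffman code per state."""
    counts = defaultdict(Counter)
    for w in words:
        for i, ch in enumerate(w + END):
            counts[w[:i]][ch] += 1
    return {prefix: huffman_code(c) for prefix, c in counts.items()}


def encode(words, codes):
    """Encode each letter with the code of the state reached so far."""
    return "".join(
        codes[w[:i]][ch] for w in words for i, ch in enumerate(w + END)
    )


def decode(bits, codes, n_words):
    """Invert encode(); n_words is needed because some codes are empty."""
    words, pos = [], 0
    for _ in range(n_words):
        prefix = ""
        while True:
            inverse = {b: s for s, b in codes[prefix].items()}
            chunk = ""
            while chunk not in inverse:
                chunk += bits[pos]
                pos += 1
            if inverse[chunk] == END:
                break
            prefix += inverse[chunk]
        words.append(prefix)
    return words


words = "the then there the this".split()
codes = build_prefix_codes(words)
bits = encode(words, codes)

# Baseline: a single Huffman code over all letters, as in the basic use.
basic = huffman_code(Counter("".join(w + END for w in words)))
basic_bits = "".join(basic[ch] for w in words for ch in w + END)
```

In this sketch the per-prefix bit string is never longer than the single-code baseline, because each state's Huffman code is optimal for the letters seen in that state. The sketch ignores the cost of storing the per-state code tables, which a real compressor would have to account for.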

Keywords

Text compression, Huffman encoding, deterministic finite automata.

Cite This Article

APA Style

AbuSafiya, M. (2020). Text Compression Based on Letter’s Prefix in the Word. Computers, Materials & Continua, 64(1), 17–30. https://doi.org/10.32604/cmc.2020.09282

Vancouver Style

AbuSafiya M. Text Compression Based on Letter’s Prefix in the Word. Comput Mater Contin. 2020;64(1):17–30. https://doi.org/10.32604/cmc.2020.09282

IEEE Style

M. AbuSafiya, “Text Compression Based on Letter’s Prefix in the Word,” Comput. Mater. Contin., vol. 64, no. 1, pp. 17–30, 2020. https://doi.org/10.32604/cmc.2020.09282

Copyright © 2020 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
