Table of Content

Open Access iconOpen Access

ARTICLE

crossmark

Text Compression Based on Letter’s Prefix in the Word

Majed AbuSafiya1, *

1 Al-Ahliyya Amman University, Amman, 19328, Jordan.

* Corresponding Author: Majed AbuSafiya. Email: email.

Computers, Materials & Continua 2020, 64(1), 17-30. https://doi.org/10.32604/cmc.2020.09282

Abstract

Huffman [Huffman (1952)] encoding is one of the most known compression algorithms. In its basic use, only one encoding is given for the same letter in text to compress. In this paper, a text compression algorithm that is based on Huffman encoding is proposed. Huffman encoding is used to give different encodings for the same letter depending on the prefix preceding it in the word. A deterministic finite automaton (DFA) that recognizes the words of the text is constructed. This DFA records the frequencies for letters that label the transitions. Every state will correspond to one of the prefixes of the words of the text. For every state, a different Huffman encoding is defined for the letters that label the transitions leaving that state. These Huffman encodings are then used to encode the letters of the words in the text. This algorithm was implemented and experimental study showed significant reduction in compression ratio over the basic Huffman encoding. However, more time is needed to construct these codes.

Keywords


Cite This Article

M. AbuSafiya, "Text compression based on letter’s prefix in the word," Computers, Materials & Continua, vol. 64, no.1, pp. 17–30, 2020. https://doi.org/10.32604/cmc.2020.09282



cc This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
  • 2666

    View

  • 1355

    Download

  • 0

    Like

Share Link