Researchers C. Oswald and B. Sivaselvan have published a paper exploring a set of text and image compression algorithms that improve compression ratio by anywhere from 18% to 751%.
Oswald and Sivaselvan’s research can be read in the Data Science Journal. Here is an excerpt:
We propose a novel and efficient text compression approach by making the compression of any word level text in a universal manner for corpora across domains which is the Universal Huffman Tree based encoding. The major contribution of the work is in terms of avoidance of code table communication to the decoding phase. Using Universal Huffman Tree we can compress any text without building a new tree every time for each input and this reduces the code table size to a great extent.
The advantages include one can hard code the tree in the decoder software and the need to communicate the tree to the decoder along with the encoded text can thus be avoided. It reduces the space constraints on the tree and one can concentrate on optimizing the encoding length rather than code table size.
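To make the idea in the excerpt concrete, here is a minimal Python sketch of word-level Huffman coding with a fixed, shared tree. It is not the authors' implementation: the `UNIVERSAL_FREQS` table, its counts, and the function names are hypothetical stand-ins for a tree trained on a large cross-domain corpus. The point it illustrates is that the encoder and decoder both derive the same codes from a table shipped with the software, so only the encoded bits need to be transmitted.

```python
import heapq
from itertools import count

# Hypothetical word frequencies (assumption): a real system would derive
# these once from a large cross-domain corpus and ship them with the software.
UNIVERSAL_FREQS = {"the": 50, "of": 30, "and": 25, "data": 10, "compression": 5}


def build_codes(freqs):
    """Build a Huffman code table deterministically from fixed frequencies."""
    tiebreak = count()  # unique counter so heap entries never compare nodes
    heap = [(f, next(tiebreak), w) for w, f in sorted(freqs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Internal nodes are (left, right) tuples; leaves are plain words.
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    root = heap[0][2]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix

    walk(root, "")
    return codes


# Both sides compute the same table from the shared frequencies,
# so no code table travels with the message.
CODES = build_codes(UNIVERSAL_FREQS)
DECODE = {bits: word for word, bits in CODES.items()}


def encode(text):
    """Encode whitespace-separated words; unknown words raise KeyError."""
    return "".join(CODES[word] for word in text.split())


def decode(bits):
    """Decode a bit string using the prefix-free property of Huffman codes."""
    words, current = [], ""
    for bit in bits:
        current += bit
        if current in DECODE:
            words.append(DECODE[current])
            current = ""
    return " ".join(words)


message = "the compression of data"
assert decode(encode(message)) == message
```

This sketch does not handle words missing from the fixed table; a practical scheme would need some fallback for them, which is outside what the excerpt describes.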