Data Structures Huffman's Algorithm

Huffman's algorithm generates a variable-length encoding a given alphabet for the purpose of data compression

Huffman's algorithm generates a variable-length encoding a given alphabet for the purpose of _________ _________

Widely used today as part of various ___________ utilities

Widely used today as part of various compression utilities

The number of bits per character determined by the char's relative frequency of occurnce

...

Most frequently occurring characters should use the fewest bits

ASCII image

A bad attempt at approaching compression

We can use a code tree, which are binary trees in which the leaves contain the characters to be coded

...

In a code tree interior nodes are just place-holders

The root of every subtree is annotated with the cumulative frequency of all its descendent leaves

Character codes are generated by root to leaf traversals

We would like to use the code tree with the minimum expected code length. This means the tree should be relatively flat.

...

The expected code length is a weight average of all possible character code lengths

...

Huffman's algorithm generates a code tree with an expected code length that is at least as small as any other code tree that could be generated

Huffman's algorithm generates a code tree with an expected code length that is at least as small as any other code tree that could be generated

Huffman's algorithm generates a variable length code with the prefix property such that there is no other encoding with a smaller expected code length

We create a single node code tree for each character and insert each of these trees in a priority queue (min heap)

Example image 1

Example image 2

Example to try