Computer Science GCES AQA
-
Representing Algorithms Aqa4 主题
-
Efficiency Of Algorithms Aqa1 主题
-
Searching Algorithms Aqa3 主题
-
Sorting Algorithms Aqa3 主题
-
Data Types Aqa1 主题
-
Programming Concepts Aqa5 主题
-
Arithmetic Relational And Boolean Operations Aqa1 主题
-
Data Structures Aqa3 主题
-
String Manipulation Aqa1 主题
-
Random Number Generation Aqa1 主题
-
Structured Programming Aqa2 主题
-
Robust And Secure Programming Aqa4 主题
-
Number Bases Aqa2 主题
-
Converting Between Number Bases Aqa3 主题
-
Units Of Information Aqa9 主题
-
Hardware And Software Aqa4 主题
-
Boolean Logic Aqa3 主题
-
Programming Languages And Translators Aqa2 主题
-
Cpu Architecture Performance And Embedded Systems Aqa4 主题
-
Memory Aqa2 主题
-
Secondary Storage Aqa3 主题
-
Fundamentals Of Computer Networks Aqa8 主题
-
Fundamentals Of Cyber Security Aqa1 主题
-
Methods Of Preventing Cyber Security Threats Aqa1 主题
-
Relational Databases Aqa2 主题
-
Ethical Legal And Environmental Impacts Aqa2 主题
Compression Huffman Coding Aqa
Exam code:8525
Huffman Coding
What is Huffman coding?
-
Huffman coding is a method of lossless compression primarily used on text based data (documents)
-
A Huffman coding tree is used to compress the data whilst keeping all the data so that it can be uncompressed back to its original state
Example (step 1)
-
A text file contains the following characters
|
Text file |
|||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
I |
|
L |
O |
V |
E |
|
C |
O |
M |
P |
U |
T |
E |
R |
S |
-
The text file contain 16 characters including spaces
-
A frequency analysis shows there is some repetition in the characters, for example there are 2 x E’s
|
Frequency |
||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
I |
space |
L |
O |
V |
E |
C |
M |
P |
U |
T |
R |
S |
|
1 |
2 |
1 |
2 |
1 |
2 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
Huffman Trees
What is a Huffman tree?
-
A Huffman tree is a type of binary tree used in Huffman coding to achieve lossless compression of text data
-
Each character is represented as a leaf in the tree
-
Internal nodes always have two children, while leaves have none
-
The tree is built based on the frequency of characters in the text, so that more frequent characters are given shorter codes
-
Binary trees are covered in much more detail at A-Level
Example (step 2)
-
List the frequency of each character and order them from least to most frequent:

-
Combine the two least frequent characters (I + L = 2) into a new node, then reorder:

-
Update the frequency data to show the combined characters

-
Repeat until all characters are combined into a single tree

-
Assigning bit patterns
-
Follow each branch of the tree, writing 0 on left branches and 1 on right branches
-
Each character ends up with a unique binary code
-

-
More frequent characters get shorter codes
|
Character |
Bit pattern |
Times used |
Total bits |
|---|---|---|---|
|
space |
|
2 |
3*2 = 6 |
|
O |
|
2 |
3*2 = 6 |
|
E |
|
2 |
3*2 = 6 |
|
I |
|
1 |
4*1 = 4 |
|
L |
|
1 |
4*1 = 4 |
|
V |
|
1 |
4*1 = 4 |
|
C |
|
1 |
<p |
Responses