update
@@ -17,7 +17,8 @@
\usepackage{tikz}
\usepackage{pgfplots}
\usetikzlibrary{positioning}
%\usegdlibrary{trees}
\usetikzlibrary{trees}
%\usetikzlibrary{graphs, graphdrawing}
%%% math
\usepackage{amsmath}
%%% citations
@@ -41,6 +42,7 @@ In coding theory, the events of an information source are to be encoded in a man
the information provided by the source.
The process of encoding can thus be described by a function $C$ mapping a source alphabet $X$ to a code alphabet $Y$.
Symbols in the alphabets are denoted $x_i$ and $y_j$ respectively; each source symbol $x_i$ occurs with probability $p_i$.
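For instance, a source $X = \{x_1, x_2, x_3\}$ could be encoded over the binary alphabet $\{0, 1\}$ as $C(x_1) = 0$, $C(x_2) = 10$, $C(x_3) = 11$.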
% TODO fix use of alphabet / symbol / code word: alphabet is usually binary -> code word is 010101
\begin{equation}
C: X \rightarrow Y \qquad X=\{x_1, x_2, \dots, x_n\} \qquad Y=\{y_1, y_2, \dots, y_m\}
\label{eq:formal-code}
@@ -96,24 +98,33 @@ In the case of the capital code in fact every word other than the longest possib
lower in the table. As a result, the receiver cannot decode each word instantaneously, but has to wait for the leading 0
of the next code word.

Further, a code is said to be \textit{efficient} if it has the smallest possible average word length, i.e. its average word length matches the entropy of the source.
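For example, a source with probabilities $(\frac{1}{2}, \frac{1}{4}, \frac{1}{4})$ has entropy $H = \frac{1}{2} \cdot 1 + \frac{1}{4} \cdot 2 + \frac{1}{4} \cdot 2 = 1.5$ bits per symbol; the prefix code $\{0, 10, 11\}$ achieves exactly this average word length and is therefore efficient.
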
\section{Kraft-McMillan inequality}
The Kraft-McMillan inequality gives a necessary and sufficient condition for the existence of a prefix code.
In the form shown in \autoref{eq:kraft-mcmillan} it is intuitive to understand given a code tree.
Because prefix codes require code words to be situated only on the leaves of a code tree,
every code word $i$ over an alphabet of size $r$ occupies exactly a fraction $r^{-l_i}$ of the available code space.
The sum over all code words can thus never be larger than one, or else
the code is not uniquely decodable \cite{enwiki:kraft-mcmillan}.
\begin{equation}
\sum_{i=1}^{n} r^{-l_i} \leq 1
\label{eq:kraft-mcmillan}
\end{equation}

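As a quick sanity check, the inequality can be evaluated directly for a list of proposed word lengths. The following Python helper is a minimal sketch (\texttt{satisfies\_kraft} is a hypothetical name, not from the original text):

\begin{verbatim}
from fractions import Fraction

def satisfies_kraft(lengths, r=2):
    """Kraft-McMillan inequality for code word lengths
    over an alphabet of size r (exact rational arithmetic)."""
    return sum(Fraction(1, r ** l) for l in lengths) <= 1

# lengths (1, 2, 2), e.g. {0, 10, 11}: 1/2 + 1/4 + 1/4 = 1
assert satisfies_kraft([1, 2, 2])
# lengths (1, 1, 2): 1/2 + 1/2 + 1/4 > 1, no prefix code exists
assert not satisfies_kraft([1, 1, 2])
\end{verbatim}
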
\section{Shannon-Fano}
Shannon-Fano coding is one of the earliest methods for constructing prefix codes.
It is a top-down method: the symbols, sorted by probability, are recursively partitioned into two groups of roughly equal total probability, so that more frequent symbols receive shorter code words.
While intuitive, Shannon-Fano coding does not always achieve optimal compression,
which paved the way for more advanced techniques like Huffman coding.

\begin{algorithm}
\begin{algorithmic}[1]
\Procedure{ShannonFano}{$S$} \Comment{$S$: symbols sorted by descending probability}
    \If{$|S| \leq 1$}
        \State \Return
    \EndIf
    \State split $S$ into $S_1$ and $S_2$ with total probabilities as equal as possible
    \State append $0$ to the code word of every symbol in $S_1$
    \State append $1$ to the code word of every symbol in $S_2$
    \State \Call{ShannonFano}{$S_1$}
    \State \Call{ShannonFano}{$S_2$}
\EndProcedure
\end{algorithmic}
\caption{Shannon-Fano compression algorithm}
\label{alg:shannon-fano}
\end{algorithm}

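The recursion of \autoref{alg:shannon-fano} can be sketched in Python as follows (an illustrative implementation with hypothetical names, taking (symbol, probability) pairs):

\begin{verbatim}
def shannon_fano(symbols):
    """Recursively split symbols (sorted by probability) into two
    halves of roughly equal total probability, prefixing 0 / 1."""
    if len(symbols) <= 1:
        return {sym: "0" for sym, _ in symbols}
    symbols = sorted(symbols, key=lambda s: s[1], reverse=True)
    total, acc, split = sum(p for _, p in symbols), 0.0, 0
    for _, p in symbols[:-1]:          # keep both halves non-empty
        if acc + p > total / 2:
            break
        acc += p
        split += 1
    split = max(split, 1)
    codes = {}
    for bit, half in (("0", symbols[:split]), ("1", symbols[split:])):
        sub = shannon_fano(half) if len(half) > 1 else {half[0][0]: ""}
        for sym, code in sub.items():
            codes[sym] = bit + code
    return codes

print(shannon_fano([("a", 0.5), ("b", 0.25), ("c", 0.25)]))
# {'a': '0', 'b': '10', 'c': '11'}
\end{verbatim}
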
\section{Huffman Coding}
@@ -136,6 +147,7 @@ The Lempel-Ziv-Welch (LZW) algorithm is a dictionary-based compression method th
of recurring patterns in the data.
Unlike entropy-based methods, LZW does not require prior knowledge of symbol probabilities,
making it highly adaptable and efficient for a wide range of applications, including image and text compression.
Because the decoder can rebuild the dictionary symmetrically from the received code stream, it never has to be transmitted explicitly, which also makes LZW useful for streaming data.
\cite{dewiki:lzw}
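The following Python sketch of the LZW encoder illustrates this (\texttt{lzw\_encode} is a hypothetical name; practical implementations additionally fix code widths and dictionary limits):

\begin{verbatim}
def lzw_encode(data: str) -> list[int]:
    """Emit dictionary indices for the longest already-seen
    prefixes, growing the dictionary as encoding proceeds."""
    dictionary = {chr(i): i for i in range(256)}  # all single bytes
    w, out = "", []
    for c in data:
        if w + c in dictionary:
            w += c                           # extend the current match
        else:
            out.append(dictionary[w])        # emit longest known prefix
            dictionary[w + c] = len(dictionary)  # learn a new pattern
            w = c
    if w:
        out.append(dictionary[w])
    return out

print(lzw_encode("abababab"))  # [97, 98, 256, 258, 98]
\end{verbatim}

The decoder sees the same sequence of indices and can therefore grow an identical dictionary step by step, which is why the dictionary never needs to be transmitted.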