This commit is contained in:
eneller
2025-11-26 18:25:10 +01:00
parent 31a3e956ab
commit 520d3f8bc6
2 changed files with 111 additions and 14 deletions


@@ -6,6 +6,8 @@
\usepackage{subcaption}
\usepackage{parskip} % don't indent after paragraphs, figures
\usepackage{xcolor}
\usepackage{algorithm}
\usepackage{algpseudocodex}
%\usepackage{csquotes} % Recommended for biblatex
\usepackage{tikz}
\usepackage{pgfplots}
@@ -40,22 +42,58 @@ The concept of entropy is closely related to the design of efficient codes.
\end{equation}
Interpreting entropy as the expected information $E(I)$ of a message provides the intuition that,
for a source with a given entropy (in bits), no code can achieve an average word length (in bits)
\begin{equation}
E(l) = \sum_i p_i l_i
\end{equation}
below this entropy without losing information, where $l_i$ denotes the length of the codeword assigned to symbol $i$.
This is the content of Shannon's source coding theorem,
introduced in \citeyear{shannon1948mathematical} \cite{enwiki:shannon-source-coding}.
In his paper, \citeauthor{shannon1948mathematical} proposed two principal ideas to minimize the average length of a code.
The first is to assign short codewords to symbols with higher probability,
an intuitive approach since more frequent symbols contribute more to the average code length.
% https://en.wikipedia.org/wiki/Shannon%27s_source_coding_theorem
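As a brief illustration (the numbers are chosen here for the example and are not taken from the cited sources),
consider a source emitting four symbols $a, b, c, d$ with probabilities $\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{8}$.
Its entropy is
\begin{equation}
H = \frac{1}{2} \cdot 1 + \frac{1}{4} \cdot 2 + \frac{1}{8} \cdot 3 + \frac{1}{8} \cdot 3 = 1.75 \;\mathrm{bits},
\end{equation}
and the prefix code $a \mapsto 0$, $b \mapsto 10$, $c \mapsto 110$, $d \mapsto 111$ reaches exactly this average word length,
so the bound of the source coding theorem is met with equality.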
\section{Kraft-McMillan inequality}
% https://de.wikipedia.org/wiki/Kraft-Ungleichung
% https://en.wikipedia.org/wiki/Kraft%E2%80%93McMillan_inequality
\section{Shannon-Fano}
% https://de.wikipedia.org/wiki/Shannon-Fano-Kodierung
Shannon-Fano coding is one of the earliest methods for constructing prefix codes.
It sorts the symbols by probability and recursively splits them into two groups of roughly equal total probability,
extending the codewords by one bit per split so that more frequent symbols receive shorter codewords
(see Algorithm~\ref{alg:shannon-fano}).
While intuitive, Shannon-Fano coding does not always achieve optimal compression,
which paved the way for more advanced techniques such as Huffman coding.
\begin{algorithm}
\begin{algorithmic}[1]
\Procedure{ShannonFano}{$S$} \Comment{symbols in $S$ sorted by decreasing probability}
\If{$|S| \leq 1$} \State \textbf{return} \EndIf
\State split $S$ into $S_0$ and $S_1$ with total probabilities as equal as possible
\State append $0$ to the codeword of every symbol in $S_0$ and $1$ to every symbol in $S_1$
\State \Call{ShannonFano}{$S_0$}
\State \Call{ShannonFano}{$S_1$}
\EndProcedure
\end{algorithmic}
\caption{Shannon-Fano compression algorithm}
\label{alg:shannon-fano}
\end{algorithm}
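To make the splitting step concrete, consider a small example with probabilities chosen here purely for illustration:
four symbols $a, b, c, d$ with probabilities $0.4, 0.25, 0.2, 0.15$.
The most balanced first split separates $\{a\}$ (total probability $0.4$) from $\{b, c, d\}$ (total probability $0.6$);
repeating the procedure on the second group yields the codewords $a \mapsto 0$, $b \mapsto 10$, $c \mapsto 110$, $d \mapsto 111$,
with an average word length of $1.95$ bits against an entropy of roughly $1.90$ bits.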
\section{Huffman Coding}
% https://de.wikipedia.org/wiki/Huffman-Kodierung
Huffman coding is an optimal prefix coding algorithm that minimizes the expected codeword length
for a given set of symbol probabilities.
By repeatedly merging the two least probable subtrees into a binary tree, it assigns the shortest codewords to the most frequent symbols;
for a discrete memoryless source its expected codeword length lies within one bit of the entropy and reaches it exactly when all symbol probabilities are of the form $2^{-k}$.
Its efficiency and simplicity have made it a cornerstone of lossless data compression.
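To make the tree construction concrete, the following Python sketch (our own illustration; the function name and the toy input are chosen freely and are not taken from the cited sources) repeatedly merges the two least probable subtrees with a priority queue and then reads the codewords off the finished tree.
\begin{verbatim}
import heapq
from itertools import count

def huffman_codes(probabilities):
    # Build a Huffman code for a dict {symbol: probability};
    # returns a dict {symbol: bitstring}.
    tie = count()  # unique tie-breaker so equal probabilities never compare subtrees
    heap = [(p, next(tie), symbol) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # merge the two least probable subtrees into a new internal node
        p1, _, left = heapq.heappop(heap)
        p2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tie), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: descend left (0) and right (1)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: store the finished codeword
            codes[node] = prefix or "0"  # a single-symbol source still gets one bit
    walk(heap[0][2], "")
    return codes

print(huffman_codes({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))
# {'a': '0', 'b': '10', 'c': '110', 'd': '111'} (up to swapping 0 and 1)
\end{verbatim}
For the distribution $\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{8}$ used in the entropy example above, the sketch reproduces a code with word lengths $1, 2, 3, 3$.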
\section{Arithmetic Coding}
% https://en.wikipedia.org/wiki/Arithmetic_coding
Arithmetic coding is a modern compression technique that encodes an entire message as a single interval
within the range $[0, 1)$.
By iteratively refining this interval based on the probabilities of the symbols in the message,
arithmetic coding can achieve compression rates that approach the entropy of the source.
Because it effectively spends a non-integer number of bits per symbol, it is particularly powerful
for applications requiring high compression efficiency.
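As a minimal illustration with probabilities chosen here for the example, consider encoding the two-symbol message $ab$ with $P(a) = 0.6$ and $P(b) = 0.4$.
Starting from $[0, 1)$, the first symbol $a$ narrows the interval to $[0, 0.6)$;
the second symbol $b$ then selects the upper $0.4$-fraction of that interval, $[0.36, 0.6)$.
Any number from $[0.36, 0.6)$, together with the message length, identifies the message $ab$,
and the size of the final interval, $0.6 \cdot 0.4 = 0.24$, corresponds to roughly $-\log_2 0.24 \approx 2.06$ bits.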
\section{LZW Algorithm}
% https://de.wikipedia.org/wiki/Lempel-Ziv-Welch-Algorithmus
The Lempel-Ziv-Welch (LZW) algorithm is a dictionary-based compression method that dynamically builds a dictionary
of recurring patterns in the data.
Unlike entropy-based methods, LZW does not require prior knowledge of symbol probabilities,
making it highly adaptable and efficient for a wide range of applications, including image and text compression.
\cite{dewiki:lzw}
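The encoding step can be sketched in a few lines of Python (our own illustration; the function name and the toy input are chosen freely and do not come from the cited sources): the dictionary starts with the single characters, and the longest already-known prefix of the remaining input is repeatedly emitted as an index while that prefix extended by the next character becomes a new dictionary entry.
\begin{verbatim}
def lzw_encode(text):
    # Encode a string into a list of dictionary indices (LZW).
    # Sketch only: the initial dictionary is built from the input itself.
    dictionary = {ch: i for i, ch in enumerate(sorted(set(text)))}
    w = ""                                  # longest known prefix matched so far
    output = []
    for ch in text:
        if w + ch in dictionary:            # keep extending the current match
            w += ch
        else:
            output.append(dictionary[w])          # emit the longest known prefix
            dictionary[w + ch] = len(dictionary)  # learn the new pattern
            w = ch
    if w:
        output.append(dictionary[w])        # flush the final match
    return output

print(lzw_encode("ABABABA"))
# [0, 1, 2, 4] with initial dictionary {'A': 0, 'B': 1}
\end{verbatim}
A real LZW codec would initialise the dictionary with a fixed alphabet (for example all 256 byte values) rather than with the characters of the input, so that the decoder can rebuild the same dictionary without extra information; the sketch above only illustrates how recurring patterns are learned.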
\printbibliography
\end{document}