@@ -77,4 +77,11 @@
  year = "2018",
  url = "https://de.wikipedia.org/w/index.php?title=Kraft-Ungleichung&oldid=172862410",
  note = "[Online; accessed 26 November 2025]"
}

@misc{ dewiki:partition,
  author = "Wikipedia",
  title = "Partitionsproblem --- Wikipedia{,} die freie Enzyklopädie",
  year = "2025",
  url = "https://de.wikipedia.org/w/index.php?title=Partitionsproblem&oldid=255787013",
  note = "[Online; accessed 26 November 2025]"
}
@@ -36,31 +36,41 @@ As the volume of data grows exponentially around the world, compression is only
Not only does it enable the storage of large amounts of information needed for research in scientific domains
like DNA sequencing and analysis, it also plays a vital role in keeping stored data accessible by
facilitating cataloging, search and retrieval.

The concept of entropy introduced in the previous entry is closely related to the design of efficient codes for compression.
\begin{figure}[H]
\begin{minipage}{0.5\textwidth}
\begin{equation}
H = E(I) = - \sum_i p_i \log_2(p_i)
\label{eq:entropy-information}
\end{equation}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\begin{equation}
E(L) = \sum_i p_i l_i
\label{eq:expected-codelength}
\end{equation}
\end{minipage}
\end{figure}
In coding theory, the events of an information source are to be encoded in a manner that minimizes the number of bits
needed to store the information provided by the source.

The understanding of entropy as the expected information $E(I)$ of a message provides the intuition that,
given a source with a given entropy (in bits), no code can have an average word length $E(L)$ (in bits,
\autoref{eq:expected-codelength}) lower than this entropy without losing information.
This is the content of Shannon's source coding theorem,
introduced in \citeyear{shannon1948mathematical} \cite{enwiki:shannon-source-coding}.
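
As a concrete illustration of this bound, consider a source with three symbols of probabilities
$p = (\frac{1}{2}, \frac{1}{4}, \frac{1}{4})$.
Its entropy is
\[
H = \frac{1}{2} \log_2 2 + \frac{1}{4} \log_2 4 + \frac{1}{4} \log_2 4 = 1.5 \ \mathrm{bits},
\]
and the prefix code $\{0, 10, 11\}$ meets this bound exactly with an average word length of
$E(L) = \frac{1}{2} \cdot 1 + \frac{1}{4} \cdot 2 + \frac{1}{4} \cdot 2 = 1.5$ bits;
no lossless code for this source can do better.
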
In his paper, \citeauthor{shannon1948mathematical} proposed two principal ideas to minimize the average length of a code.
The first is to use short codes for symbols with higher probability.
This is an intuitive approach, as more frequent symbols have a greater impact on the average code length.

The second idea is to encode events that frequently occur together as a single block, allowing for greater flexibility
in code design.
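
To see the benefit of joint encoding, consider a binary source emitting $A$ with probability $0.9$ and $B$ with
probability $0.1$, whose entropy is only $H \approx 0.469$ bits per symbol.
Encoding each symbol individually costs at least one bit, but encoding pairs with the prefix code
$AA \mapsto 0$, $AB \mapsto 10$, $BA \mapsto 110$, $BB \mapsto 111$ yields
\[
E(L) = 0.81 \cdot 1 + 0.09 \cdot 2 + 0.09 \cdot 3 + 0.01 \cdot 3 = 1.29 \ \mathrm{bits\ per\ pair},
\]
or $0.645$ bits per source symbol, already much closer to the entropy.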
\section{Kraft-McMillan inequality}
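In short, the result this section builds on is the following: a binary prefix code with codeword lengths
$l_1, \dots, l_n$ exists if and only if
\[
\sum_i 2^{-l_i} \le 1,
\]
and by McMillan's extension the same condition holds for every uniquely decodable code.
Restricting attention to prefix codes therefore sacrifices nothing in terms of achievable word lengths.
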
\section{Shannon-Fano}
Shannon-Fano coding is one of the earliest methods for constructing prefix codes.
It recursively partitions the symbols, sorted by probability, into two groups of approximately equal total probability,
assigning shorter codewords to more frequent symbols.
While intuitive, Shannon-Fano coding does not always achieve optimal compression,
paving the way for more advanced techniques like Huffman coding.
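
A minimal Python sketch of this recursive partitioning follows; the function name and input format
(a mapping from symbols to probabilities) are illustrative, not taken from any particular library.
\begin{verbatim}
def shannon_fano(probs):
    """Build a Shannon-Fano code for a dict {symbol: probability}.

    Returns a dict {symbol: codeword}, codewords as '0'/'1' strings.
    """
    # Sort symbols by descending probability.
    items = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    codes = {sym: "" for sym, _ in items}

    def partition(group):
        if len(group) < 2:
            return
        total = sum(p for _, p in group)
        running, cut, best = 0.0, 1, total
        # Pick the split that best balances total probability.
        for i in range(1, len(group)):
            running += group[i - 1][1]
            imbalance = abs(total - 2 * running)
            if imbalance < best:
                best, cut = imbalance, i
        left, right = group[:cut], group[cut:]
        for sym, _ in left:    # upper half gets a 0 ...
            codes[sym] += "0"
        for sym, _ in right:   # ... lower half gets a 1
            codes[sym] += "1"
        partition(left)
        partition(right)

    partition(items)
    return codes

# Example: dyadic probabilities give the optimal code {a: 0, b: 10, c: 11}.
print(shannon_fano({"a": 0.5, "b": 0.25, "c": 0.25}))
\end{verbatim}
For the dyadic distribution above the result matches the entropy bound; for skewed distributions the
balancing step can produce codes slightly longer than those of Huffman coding.
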
@@ -348,7 +348,7 @@ The capacity of the binary symmetric channel is given by:
where $H_2(p) = -p \log_2(p) - (1-p)\log_2(1-p)$ is the binary entropy function.
As $p$ increases, uncertainty grows and channel capacity declines.
When $p = 0.5$, output bits are completely random and no information can be transmitted ($C = 0$).
As illustrated in \autoref{fig:graph-entropy}, a channel with error rate $p > 0.5$ is equivalent to one with
error rate $1-p < 0.5$, since the receiver can simply invert every received bit; such channels are therefore
not relevant in practice.
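
As a numerical example, at an error rate of $p = 0.1$ the binary entropy is $H_2(0.1) \approx 0.469$,
so the channel capacity is $C = 1 - H_2(0.1) \approx 0.531$ bits per channel use:
even this noticeably noisy channel can still carry roughly half a bit of information per transmitted bit.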

Shannon’s theorem is not constructive, as it does not provide an explicit method for constructing such efficient codes.