History of Data compression/entropy

HomePage | Recent Changes | Preferences

Revision 4 . . December 2, 2001 7:51 pm by Boaz
Revision 3 . . July 10, 2001 2:56 am by TedDunning
  

Difference (from prior major revision) (no other diffs)

Changed: 3c3
Conceptually, entropy is the actual amount of (information-theoretic) information in a piece of data. This is not necessarily the same as the number of bits in the data, for two main reasons.
Conceptually, entropy is the actual amount of (information-theoretic) information in a piece of data. Uniformly random 8-bit character data has an entropy of 8 bits per character, since you never know what the next character will be. A long string of A's has an entropy of 0, since you know that the next character will always be an 'A'. The entropy of English text is about 1.5 bits per character (try compressing it with PPM!). The entropy of a data source is the average number of bits per character needed to encode it.
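One rough way to check figures like these is to compress sample data and count output bits per input character. The sketch below uses Python's zlib (DEFLATE) rather than PPM, so it only gives an upper bound on the entropy, not the entropy itself. The helper name compressed_bits_per_char and the sample sizes are made up for illustration.

```python
import random
import zlib


def compressed_bits_per_char(data: bytes) -> float:
    """Return the zlib-compressed size in bits divided by the input length in bytes."""
    if not data:
        raise ValueError("data must not be empty")
    return 8 * len(zlib.compress(data, 9)) / len(data)


# A long run of a single character: almost no information per character.
run_of_a = b"A" * 100_000

# Uniformly random bytes (fixed seed so the run is repeatable): incompressible.
rng = random.Random(0)
random_bytes = bytes(rng.getrandbits(8) for _ in range(100_000))

print(compressed_bits_per_char(run_of_a))      # close to 0
print(compressed_bits_per_char(random_bytes))  # about 8, slightly more due to format overhead
```

A general-purpose compressor such as zlib will usually land well above 1.5 bits per character on English text; stronger context-modelling compressors such as PPM get closer to that figure.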

Changed: 9,10c9
A simple operational definition of entropy would be the size of the data if it was compressed using an ideal form of
Data compression.
Entropy is effectively a lower bound on the average compressed size per character. This is useful for determining whether one compression algorithm has any significant advantage over another for a data source. The definition of entropy is based on the Markov model of text. For an order-0 source (each character is selected independently of the preceding characters), the entropy is:


Changed: 12c11,17
Another definition would be to take the logarithm of the number of different states that an object MIGHT be in. (This is only right if all the states are equally likely. If some are more likely than others the definition is messier.)
H(S) = -\sum_i p_i \log_2 p_i


Where p_i is the probability of character i. For a higher-order Markov source (one in which probabilities depend on the preceding characters), the entropy is:


H(S) = -\sum_i P_i \sum_j p_i(j) \log_2 p_i(j)


Where i is a state (a particular sequence of preceding characters), P_i is the probability of state i, and p_i(j) is the probability of character j given i as the preceding character(s).
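Both definitions can be tried out on sample text. The following is a minimal sketch, assuming probabilities are estimated from raw counts in the sample itself and that both sums carry the conventional leading minus sign so the result is non-negative. The function names order0_entropy and markov_entropy are hypothetical, chosen for this example.

```python
import math
from collections import Counter, defaultdict


def order0_entropy(text: str) -> float:
    """Order-0 entropy in bits per character: -sum_i p_i log2 p_i."""
    counts = Counter(text)
    total = len(text)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def markov_entropy(text: str, order: int = 1) -> float:
    """Order-k entropy in bits per character: -sum_i P_i sum_j p_i(j) log2 p_i(j).

    State i is the `order` characters preceding a position, P_i is the
    empirical probability of that state, and p_i(j) is the empirical
    probability that character j follows state i.
    """
    followers = defaultdict(Counter)
    for k in range(order, len(text)):
        followers[text[k - order:k]][text[k]] += 1

    total = sum(sum(c.values()) for c in followers.values())
    entropy = 0.0
    for counts in followers.values():
        state_total = sum(counts.values())
        state_prob = state_total / total  # P_i
        for n in counts.values():
            p = n / state_total  # p_i(j)
            entropy -= state_prob * p * math.log2(p)
    return entropy
```

For the string "ABABABAB", order0_entropy gives 1 bit per character because A and B are equally frequent, while markov_entropy with order 1 gives 0 because each character is fully determined by the one before it. Estimates like these are only meaningful when the sample is much longer than the number of distinct states.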

Added: 18a24,26

Fix? Well I tried... I think I just mangled it.

