Instead of hijacking another topic, I'll start this topic.
David BL suggested that a way to quantify information is the number of bits
needed to encode, allowing for compression. I said I preferred entropy as
the measure of information, and suggested that the two measures might in
some way be equivalent. Someone else recalled the concept of entropy from
the study of statistical mechanics and thermodynamics in physics.
The use of the word "entropy" definitiely comes from physics. The formulas
that Shannon (qv) determined for the amount of information in a message bear
an uncanny resemblance to the formulas for entropy in physics. I personally
don't believe that this is mere coincidence.
Next, I talked a little about entropy being context sensitive. There is a
much more precise way than mine of stating this, but I've forgotten what it
is. I think it goes something like this: the self information in a
message is context insensitive but the mutual information between a pair of
messages is dependent on both of them. My use of the word "database" in
place of message in my earlier comment is a little sloppy in this regard.
Next, the use of logarithms, and the number of bits needed to store
(represent?) a message are closely related. If we have 8 equally likely
messages, it will take precisely 3 bits to store one message. Note that
log base 2 of 8 is 3. If you use the formulas for entropy, and use 2 as
the base for the logarithms, the unit of information comes out to be
precisely the bit.
There is a form of compression called Huffman encoding on an ensemble of
messages that takes advantage of mathematical expectation to compress, on
the average, an ensemble of messages when the eight possible messages are
not a priori equally likely. If for example, message number 1 is 50%
likely, you and use a message consisting of 1 bit to represent it, and use
somewhat longer messages for the other 7 possible messages. I don't want to
belabor the point in cdt, but there are ample references for Huffamn
encoding on the web. If you study Huffman coding in detail, you'll start
to get logarithms into the picture.
All of this goes back to the 1960s, and some of it to the 1940s. Is entropy
still widely used in information science? Is it relevant to database
theory?
Tegiri Nenashi - 11 Jan 2008 19:40 GMT
> Is entropy
> still widely used in information science? Is it relevant to database
> theory?
http://www.sigmod.org/pods/proc00/papers/dalkilic.pdf
Joe Thurbon - 12 Jan 2008 00:38 GMT
> Instead of hijacking another topic, I'll start this topic.
>
> David BL suggested that a way to quantify information is the number of bits
> needed to encode, allowing for compression.
I think the term being looked for is
http://en.wikipedia.org/wiki/Kolmogorov_complexity
There's been a _lot_ of work done in this area.
> I said I preferred entropy as
> the measure of information, and suggested that the two measures might in
> some way be equivalent. Someone else recalled the concept of entropy from
> the study of statistical mechanics and thermodynamics in physics.
It also gets a good work out in machine learning. Many of the classical
algorithms (decision trees in their various forms) have
information-gain (which is defined in terms of entropy) at their heart.
(The precise definition is closely related to the number of bits
required for representation)
I vaguely recall that some of the more theoretical machine learning
results (like learnability and optimality results) rely on a notion entropy.
[... snip interesting perspective ...]
> All of this goes back to the 1960s, and some of it to the 1940s. Is entropy
> still widely used in information science?
Yes.
> Is it relevant to database theory?
Good question.
Cheers,
Joe
David Cressey - 12 Jan 2008 09:17 GMT
[snip good stuff]
> I vaguely recall that some of the more theoretical machine learning
> results (like learnability and optimality results) rely on a notion entropy.
I once wrote a little program to play Mastermind (a code guessing game)
using entropy to measure a move's worth. Mastermind can be played using a
brute force tree search. But my program tried out only about 10 moves, and
picked the one with the best entropy. Its play was only slightly inferior
to the play of a brute force tree searcher.