Question
In a note I found this phrase:
Using isolated symbol probabilities of English language, you can find out the entropy of the language.
What is actually meant by "isolated symbol probabilities"? This is related to the entropy of an information source.
Answer 1:
It would be helpful to know where the note came from and what the context is, but even without that, I am fairly sure this simply means that they use the frequency of individual symbols (e.g. characters) as the basis for entropy, rather than, for example, the joint probability of character sequences, or the conditional probability of one character following another.
So if you have an alphabet X = {a, b, c, ..., z} and a probability P(a), P(b), ... for each character to appear in text (e.g. based on the frequencies found in a sample of data), you'd compute the entropy by evaluating -P(x) * log(P(x)) for each character x individually and then summing over all of them. In doing so you'd have used the probability of each character in isolation, rather than the probability of each character in context.
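As a minimal sketch of that computation (not from the original note; the function name and sample text are my own), here is the classical entropy formula applied to isolated character frequencies estimated from a string:

```python
from collections import Counter
from math import log2

def isolated_symbol_entropy(text):
    """Entropy in bits per symbol, H = -sum_x P(x) * log2(P(x)),
    where P(x) is estimated from each character's isolated frequency."""
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Two equiprobable symbols give exactly 1 bit per symbol:
print(isolated_symbol_entropy("aabb"))  # 1.0
# A single repeated symbol carries no information:
print(isolated_symbol_entropy("aaaa"))  # 0.0
```

Note that this treats each character independently; capturing context (digraphs, conditional probabilities) would require estimating joint or conditional distributions instead.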
Note, however, that the term "symbol" in the note you found does not necessarily refer to characters. It might refer to words or other units of text. Either way, the point being made is that the classical entropy formula is applied to probabilities of individual events (characters, words, or whatever), not to probabilities of joint or conditional events.
Source: https://stackoverflow.com/questions/9564979/what-is-the-meaning-of-isolated-symbol-probabilities-of-english