
Tuesday, April 24, 2018

A characterization of the Shannon entropy

Lemma. Let $f : \mathbb{Z}^{+} \to \mathbb{R}$ satisfy the following two conditions:

  • $f(mn) = f(m) + f(n)$ for all $m, n \in \mathbb{Z}^{+}$,
  • $\lim_{n\to\infty} \big( f(n+1) - f(n) \big) = 0$.

Then there exists $c \in \mathbb{R}$ such that $f(n) = c \log n$ for all $n \in \mathbb{Z}^{+}$.

Proof. Define $\delta_i = \max \{ |f(n+1) - f(n)| : 2^i \leq n < 2^{i+1} \}$. By the assumption, we know that $\delta_i \to 0$ as $i \to \infty$. Now let $n \geq 2$, and for each $k \geq 1$ write

$$ n^k = \sum_{j=0}^{L_k} a_j 2^j, \qquad a_j \in \{0, 1\} \quad \text{and} \quad a_{L_k} = 1. $$

Then, by using the fact that $f(a_{L_k} 2^{L_k}) = L_k f(2)$ and telescoping, we can write

$$ \left| \frac{f(n^k)}{L_k} - f(2) \right| \leq \frac{1}{L_k} \sum_{l=0}^{L_k-1} \left| f\Big( \sum_{j=l}^{L_k} a_j 2^{j} \Big) - f\Big( \sum_{j=l+1}^{L_k} a_j 2^{j} \Big) \right| = \frac{1}{L_k} \sum_{l=0}^{L_k-1} \left| f\Big( \sum_{j=l}^{L_k} a_j 2^{j-l} \Big) - f\Big( \sum_{j=l+1}^{L_k} a_j 2^{j-l} \Big) \right| \leq \frac{1}{L_k} \sum_{l=0}^{L_k-1} \delta_{L_k - l} = \frac{1}{L_k} \sum_{i=1}^{L_k} \delta_i, $$

where the last inequality holds because the two arguments in the $l$-th term differ by $a_l \in \{0, 1\}$ and both lie in $[2^{L_k - l}, 2^{L_k - l + 1})$. Since $L_k = \lfloor k \log_2 n \rfloor$ tends to $\infty$ as $k \to \infty$, the Stolz–Cesàro theorem shows that this bound converges to $0$. So

$$ f(2) = \lim_{k\to\infty} \frac{f(n^k)}{L_k} = \lim_{k\to\infty} \frac{k f(n)}{\lfloor k \log_2 n \rfloor} = \frac{f(n)}{\log_2 n}, $$

and $f(n) = c \log n$ with $c = f(2)/\log 2$. When $n = 1$, we have $f(1) = f(1^2) = 2 f(1)$ and hence $f(1) = 0 = c \log 1$, completing the proof. ////
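The lemma can be sanity-checked numerically. The sketch below (illustration only, standard library) takes $f = \log_2$, which clearly satisfies both hypotheses, and watches the proof's key error $|f(n^k)/L_k - f(2)|$ shrink as $k$ grows; the sample values of $n$ and $k$ are arbitrary choices.

```python
import math

# Illustration only: sanity-check the lemma for f(n) = log2(n),
# i.e. the case c = 1/log 2.
f = lambda n: math.log2(n)

# Condition 1: complete additivity f(mn) = f(m) + f(n).
for m, n in [(2, 3), (5, 7), (12, 12)]:
    assert abs(f(m * n) - (f(m) + f(n))) < 1e-12

# Condition 2: f(n+1) - f(n) -> 0 (the gaps shrink along this sample).
diffs = [f(n + 1) - f(n) for n in (10, 100, 1000, 10000)]
assert all(d2 < d1 for d1, d2 in zip(diffs, diffs[1:]))

# The key estimate of the proof: |f(n^k)/L_k - f(2)| -> 0 as k -> infinity,
# where L_k = floor(k * log2(n)) is the position of the top binary digit
# of n^k.  Note f(n^k) = k * f(n) exactly, by additivity.
n = 3
errors = []
for k in (5, 20, 80):
    L_k = int(k * math.log2(n))
    errors.append(abs(k * f(n) / L_k - f(2)))
assert errors[2] < errors[1] < errors[0]   # the error indeed shrinks
```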

This lemma is the key step in proving the theorem of Faddeev. To state the theorem, let $\Delta_n = \{ (p_1, \dots, p_n) \in [0, \infty)^n : p_1 + \dots + p_n = 1 \}$ and let $\mathrm{Prob} = \bigcup_{n=1}^{\infty} \Delta_n$ denote the set of all probability vectors.

Theorem. (Faddeev) [1] Let $H : \mathrm{Prob} \to [0, \infty)$ satisfy the following properties:

  1. $H$ is a symmetric function on each $\Delta_n$.
  2. $H$ is continuous on $\Delta_2$.
  3. $H(\frac{1}{2}, \frac{1}{2}) = 1$.
  4. $H(t p_1, (1-t) p_1, p_2, \dots, p_n) = H(p_1, \dots, p_n) + p_1 H(t, 1-t)$.

Then $H$ is the Shannon entropy: $H(p_1, \dots, p_n) = -\sum_{i=1}^{n} p_i \log_2 p_i$.
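Before the proof, a quick numeric sanity check may be reassuring: the sketch below (illustration only, standard library) confirms that the Shannon entropy satisfies properties 1, 3, and 4 at arbitrarily chosen sample points; property 2 is immediate since the defining formula is continuous.

```python
import math

# Illustration only: the Shannon entropy H(p_1, ..., p_n) = -sum p_i log2 p_i
# satisfies Faddeev's properties (spot checks at sample points).
def H(*p):
    return -sum(x * math.log2(x) for x in p if x > 0)

# 1. symmetry on Delta_3 (spot check)
assert abs(H(0.5, 0.3, 0.2) - H(0.2, 0.5, 0.3)) < 1e-12

# 3. normalization: H(1/2, 1/2) = 1
assert abs(H(0.5, 0.5) - 1.0) < 1e-12

# 4. the splitting identity, at the sample point t = 0.25, p = (0.4, 0.6):
#    H(t p_1, (1-t) p_1, p_2) = H(p_1, p_2) + p_1 H(t, 1-t)
t, p1, p2 = 0.25, 0.4, 0.6
lhs = H(t * p1, (1 - t) * p1, p2)
rhs = H(p1, p2) + p1 * H(t, 1 - t)
assert abs(lhs - rhs) < 1e-12
```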

Proof. We extend $H$ to the cone $\{ \lambda p : p \in \mathrm{Prob}, \lambda > 0 \}$ by setting $H(\lambda p) = \lambda H(p)$ for $p \in \mathrm{Prob}$ and $\lambda > 0$. Then condition 4 can be rephrased as

$$ H(p_0, p_1, p_2, \dots, p_n) = H(p_0 + p_1, p_2, \dots, p_n) + H(p_0, p_1). \tag{1} $$

By applying (1) repeatedly, we find that

$$ H(p_1, \dots, p_n) = \sum_{k=2}^{n} H(p_1 + \dots + p_{k-1}, p_k). $$

This tells us that $H$ is continuous on each $\Delta_n$ as well. By the same argument, we also find that

$$ H(p_1, \dots, p_n, q_1, \dots, q_m) = H(p_1 + \dots + p_n, q_1, \dots, q_m) + \sum_{k=2}^{n} H(p_1 + \dots + p_{k-1}, p_k) = H(p_1 + \dots + p_n, q_1, \dots, q_m) + H(p_1, \dots, p_n), \tag{2} $$

which generalizes property (1). Now define $f(n) = H(\frac{1}{n}, \dots, \frac{1}{n})$. By applying (2), we obtain $f(mn) = f(m) + f(n)$.

Next we show that $f(n+1) - f(n) \to 0$ as $n \to \infty$. In view of the Stolz–Cesàro theorem together with $f(2^n)/2^n = n/2^n \to 0$, we know that the only candidate for the limit is zero. Also, by (2) we easily find that $(n+1) f(n+1) = H(n, 1) + n f(n)$. So $n f(n)$ is increasing and

$$ 0 \leq n f(n) \leq 2^{\lceil \log_2 n \rceil} f\big( 2^{\lceil \log_2 n \rceil} \big) = 2^{\lceil \log_2 n \rceil} \lceil \log_2 n \rceil. $$

This tells us that $f(n) = O(\log n)$, and hence

$$ f(n+1) - f(n) = H\left( \frac{n}{n+1}, \frac{1}{n+1} \right) - \frac{1}{n+1} f(n) $$

converges as $n \to \infty$ by the continuity of $H$ on $\Delta_2$. So this limit must be zero, and by the lemma, $f(n) = \log_2 n$.

Now if $(\frac{a_1}{N}, \dots, \frac{a_n}{N}) \in \Delta_n$, then by (2),

$$ H\left( \frac{a_1}{N}, \dots, \frac{a_n}{N} \right) = f(N) - \sum_{k=1}^{n} \frac{a_k}{N} f(a_k) = -\sum_{k=1}^{n} \frac{a_k}{N} \log_2 \frac{a_k}{N}, $$

and hence $H$ equals the Shannon entropy for probability vectors with rational coordinates. The general case then follows by continuity, and the proof is complete. ////
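The final rational-coordinate step can also be checked numerically. The sketch below (illustration only, standard library) verifies that the identity forced by the grouping rule (2) agrees with the Shannon formula at a hypothetical sample vector $(1/8, 2/8, 5/8)$.

```python
import math

# Illustration only: the rational-coordinate step of the proof.  With
# f(m) = H(1/m, ..., 1/m) = log2(m), the grouping rule (2) forces
#     H(a_1/N, ..., a_n/N) = f(N) - sum_k (a_k/N) f(a_k),
# and the right-hand side coincides with the Shannon formula.
def shannon(p):
    return -sum(x * math.log2(x) for x in p if x > 0)

f = lambda m: math.log2(m)

a, N = [1, 2, 5], 8                       # sample point; a_1 + a_2 + a_3 = N
lhs = shannon([a_k / N for a_k in a])     # Shannon entropy directly
rhs = f(N) - sum(a_k / N * f(a_k) for a_k in a)
assert abs(lhs - rhs) < 1e-12
```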

References.

  • [1] Rényi, Alfréd. "On measures of entropy and information." Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, 1961.
