Lemma. Let $f:\mathbb{Z}^+\to\mathbb{R}$ satisfy the following two conditions:
- $f(mn)=f(m)+f(n)$ for all $m,n\in\mathbb{Z}^+$,
- $\lim_{n\to\infty}\bigl(f(n+1)-f(n)\bigr)=0$.
Then $f(n)=c\log n$ for all $n\in\mathbb{Z}^+$, where $c=f(2)/\log 2$.
Proof. Define $\delta_i=\max\{|f(n+1)-f(n)|:2^i\le n<2^{i+1}\}$. By the second assumption, $\delta_i\to 0$ as $i\to\infty$. Now let $n\ge 2$, and for each $k\ge 1$ write the binary expansion
$$n^k=\sum_{j=0}^{L_k}a_j2^j,\qquad a_j\in\{0,1\}\quad\text{and}\quad a_{L_k}=1.$$
Using the fact that $f(2^{L_k})=L_kf(2)$ and telescoping over the partial sums of this expansion, we can write
$$\begin{aligned}\left|\frac{f(n^k)}{L_k}-f(2)\right|&\le\frac{1}{L_k}\sum_{l=0}^{L_k-1}\left|f\Bigl(\sum_{j=l}^{L_k}a_j2^{j}\Bigr)-f\Bigl(\sum_{j=l+1}^{L_k}a_j2^{j}\Bigr)\right|\\&=\frac{1}{L_k}\sum_{l=0}^{L_k-1}\left|f\Bigl(\sum_{j=l}^{L_k}a_j2^{j-l}\Bigr)-f\Bigl(\sum_{j=l+1}^{L_k}a_j2^{j-l}\Bigr)\right|\\&\le\frac{1}{L_k}\sum_{l=0}^{L_k-1}\delta_{L_k-l}=\frac{1}{L_k}\sum_{i=1}^{L_k}\delta_i,\end{aligned}$$
where the middle equality follows by pulling out the common factor $2^l$ and applying the first assumption. Since $L_k=\lfloor k\log_2 n\rfloor$ tends to $\infty$ as $k\to\infty$, the Stolz–Cesàro theorem shows that this bound converges to $0$. So
$$f(2)=\lim_{k\to\infty}\frac{f(n^k)}{L_k}=\lim_{k\to\infty}\frac{kf(n)}{\lfloor k\log_2 n\rfloor}=\frac{f(n)}{\log_2 n},$$
and hence $f(n)=c\log n$ with $c=f(2)/\log 2$. When $n=1$, we have $f(1)=f(1^2)=2f(1)$ and hence $f(1)=0=c\log 1$, completing the proof. ////
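As a quick numerical illustration (not part of the proof), the sketch below takes $f=\log$ (natural log), which clearly satisfies both hypotheses, and compares $|f(n^k)/L_k-f(2)|$ with the averaged bound $\frac{1}{L_k}\sum_{i=1}^{L_k}\delta_i$ for the sample choice $n=3$; both quantities shrink as $k$ grows.

```python
import math

# Sanity check of the lemma's estimate with f = log (natural log),
# which satisfies f(mn) = f(m) + f(n) and f(n+1) - f(n) -> 0.
f = math.log

def delta(i):
    # delta_i = max |f(n+1) - f(n)| over 2^i <= n < 2^(i+1)
    return max(abs(f(n + 1) - f(n)) for n in range(2 ** i, 2 ** (i + 1)))

n = 3
for k in (1, 2, 4, 8):
    N = n ** k
    L = N.bit_length() - 1                              # L_k = floor(k * log2 n)
    lhs = abs(f(N) / L - f(2))                          # quantity being estimated
    rhs = sum(delta(i) for i in range(1, L + 1)) / L    # averaged delta_i bound
    print(f"k={k:2d}  L_k={L:2d}  |f(n^k)/L_k - f(2)| = {lhs:.4f}  bound = {rhs:.4f}")
```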
This lemma is the key step in proving Faddeev's theorem. To state the theorem, let $\Delta_n=\{(p_1,\dots,p_n)\in[0,\infty)^n:p_1+\dots+p_n=1\}$ and let $\mathrm{Prob}=\bigcup_{n=1}^{\infty}\Delta_n$ be the set of all probability vectors.
Theorem. (Faddeev) [1] Let $H:\mathrm{Prob}\to[0,\infty)$ satisfy the following properties:
- $H$ is a symmetric function on each $\Delta_n$.
- $H$ is continuous on $\Delta_2$.
- $H(\tfrac12,\tfrac12)=1$.
- $H(tp_1,(1-t)p_1,p_2,\dots,p_n)=H(p_1,\dots,p_n)+p_1H(t,1-t)$.
Then $H$ is the Shannon entropy: $H(p_1,\dots,p_n)=-\sum_{k=1}^{n}p_k\log_2 p_k$ for every $(p_1,\dots,p_n)\in\mathrm{Prob}$, with the convention $0\log 0=0$.
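Before the proof, here is a minimal numerical check (an illustration, not part of Faddeev's argument) that the Shannon entropy does satisfy properties 1, 3 and 4 on a sample vector; property 2 is continuity, which is clear from the formula.

```python
import math
import itertools

def H(p):
    # Shannon entropy in bits, with the convention 0 * log 0 = 0
    return -sum(x * math.log2(x) for x in p if x > 0)

p = (0.5, 0.25, 0.125, 0.125)   # sample probability vector

# Property 1: symmetry under permutations of the coordinates
assert all(math.isclose(H(q), H(p)) for q in itertools.permutations(p))

# Property 3: normalization H(1/2, 1/2) = 1
assert math.isclose(H((0.5, 0.5)), 1.0)

# Property 4: splitting p_1 into t*p_1 and (1-t)*p_1
t = 0.3
split = (t * p[0], (1 - t) * p[0]) + p[1:]
assert math.isclose(H(split), H(p) + p[0] * H((t, 1 - t)))

print("properties 1, 3, 4 hold on this example")
```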
Proof. We extend $H$ to the cone $\{\lambda p:p\in\mathrm{Prob},\lambda>0\}$ by setting $H(\lambda p)=\lambda H(p)$ for all $p\in\mathrm{Prob}$ and $\lambda>0$. Then condition 4 can be rephrased as
$$H(p_0,p_1,p_2,\dots,p_n)=H(p_0+p_1,p_2,\dots,p_n)+H(p_0,p_1).\tag{1}$$
By applying (1) repeatedly, we find that
$$H(p_1,\dots,p_n)=\sum_{k=2}^{n}H(p_1+\dots+p_{k-1},p_k).$$
This tells us that $H$ is continuous on each $\Delta_n$ as well. By the same argument, we also find that
$$H(p_1,\dots,p_n,q_1,\dots,q_m)=H(p_1+\dots+p_n,q_1,\dots,q_m)+\sum_{k=2}^{n}H(p_1+\dots+p_{k-1},p_k)=H(p_1+\dots+p_n,q_1,\dots,q_m)+H(p_1,\dots,p_n),\tag{2}$$
which generalizes property (1).

Now define $f(n)=H(\tfrac1n,\dots,\tfrac1n)$. Applying (2) repeatedly to $m$ blocks of $n$ copies of $\tfrac1{mn}$ and using the homogeneity of the extension, we obtain $f(mn)=f(m)+f(n)$. Next we show that $f(n+1)-f(n)\to 0$ as $n\to\infty$. In view of the Stolz–Cesàro theorem together with $f(2^n)/2^n=n/2^n\to 0$, the only candidate for the limit is zero. Also, by (2) we easily find that $(n+1)f(n+1)=H(n,1)+nf(n)$. So $nf(n)$ is increasing and
$$0\le nf(n)\le 2^{\lceil\log_2 n\rceil}f\bigl(2^{\lceil\log_2 n\rceil}\bigr)=2^{\lceil\log_2 n\rceil}\lceil\log_2 n\rceil.$$
This tells us that $f(n)=O(\log n)$, and hence
$$f(n+1)-f(n)=H\Bigl(\frac{n}{n+1},\frac{1}{n+1}\Bigr)-\frac{1}{n+1}f(n)$$
converges as $n\to\infty$ by the continuity of $H$ on $\Delta_2$. So this limit must be zero, and by the lemma, $f(n)=\frac{f(2)}{\log 2}\log n=\log_2 n$ since $f(2)=H(\tfrac12,\tfrac12)=1$. Now if $(\tfrac{a_1}{N},\dots,\tfrac{a_n}{N})\in\Delta_n$ with positive integers $a_1,\dots,a_n$ summing to $N$, then by (2),
$$H\Bigl(\frac{a_1}{N},\dots,\frac{a_n}{N}\Bigr)=f(N)-\sum_{k=1}^{n}\frac{a_k}{N}f(a_k)=-\sum_{k=1}^{n}\frac{a_k}{N}\log_2\frac{a_k}{N},$$
and hence $H$ equals the Shannon entropy on probability vectors with rational coordinates. The general case follows by continuity, and the proof is complete. ////
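The last step of the proof can also be checked numerically. The sketch below (an illustration only, with an arbitrarily chosen sample vector) computes $H(a_1/N,\dots,a_n/N)$ via the grouping formula $f(N)-\sum_k\frac{a_k}{N}f(a_k)$ with $f(n)=\log_2 n$, and compares it with the Shannon entropy.

```python
import math

def shannon(p):
    # Shannon entropy in bits
    return -sum(x * math.log2(x) for x in p if x > 0)

f = math.log2            # the proof shows f(n) = H(1/n, ..., 1/n) = log2 n

a = (3, 5, 8, 4)         # sample rational probability vector (a_1/N, ..., a_n/N)
N = sum(a)
p = tuple(ak / N for ak in a)

via_grouping = f(N) - sum(ak / N * f(ak) for ak in a)   # formula from the proof
assert math.isclose(via_grouping, shannon(p))
print(via_grouping, shannon(p))   # both equal -sum (a_k/N) log2(a_k/N)
```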
References.
- [1] Rényi, Alfréd. On measures of entropy and information. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, University of California Press, 1961.