
Saturday, October 15, 2022

Spitzer's Formula

The following identity is an abstract version of Spitzer's formula, as introduced in Wendel's 1958 paper [1]; we reproduce the proof given there, with some simplifications.

Theorem. Let \mathcal{A} be a unital Banach algebra with norm \|\cdot\| and unit 1. Let \mathbf{P} be a bounded linear operator on \mathcal{A} satisfying the following conditions:

  1. \mathbf{P} is a projection, that is, \mathbf{P}^2 = \mathbf{P}.
  2. Both \operatorname{im}\mathbf{P} and \operatorname{im}(\mathbf{I}-\mathbf{P}) are closed subalgebras of \mathcal{A}.
  3. x and \mathbf{P}x commute for any x \in \mathcal{A}.
Then for any a \in \mathcal{A} and with the multiplication operator \mathbf{M}_a defined by \mathbf{M}_a(x) = ax, we have \sum_{n=0}^{\infty} t^n (\mathbf{P}\mathbf{M}_a)^n 1 = \exp\left(\sum_{n=1}^{\infty} \frac{t^n}{n} \mathbf{P}(a^n) \right) \tag{*} for any scalar t within the radius of convergence of both sides.

Proof. Write \mathbf{Q} = \mathbf{I} - \mathbf{P} for simplicity. We first note that, for any scalars a_n and b_n and any x \in \mathcal{A}, \mathbf{Q} \left[ \sum_{n=0}^{\infty} a_n (\mathbf{P}x)^n \right] = a_0 \mathbf{Q}1 \qquad\text{and}\qquad \mathbf{P} \left[ \sum_{n=0}^{\infty} b_n (\mathbf{Q}x)^n \right] = b_0 \mathbf{P}1 \tag{1} provided the respective series converge; indeed, (\mathbf{P}x)^n \in \operatorname{im}\mathbf{P} for n \geq 1 because \operatorname{im}\mathbf{P} is a closed subalgebra, so \mathbf{Q} annihilates these terms, and likewise with the roles of \mathbf{P} and \mathbf{Q} exchanged. Now let t be sufficiently small so that all the series appearing in \text{(*)} converge absolutely, and let g denote the left-hand side of \text{(*)}: g = \sum_{n=0}^{\infty} t^n (\mathbf{P}\mathbf{M}_a)^n 1 = (\mathbf{I} - t \mathbf{P}\mathbf{M}_a)^{-1} 1. Hence, g is uniquely determined by the equation (\mathbf{I} - t \mathbf{P}\mathbf{M}_a) g = 1. \tag{2} In light of this observation, it suffices to prove that the right-hand side of \text{(*)} satisfies the condition \text{(2)}. Indeed, let g^* denote the right-hand side of \text{(*)}: g^* = \exp\left(\sum_{n=1}^{\infty} \frac{t^n}{n} \mathbf{P}(a^n) \right) = \exp\left(- \mathbf{P} \log(1 - ta) \right). Then we get \begin{align*} (\mathbf{I} - t \mathbf{P}\mathbf{M}_a) g^* &= \mathbf{Q}g^* + \mathbf{P}(\mathbf{I} - t\mathbf{M}_a)g^* \\ &= \mathbf{Q}g^* + \mathbf{P}((1 - ta)g^*) \\ &= \mathbf{Q}\exp\left( -\mathbf{P} \log(1 - ta) \right) + \mathbf{P} \exp\left( \mathbf{Q} \log(1 - ta) \right) \\ &= \mathbf{Q}1 + \mathbf{P}1 \\ &= 1. \end{align*} In the third step, we used the fact that e^x e^y = e^{x+y} whenever x and y commute, which applies here by condition 3. Then \text{(1)} is used in the fourth step. Therefore g^* = g. \square
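To see the identity in action, here is a minimal numerical sketch, assuming the toy commutative algebra \mathcal{A} = \mathbb{R}^3 with componentwise multiplication and \mathbf{P} given by a 0/1 coordinate mask (conditions 1–3 are then immediate); the value of t and the truncation lengths are arbitrary test choices.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(-1, 1, size=3)    # an element of A (arbitrary test data)
one = np.ones(3)                  # the unit of A
mask = np.array([1.0, 1.0, 0.0])  # P = componentwise multiplication by this 0/1 mask
P = lambda x: mask * x
t = 0.3                           # small enough for absolute convergence

# Left-hand side of (*): sum_{n >= 0} t^n (P M_a)^n 1, truncated.
lhs, term = np.zeros(3), one.copy()
for _ in range(200):
    lhs += term
    term = t * P(a * term)

# Right-hand side of (*): exp( sum_{n >= 1} (t^n / n) P(a^n) ), truncated.
# In a componentwise algebra, the algebra exponential is the componentwise exp.
s = np.zeros(3)
for n in range(1, 200):
    s += (t**n / n) * P(a**n)
print(np.allclose(lhs, np.exp(s)))   # expected: True
```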

References.

[1] Wendel, James G. (1958). "Spitzer's formula: A short proof". Proceedings of the American Mathematical Society. 9 (6): 905–908. doi:10.1090/S0002-9939-1958-0103531-2. MR 0103531.

Sunday, October 2, 2022

Dominated Convergence Theorem for Series

In this post, we will cover a classical but nevertheless powerful result about interchanging the order of limit and sum.

Let G be an abelian group. We will adopt additive notation for the group operation. Assume that G is endowed with a norm \|\cdot\| with respect to which G is complete.

Theorem. Let (a^{(n)}_k)_{n,k=1}^{\infty} \subseteq G and (M_k)_{k=1}^{\infty} \subseteq [0, \infty) satisfy the following conditions:

  1. \| a^{(n)}_k \| \leq M_k holds for each n and k;
  2. (a^{(n)}_k)_{n=1}^{\infty} converges in G as n \to \infty for each k;
  3. \sum_{k=1}^{\infty} M_k converges.
Then \sum_{k=1}^{\infty} a^{(n)}_k converges for each n, and \lim_{n\to\infty} \sum_{k=1}^{\infty} a^{(n)}_k = \sum_{k=1}^{\infty} \lim_{n\to\infty} a^{(n)}_k.

Remark. The completeness of G is not necessary provided all the sums appearing in the equality converge in G.

Proof. For each n, let S^{(n)} := \sum_{k=1}^{\infty} a^{(n)}_k . Then for p \lt q, we have \left\| \sum_{k=p+1}^{q} a^{(n)}_k \right\| \leq \sum_{k=p+1}^{q} \| a^{(n)}_k\| \leq \sum_{k=p+1}^{q} M_k. Since this bound converges to 0 as p, q \to \infty, the partial sums of S^{(n)} form a Cauchy sequence in G for each n. Then the completeness of G ensures that the series S^{(n)} converges in G for each n.

Define the sequence (a_k)_{k=1}^{\infty} in G by a_k := \lim_{n\to\infty} a^{(n)}_k for each k. Then it is clear that \| a_k \| \leq M_k, and so, the sum S := \sum_{k=1}^{\infty} a_k converges by the same reasoning. Finally, for each fixed p, \begin{align*} \| S^{(n)} - S \| &\leq \left\| S^{(n)} - \sum_{k=1}^{p} a_k \right\| + \left\| \sum_{k=1}^{p} a_k - S \right\| \\ &\leq \sum_{k=1}^{p} \| a^{(n)}_k - a_k \| + \sum_{k=p+1}^{\infty} \| a^{(n)}_k \| + \sum_{k=p+1}^{\infty} \| a_k \| \\ &\leq \sum_{k=1}^{p} \| a^{(n)}_k - a_k \| + 2 \sum_{k=p+1}^{\infty} M_k. \end{align*} Taking the \limsup as n \to \infty, we obtain the bound \limsup_{n\to\infty} \| S^{(n)} - S \| \leq 2 \sum_{k=p+1}^{\infty} M_k. Since the left-hand side does not depend on p, letting p \to \infty shows that the limsup is in fact 0, proving S^{(n)} \to S in G as n \to \infty. This conclusion is equivalent to the desired assertion, hence the proof is done. \square

Example. We show that \sum_{k=1}^{\infty} \frac{2^k}{k} = 0 in \mathbb{Q}_2. Indeed, for n \geq 1 we have \begin{align*} 0 = \frac{1 - (1 - 2)^{2^n}}{2^n} &= \frac{1}{2^n} \sum_{k=1}^{\infty} (-1)^{k-1} \binom{2^n}{k} 2^k \\ &= \sum_{k=1}^{\infty} (-1)^{k-1} \binom{2^n - 1}{k - 1} \frac{2^k}{k} \\ &= \sum_{k=1}^{\infty} \left[ \prod_{l=1}^{k-1} \left( 1 - \frac{2^n}{l} \right) \right] \frac{2^k}{k}. \end{align*} The bracketed product is, up to sign, the integer \binom{2^n-1}{k-1}, whose |\cdot|_2-norm is at most 1, so the |\cdot|_2-norm of the kth term is bounded by M_k := \left| \frac{2^k}{k} \right|_2 \leq k \, 2^{-k} uniformly in n, and \sum_{k} M_k converges. Moreover, \prod_{l=1}^{k-1} \left( 1 - \frac{2^n}{l} \right) \to 1 in \mathbb{Q}_2 as n \to \infty for each k. So by DCT, it follows that 0 = \sum_{k=1}^{\infty} \left[ \lim_{n \to \infty} \prod_{l=1}^{k-1} \left( 1 - \frac{2^n}{l} \right) \right] \frac{2^k}{k} = \sum_{k=1}^{\infty} \frac{2^k}{k}.
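As a quick illustration (not part of the proof), the claim \sum_k 2^k/k = 0 in \mathbb{Q}_2 predicts that the 2-adic valuation of the partial sums grows without bound. The following sketch checks this with exact rational arithmetic; the cutoffs are arbitrary.

```python
from fractions import Fraction

def v2(q):
    """2-adic valuation of a nonzero rational."""
    n, d, v = q.numerator, q.denominator, 0
    while n % 2 == 0:
        n //= 2; v += 1
    while d % 2 == 0:
        d //= 2; v -= 1
    return v

S = Fraction(0)
for K in range(1, 65):
    S += Fraction(2**K, K)
    if K % 8 == 0:
        print(K, v2(S))   # grows (not monotonically), i.e. |S_K|_2 -> 0
```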

Remark. The above result can be interpreted in terms of the logarithm as follows: Note that the logarithm \log(z) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n} z^n converges in \mathbb{Q}_2 whenever |z|_2 < 1, and moreover, \log(1+z) + \log(1+w) = \log((1+z)(1+w)), which initially holds as an identity between formal power series, extends to all z, w \in \mathbb{Q}_2 with |z|_2 < 1 and |w|_2 < 1. So, \sum_{n=1}^{\infty} \frac{2^n}{n} = - \log(1-2) = - \log (-1) = - \frac{1}{2} \log((-1)^2) = -\frac{1}{2}\log 1 = 0.

Tuesday, July 26, 2022

Product of two beta distributions

Theorem. Let p, q, r > 0. If X \sim \operatorname{Beta}(p, q) and Y \sim \operatorname{Beta}(p+q, r) are independent, then XY \sim \operatorname{Beta}(p, q+r).

Proof. Let T_p, T_q, T_r be independent random variables such that T_{\alpha} \sim \operatorname{Gamma}(\alpha) for each \alpha \in \{p, q, r\}. Then it is well-known that \frac{T_p}{T_p + T_q} \sim \operatorname{Beta}(p, q) \quad\text{and}\quad T_p + T_q \sim \operatorname{Gamma}(p+q). Moreover, these two random variables are independent. Using this, we may realize the joint distribution of X and Y via (X, Y) \stackrel{\text{d}}= \left( \frac{T_p}{T_p + T_q}, \frac{T_p + T_q}{T_p + T_q + T_r} \right). Therefore, since T_q + T_r \sim \operatorname{Gamma}(q+r) is independent of T_p, we conclude XY \stackrel{\text{d}}= \frac{T_p}{T_p + T_q + T_r} \sim \operatorname{Beta}(p, q+r) as desired. \square
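A Monte Carlo sanity check is easy to run; the sketch below compares sample moments of XY with the exact moments \mathbf{E}[Z^m] = \prod_{j=0}^{m-1} \frac{p+j}{p+q+r+j} of \operatorname{Beta}(p, q+r). The parameter values and sample size are arbitrary test choices.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q, r = 2.0, 3.0, 1.5
X = rng.beta(p, q, size=10**6)
Y = rng.beta(p + q, r, size=10**6)
Z = X * Y

for m in (1, 2, 3):
    # exact m-th moment of Beta(p, q+r)
    exact = np.prod([(p + j) / (p + q + r + j) for j in range(m)])
    print(m, (Z**m).mean(), exact)   # should agree to ~3 decimal places
```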

Corollary. Let p, q > 0. If (X_n)_{n\geq 0} is a sequence of i.i.d. \operatorname{Beta}(p, q) variables, then \sum_{n=0}^{\infty} (-1)^n X_0 X_1 \cdots X_n \sim \operatorname{Beta}(p, p+q). This is Problem 6524 of American Mathematical Monthly, Vol.93, No.7, Aug.–Sep. 1986.

Proof. Let S denote the sum. By the alternating series estimation theorem, 0 \leq S \leq 1 holds almost surely. Moreover, if X\sim \operatorname{Beta}(p, q) is independent of (X_n)_{n\geq 0}, then S = X_0 \left(1 - \sum_{n=0}^{\infty} (-1)^n X_1 \cdots X_{n+1} \right) \stackrel{\text{d}}= X (1 - S), where X is independent of S on the right-hand side. Since 0 \leq S \leq 1, this equation determines all the moments of S recursively, and so, by the uniqueness of the Hausdorff moment problem, it admits a unique distributional solution supported on [0, 1].

Moreover, if S' \sim \operatorname{Beta}(p, p+q) is independent of X, then 1-S' \sim \operatorname{Beta}(p+q, p). So by the theorem, X(1-S') \sim \operatorname{Beta}(p, p+q) \sim S'. This shows that S' is a distributional solution of the equation, hence by the uniqueness, S \stackrel{\text{d}}= S'. \square
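The corollary can be checked the same way; the sketch below truncates the alternating series at N = 60 terms (the tail is negligible since the terms decay geometrically in expectation) and compares sample moments with those of \operatorname{Beta}(p, p+q). Again, all parameter values are arbitrary test choices.

```python
import numpy as np

rng = np.random.default_rng(2)
p, q, N, size = 2.0, 3.0, 60, 10**5
X = rng.beta(p, q, size=(N, size))
prods = np.cumprod(X, axis=0)                 # row n holds X_0 X_1 ... X_n
S = (prods * (-1.0) ** np.arange(N)[:, None]).sum(axis=0)

for m in (1, 2):
    # exact m-th moment of Beta(p, p+q)
    exact = np.prod([(p + j) / (2 * p + q + j) for j in range(m)])
    print(m, (S**m).mean(), exact)            # should agree to ~3 decimal places
```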

Thursday, July 21, 2022

A simple calculus question

Consider the problem of evaluating the indefinite integral \int \frac{1}{a + b \cos x} \, \mathrm{d}x, where |b| \lt a. This is often done by adopting the Weierstrass substitution t = \tan(x/2). The upshot of this substitution is \begin{align*} \int \frac{1}{a + b \cos x} \, \mathrm{d}x &= \int \frac{1}{a + b\left(\frac{1-t^2}{1+t^2}\right)} \, \frac{2\,\mathrm{d}t}{1+t^2} \\ &= \int \frac{2}{(a+b) + (a-b)t^2} \, \mathrm{d}t \\ &= \frac{2}{\sqrt{a^2-b^2}} \arctan\left(\sqrt{\frac{a-b}{a+b}} \tan \frac{x}{2} \right) + \mathsf{C}. \tag{1} \end{align*} However, there is one caveat in this result. Suppose we want to compute the definite integral \int_{0}^{2\pi} \frac{1}{a + b \cos x} \, \mathrm{d}x. Can we utilize the antiderivative \text{(1)} to find its value? Well, this is not possible, at least directly. This is because the substitution t = \tan(x/2) is undefined at every point of the form x = (2k+1)\pi, \qquad k \in \mathbb{Z}. Consequently, the formula \text{(1)} also suffers from discontinuity at those points. The usual trick to avoid this difficulty is to choose another 2\pi-interval on which \tan(x/2) is continuous. One such choice is the open interval (-\pi, \pi). So, \begin{align*} \int_{0}^{2\pi} \frac{1}{a + b \cos x} \, \mathrm{d}x &= \int_{-\pi}^{\pi} \frac{1}{a + b \cos x} \, \mathrm{d}x \\ &= \lim_{\varepsilon \to 0^+} \int_{-\pi+\varepsilon}^{\pi-\varepsilon} \frac{1}{a + b \cos x} \, \mathrm{d}x \\ &= \lim_{\varepsilon \to 0^+} \left[ \frac{2}{\sqrt{a^2-b^2}} \arctan\left(\sqrt{\frac{a-b}{a+b}} \tan \frac{x}{2} \right) \right]_{-\pi+\varepsilon}^{\pi-\varepsilon} \\ &= \frac{2\pi}{\sqrt{a^2-b^2}}. \end{align*}

So, is this the end of the story? In fact, it turns out that we can find an antiderivative that is valid on all of \mathbb{R}:

Theorem. Let a, b \in \mathbb{R} satisfy |b| \lt a. Then \begin{align*} & \int \frac{1}{a + b\cos x} \, \mathrm{d}x \\ &= \frac{1}{\sqrt{a^2 - b^2}} \biggl[ x - 2 \arctan \biggl( \frac{b \sin x}{a + \sqrt{a^2-b^2} + b \cos x} \biggr) \biggr] + \mathsf{C}. \end{align*} This formula is valid on \mathbb{R}.

Proof. The Fourier series of the integrand is \frac{1}{a + b\cos x} = \frac{1}{\sqrt{a^2 - b^2}} \left( 1 + 2 \sum_{n=1}^{\infty} r^n \cos(nx) \right), where r is given by r = -\frac{b}{a + \sqrt{a^2 - b^2}} \in (-1, 1). Integrating both sides, we conclude \begin{align*} & \int \frac{1}{a + b\cos x} \, \mathrm{d}x \\ &= \frac{1}{\sqrt{a^2 - b^2}} \left( x + 2 \sum_{n=1}^{\infty} \frac{r^n}{n} \sin(nx) \right) + \mathsf{C} \\ &= \frac{1}{\sqrt{a^2 - b^2}} \left[ x - 2 \operatorname{Im}\left( \log(1 - re^{ix}) \right) \right] + \mathsf{C} \\ &= \frac{1}{\sqrt{a^2 - b^2}} \left[ x + 2 \arctan \left( \frac{r \sin x}{1 - r \cos x} \right) \right] + \mathsf{C}. \end{align*} Plugging the value of r into this, we are done. \square
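A quick numerical check of the theorem, with the arbitrary test values a = 2, b = 1: the claimed antiderivative should differentiate back to the integrand everywhere, including across x = \pi where formula (1) breaks down.

```python
import numpy as np

a, b = 2.0, 1.0
c = np.sqrt(a**2 - b**2)
F = lambda x: (x - 2 * np.arctan(b * np.sin(x) / (a + c + b * np.cos(x)))) / c

x = np.linspace(0.1, 2 * np.pi - 0.1, 200)   # sample points across x = pi
h = 1e-6
deriv = (F(x + h) - F(x - h)) / (2 * h)      # central difference
print(np.allclose(deriv, 1 / (a + b * np.cos(x))))   # expected: True

# F(2 pi) - F(0) recovers the definite integral 2 pi / sqrt(a^2 - b^2):
print(F(2 * np.pi) - F(0), 2 * np.pi / c)
```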

Tuesday, July 19, 2022

A nasty integral

In this post, we discuss how certain integrals of rational functions can be computed.

Example 1. \int_{0}^{\infty} \frac{x^{8} - 4x^{6} + 9x^{4} - 5x^{2} + 1}{x^{12} - 10x^{10} + 37x^{8} - 42x^{6} + 26x^{4} - 8x^{2} + 1} \, \mathrm{d}x = \frac{\pi}{2}. This is Problem 11148 of American Mathematical Monthly, Vol.112, April 2005.

Proof. For each f(z) \in \mathbb{C}[z], we define f^*(z) = \overline{f(\bar{z})}. This is the same as conjugating the coefficients of f(z). Let \begin{align*} P(z) &= z^8-4 z^6+9 z^4-5 z^2+1, \\ Q(z) &= z^{12}-10 z^{10}+37 z^8-42 z^6+26 z^4-8 z^2+1. \end{align*} The key observation is that all the zeros of the polynomial q(z) = z^3-(2+i) z^2-(1-i) z+1 lie in the upper half-plane \mathbb{H} = \{ z \in \mathbb{C} : \operatorname{Im}(z) \gt 0\}, and Q(z) = q(z)q^*(z)q(-z)q^*(-z) is a factorization of Q(z) into coprime factors. Now let p(z) be the unique polynomial such that \deg p \lt \deg q and P(z) \equiv p(z) q^*(z)q(-z)q^*(-z) \pmod{q(z)}. Such p(z) exists because q(z) and q^*(z)q(-z)q^*(-z) are coprime. Also, such p(z) can be computed using the extended Euclidean algorithm. After a tedious computation, it follows that p(z) = \frac{-i z^2-(1-2 i) z+1}{4}. Then by symmetry and the uniqueness of partial fraction decomposition, it follows that \frac{P(z)}{Q(z)} = \frac{p(z)}{q(z)} + \frac{p^*(z)}{q^*(z)} + \frac{p(-z)}{q(-z)} + \frac{p^*(-z)}{q^*(-z)}. Moreover, using the fact that all the zeros of q(z) lie in \mathbb{H}, we can invoke Cauchy's integral theorem to write \begin{align*} \operatorname{PV}\!\! \int_{-\infty}^{\infty} \frac{x^n}{q(x)} \, \mathrm{d}x &= \lim_{R\to\infty} \int_{-R}^{R} \frac{x^n}{q(x)} \, \mathrm{d}x \\ &= \lim_{R\to\infty} \int_{-\pi}^{0} \frac{i (Re^{i\theta})^{n+1}}{q(Re^{i\theta})} \, \mathrm{d}\theta \\ &= \begin{cases} 0, & n \lt \deg q - 1, \\ \pi i, & n = \deg q - 1. \end{cases} \end{align*} Therefore we conclude that \int_{-\infty}^{\infty} \frac{P(x)}{Q(x)} \, \mathrm{d}x = 4 \operatorname{Re}\left[ \operatorname{PV}\!\! \int_{-\infty}^{\infty} \frac{p(x)}{q(x)} \, \mathrm{d}x \right] = 4 \pi i \left( [z^2] p(z) \right) = \pi. Since the integrand is even, the desired integral is half of the above, hence the claim follows. \square
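Example 1 is easy to confirm numerically (a sanity check, not a proof):

```python
import numpy as np
from scipy.integrate import quad

# numerator and denominator, coefficients in decreasing degree
P = np.poly1d([1, 0, -4, 0, 9, 0, -5, 0, 1])
Q = np.poly1d([1, 0, -10, 0, 37, 0, -42, 0, 26, 0, -8, 0, 1])
val, err = quad(lambda x: P(x) / Q(x), 0, np.inf, limit=200)
print(val, np.pi / 2)   # both ~1.5707963
```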

Here is another example.

Example 2. \int_{0}^{\infty} \frac{x^{14} - 15x^{12} + 82x^{10} - 190x^{8} + 184x^{6} - 60x^{4} + 16x^{2}}{x^{16} - 20x^{14} + 156x^{12} - 616x^{10} + 1388x^{8} - 1792x^{6} + 1152x^{4} - 224x^{2} + 16} \, \mathrm{d}x = \frac{\pi}{2}.

Proof. Apply the same argument as above with the choices q(z) = z^4+(3-i) z^3-(1+3 i) z^2-(6+2 i) z-2 and p(z) = \frac{-i z^3-3 i z^2-2 i z}{4}. We assure the reader that all the zeros of q(z) lie in the upper half-plane, and that the denominator of the integrand admits the same form of factorization into coprime factors as before. \square
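For the skeptical reader, the location of the zeros of both polynomials q(z) can be confirmed numerically, e.g. via numpy.roots:

```python
import numpy as np

q1 = [1, -(2 + 1j), -(1 - 1j), 1]           # q(z) from Example 1
q2 = [1, 3 - 1j, -(1 + 3j), -(6 + 2j), -2]  # q(z) from Example 2
for q in (q1, q2):
    print(np.roots(q).imag > 0)             # expected: all True
```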

Sunday, July 10, 2022

Orthogonal decomposition using Gram determinants

Theorem. Let \mathcal{H} be a Hilbert space equipped with the inner product \langle \cdot, \cdot \rangle, and let w, v_1, \ldots, v_n \in \mathcal{H} with v_1, \ldots, v_n linearly independent. If G(w_1, \ldots, w_n) = \det [\langle w_i, w_j \rangle]_{i,j=1}^{n} denotes the Gram determinant (so that G(v_1, \ldots, v_n) \neq 0), then the following hold:

  1. The vector w_{\perp} defined by the (formal) determinant w_{\perp} = \frac{1}{G(v_1,\ldots,v_n)}\begin{vmatrix} w & v_1 & \cdots & v_n \\ \langle v_1, w \rangle & \langle v_1, v_1 \rangle & \cdots & \langle v_1, v_n \rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle v_n, w \rangle & \langle v_n, v_1 \rangle & \cdots & \langle v_n, v_n \rangle \end{vmatrix} \tag{1} is orthogonal to V = \operatorname{span}\{v_1, \ldots, v_n\}, and in fact, w = w_{\perp} + (w - w_{\perp}) is the orthogonal decomposition of w.
  2. The squared distance between w and V is given by \|w_{\perp}\|^2 = \operatorname{dist}(w, V)^2 = \frac{G(w, v_1, \ldots, v_n)}{G(v_1, \ldots, v_n)}. \tag{2}

Proof. Extend the notation by letting G(\{v_i\}_{i=1}^{n}, \{w_i\}_{i=1}^{n}) = \det [\langle v_i, w_j \rangle]_{i,j=1}^{n}, and then define the linear functional \ell on \mathcal{H} by \ell(u) = G(\{u, v_1, \ldots, v_n\}, \{w, v_1, \ldots, v_n\}). By the Riesz representation theorem, there exists h \in \mathcal{H} such that \ell(u) = \langle h, u \rangle for all u \in \mathcal{H}. Since \ell(v_i) = 0 for all i = 1, \ldots, n, it follows that h is orthogonal to V. Moreover, expanding the determinant defining \ell(u) along the first row, we see that \begin{align*} \ell(u) &= G(v_1, \ldots, v_n) \langle u, w \rangle \\ &\quad + \sum_{i=1}^{n} (-1)^i G(\{w, v_1, \ldots, \widehat{v_i}, \ldots, v_n\}, \{v_1, \ldots, v_n\}) \langle u, v_i \rangle, \end{align*} and so, \begin{align*} h &= G(v_1, \ldots, v_n) w \\ &\quad + \sum_{i=1}^{n} (-1)^i G(\{w, v_1, \ldots, \widehat{v_i}, \ldots, v_n\}, \{v_1, \ldots, v_n\}) v_i, \end{align*} which is precisely the formal determinant in \text{(1)}. This also implies that h is a linear combination of w, v_1, \ldots, v_n with the coefficient of w given by G(v_1, \ldots, v_n), hence w_{\perp} = \frac{h}{G(v_1, \ldots, v_n)} \in w + V. This and w_{\perp} \perp V together prove the first item of the theorem. For the second item, the equality \|w_{\perp}\| = \operatorname{dist}(w, V) is obvious by the orthogonality. Moreover, \|w_{\perp}\|^2 = \langle w_{\perp}, w_{\perp} \rangle = \langle w_{\perp}, w \rangle = \frac{\ell(w)}{G(v_1, \ldots, v_n)} = \frac{G(w, v_1, \ldots, v_n)}{G(v_1, \ldots, v_n)}. This completes the proof. \square
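Here is a numerical sketch of formulas (1) and (2) on random test data in \mathbb{R}^5, computing the formal determinant in (1) by cofactor expansion along its first row; all names and dimensions are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, dim = 3, 5
V = rng.standard_normal((n, dim))     # rows are v_1, ..., v_n
w = rng.standard_normal(dim)

G = np.linalg.det(V @ V.T)            # G(v_1, ..., v_n)

# M is the determinant (1) without its first row; column j of the formal
# determinant carries the vector vecs[j], with cofactor sign (-1)^j.
M = np.column_stack([V @ w, V @ V.T]) # row i: (<v_i, w>, <v_i, v_1>, ..., <v_i, v_n>)
vecs = np.vstack([w, V])              # (w, v_1, ..., v_n)
w_perp = sum(
    (-1) ** j * np.linalg.det(np.delete(M, j, axis=1)) * vecs[j]
    for j in range(n + 1)
) / G

print(np.allclose(V @ w_perp, 0))     # w_perp is orthogonal to V
coef, *_ = np.linalg.lstsq(V.T, w - w_perp, rcond=None)
print(np.allclose(V.T @ coef, w - w_perp))       # w - w_perp lies in V

G_big = np.linalg.det(vecs @ vecs.T)  # G(w, v_1, ..., v_n)
print(np.allclose(w_perp @ w_perp, G_big / G))   # formula (2)
```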

Friday, June 10, 2022

A tricky convergence problem

This is a slight simplification of Math.SE user @Mike's wonderful solution [1], rephrased in my own words.

Theorem. Suppose

  1. (a_n)_{n\geq 1} is a non-negative, non-increasing sequence,
  2. (f(n))_{n\geq 1} is a sequence of positive integers such that f(n) \to \infty, and
  3. \sum_{n\geq 1} a_{f(n)} converges.
Then \sum_{n\geq 1} \frac{a_n}{f(n)} also converges.

Proof. We partition \mathbb{N}_1 = \{1, 2, \ldots\} into two subsets A and B, where \begin{align*} A &= \{ n \in \mathbb{N}_1 : f(n) > n \}, \\ B &= \{ n \in \mathbb{N}_1 : f(n) \leq n \}. \end{align*} Then it suffices to show that both \sum_{n\in A} \frac{a_n}{f(n)} and \sum_{n\in B} \frac{a_n}{f(n)} are finite.

Sum over B. Since (a_n)_{n\in\mathbb{N}_1} is non-increasing and 1 \leq f(n) \leq n for each n \in B, we get \sum_{n\in B} \frac{a_n}{f(n)} \leq \sum_{n \in B} a_{f(n)} \lt \infty.

Sum over A. Now enumerate f(\mathbb{N}_1) as \{k_1 \lt k_2 \lt \ldots \}, and define A_i = A \cap [k_i, k_{i+1}) for each i. (The finitely many elements of A below k_1 do not affect convergence.) Then the following implications hold: \begin{align*} n \in A_i \quad\implies\quad f(n) \gt n \geq k_i \quad\implies\quad f(n) \geq k_{i+1}, \end{align*} where the last step uses the fact that f(n) is one of the k_j's. So it follows that \sum_{n \in A_i} \frac{a_n}{f(n)} \leq \sum_{n \in A_i} \frac{a_{k_i}}{k_{i+1}} \leq \frac{k_{i+1} - k_i}{k_{i+1}} a_{k_i} \leq a_{k_i}. Summing both sides over i, and noting that each k_i is attained by at least one n, \begin{align*} \sum_{n \in A} \frac{a_n}{f(n)} &\leq \sum_{i=1}^{\infty} a_{k_i} \\ &\leq \sum_{i=1}^{\infty} \sum_{n \mathbin{:} f(n) = k_i} a_{f(n)} \\ &= \sum_{n=1}^{\infty} a_{f(n)} \lt \infty. \end{align*} Therefore the desired conclusion follows. \square


References.
  [1] Mike, "If \sum_{n=1}^{\infty}{a_{[f(n)]}} converges, then \sum_{n=1}^{\infty}{\frac{a_n}{f(n)}} converges". URL (version: 2022-06-05): https://math.stackexchange.com/q/4464970

A property of an integrable function on \mathbb{R}

Theorem. Let f \in L^1(\mathbb{R}, \mathrm{d}x). Then \lim_{n\to\infty} f(nx) = 0 \quad \text{for a.e. } x.

The proof below is essentially what is shown in [1], written in my own words.

Proof. Fix \varepsilon \gt 0 and define E = \{x \in \mathbb{R} : |f(x)| \geq \varepsilon\}. Then for any [a, b] \subseteq (0, \infty), \begin{align*} \int_{a}^{b} \sum_{n=1}^{\infty} \mathbf{1}_{E}(nx) \, \mathrm{d}x &= \sum_{n=1}^{\infty} \frac{1}{n} \int_{na}^{nb} \mathbf{1}_{E}(t) \, \mathrm{d}t \\ &= \int_{E} \sum_{n=1}^{\infty} \frac{1}{n} \mathbf{1}_{\{na \leq t \leq nb\}} \, \mathrm{d}t \\ &= \int_{E} \sum_{n=1}^{\infty} \frac{1}{n} \mathbf{1}_{\{ t/b \leq n \leq t/a \}} \, \mathrm{d}t \\ &\leq \int_{E} (2 + \log(b/a)) \, \mathrm{d}t \\ &= (2 + \log(b/a)) |E|, \end{align*} where |E| denotes the Lebesgue measure of E. Since f is integrable, |E| \lt \infty. So the sum \sum_{n=1}^{\infty} \mathbf{1}_{E}(nx) is finite for a.e. x in [a, b], and hence nx \in E for only finitely many n. Since [a, b] \subseteq (0, \infty) is arbitrary, the same is true for a.e. x \gt 0; applying the argument to x \mapsto f(-x) handles a.e. x \lt 0 as well. Consequently, for a.e. x we have |f(nx)| \lt \varepsilon for all sufficiently large n. Applying this with \varepsilon = 1/m for each m \in \mathbb{N} and intersecting the resulting full-measure sets, we conclude \lim_{n\to\infty} f(nx) = 0 for a.e. x. \square


References
  • [1] Lesigne, Emmanuel (2010). "On the behavior at infinity of an integrable function". American Mathematical Monthly. 117 (2): 175–181. ⟨hal-00276738v3⟩

Tuesday, June 7, 2022

A simple but useful estimate

Theorem. Let f : \mathbb{R} \to \mathbb{R} be a continuous, integrable function of bounded variation. Then \sum_{n\in\mathbb{Z}} f(n) converges absolutely, and we have \left| \sum_{n\in \mathbb{Z}} f(n) - \int_{\mathbb{R}} f(x) \, \mathrm{d}x \right| \leq \frac{1}{2} V(f), where V(f) = \int_{\mathbb{R}} |\mathrm{d}f(x)| denotes the total variation of f over \mathbb{R}.

Proof. Let \tilde{B}(x) = x - \lfloor x \rfloor - \frac{1}{2}. Then performing integration by parts, for a \le b, \begin{align*} \int_{a}^{b} f(x) \, \mathrm{d}x - \sum_{n \in [a, b]\cap\mathbb{Z}} f(n) &= \int_{[a, b]} f(x) \, \mathrm{d}\tilde{B}(x) \\ &= [f(x)\tilde{B}(x)]_{a^-}^{b} - \int_{[a,b]} \tilde{B}(x) \, \mathrm{d}f(x). \end{align*} Using the fact that |\tilde{B}(x)| \leq \frac{1}{2}, it therefore follows that \begin{align*} \left| \int_{a}^{b} f(x) \, \mathrm{d}x - \sum_{n \in [a, b]\cap\mathbb{Z}} f(n) \right| \leq \frac{|f(a)| + |f(b)|}{2} + \frac{1}{2} \int_{[a,b]} \, |\mathrm{d}f(x)|. \end{align*} From the assumption, it is easy to check that f(x) \to 0 as |x| \to \infty. So, the above estimate shows that the partial sums \sum_{n \in [a, b]\cap\mathbb{Z}} f(n) satisfy the Cauchy criterion as a \to -\infty and b \to \infty, hence \sum_{n\in\mathbb{Z}} f(n) converges. Moreover, noting that |f| also satisfies the assumption in place of f, we find that this sum converges absolutely. Finally, passing to the limit as [a,b] \uparrow \mathbb{R} proves the desired bound. \square
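A quick numerical illustration with the test function f(x) = e^{-x^2}: here f increases from 0 to 1 and decreases back to 0, so V(f) = 2 and the theorem bounds the gap by 1, while the actual gap is on the order of 10^{-4}.

```python
import numpy as np

n = np.arange(-50, 51).astype(float)
total = np.exp(-n**2).sum()      # tails beyond |n| = 50 are negligible
integral = np.sqrt(np.pi)        # exact value of the integral of exp(-x^2) over R
print(abs(total - integral))     # ~2e-4, well within the bound V(f)/2 = 1
```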

Cauchy–Binet formula

Write [n] for the set \{1,2,\ldots,n\} and \binom{[n]}{k} for the family of all subsets of [n] of size k. Also, for an m \times n matrix X, I \subseteq [m], and J \subseteq [n], let X_{I,J} denote the submatrix obtained from X by deleting row i for each i \notin I and column j for each j \notin J.

Theorem. (Cauchy–Binet formula) Let A be a k\times n matrix and B be an n\times k matrix over a field F. Then \det(AB) = \sum_{S\in\binom{[n]}{k}} \det(A_{[k],S}B_{S,[k]}).

1st Proof. Let (e_1,\ldots, e_m) denote the standard basis of F^m. (For convenience, we will abuse the notation so that e_i stands for the ith standard basis vector in any of F^m, m \geq i.) Then \begin{align*} \det(AB)(e_1 \wedge \cdots \wedge e_k) &= \bigl( {\textstyle\bigwedge}^k AB \bigr)(e_1 \wedge \cdots \wedge e_k) \\ &= \bigl( {\textstyle\bigwedge}^k A \bigr)[(Be_1) \wedge \cdots \wedge (Be_k)]. \end{align*} Now by noting that Be_j = \sum_{i=1}^{n} B_{ij} e_i is the jth column of B, we may expand the exterior product in the last line (the terms with repeated basis vectors vanish, so only injective index maps survive) as \begin{align*} &(Be_1) \wedge \cdots \wedge (Be_k) \\ &= \sum_{\substack{\rho : [k] \to [n] \\ \rho \text{ injective} }} (B_{\rho(1),1} e_{\rho(1)}) \wedge \cdots \wedge (B_{\rho(k),k} e_{\rho(k)} ) \\ &= \sum_{\substack{\sigma : [k] \to [n] \\ \sigma \text{ increasing} }} \sum_{\substack{\tau : [k] \to [k] \\ \tau \text{ bijective}}} \left( \prod_{i=1}^{k} B_{\sigma(\tau(i)),i} \right) (e_{\sigma(\tau(1))} \wedge \cdots \wedge e_{\sigma(\tau(k))} ) \\ &= \sum_{\substack{\sigma : [k] \to [n] \\ \sigma \text{ increasing} }} \sum_{\substack{\tau : [k] \to [k] \\ \tau \text{ bijective}}} \operatorname{sgn}(\tau) \left( \prod_{i=1}^{k} B_{\sigma(\tau(i)),i} \right) (e_{\sigma(1)} \wedge \cdots \wedge e_{\sigma(k)} ) \\ &= \sum_{\substack{\sigma : [k] \to [n] \\ \sigma \text{ increasing} }} \det(B_{\sigma([k]),[k]}) (e_{\sigma(1)} \wedge \cdots \wedge e_{\sigma(k)} ). \end{align*} Plugging this back, we therefore get \begin{align*} &\det(AB)(e_1 \wedge \cdots \wedge e_k) \\ &= \sum_{\substack{\sigma : [k] \to [n] \\ \sigma \text{ increasing} }} \det(B_{\sigma([k]),[k]}) [(Ae_{\sigma(1)}) \wedge \cdots \wedge (Ae_{\sigma(k)}) ] \\ &= \sum_{\substack{\sigma : [k] \to [n] \\ \sigma \text{ increasing} }} \det(B_{\sigma([k]),[k]}) \det(A_{[k],\sigma([k])}) (e_1 \wedge \cdots \wedge e_k) \\ &= \sum_{S \in \binom{[n]}{k}} \det(A_{[k],S}) \det(B_{S,[k]}) (e_1 \wedge \cdots \wedge e_k). \end{align*} This completes the proof. \square

2nd Proof. We specialize to the case F = \mathbb{C}. Also, for each index set I, we abbreviate \wedge_{i\in I} e_i = \wedge I (with indices in increasing order). Then, using the inner product between multivectors, for which \{\wedge S : S \in \binom{[n]}{k}\} is an orthonormal basis of {\textstyle\bigwedge}^k \mathbb{C}^n, we get \begin{align*} \det(AB) &= \left\langle (\wedge^k A^*)(\wedge [k]), (\wedge^k B)(\wedge [k]) \right\rangle \\ &= \sum_{S \in \binom{[n]}{k}} \left\langle (\wedge^k A^*)(\wedge [k]), \wedge S \right\rangle \left\langle \wedge S, (\wedge^k B)(\wedge [k]) \right\rangle \\ &= \sum_{S \in \binom{[n]}{k}} \det(A_{[k],S})\det(B_{S,[k]}). \end{align*} \square
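The formula is easy to check numerically by brute force over all k-subsets (random test matrices; the sizes are arbitrary):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(4)
k, n = 3, 5
A = rng.standard_normal((k, n))
B = rng.standard_normal((n, k))

# sum over all k-subsets S of [n] of det(A[:,S]) det(B[S,:])
rhs = sum(
    np.linalg.det(A[:, S]) * np.linalg.det(B[S, :])
    for S in map(list, combinations(range(n), k))
)
print(np.allclose(np.linalg.det(A @ B), rhs))   # expected: True
```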

A similar technique shows:

Proposition. (Coefficients of the characteristic polynomial) Let A be an n\times n matrix. Then \det(zI_n + A) = \sum_{k=0}^{n} z^{n-k} \sum_{S\in\binom{[n]}{k}} \det(A_{S,S}).

Proposition. Let a_1, \ldots, a_p \in \mathbb{C} and v_1, \ldots, v_p \in \mathbb{C}^n. Then \det\left( \sum_{j=1}^{p} a_j v_j v_j^{*} \right) = \sum_{\{j_1 \lt \ldots \lt j_n \} \subseteq [p]} (a_{j_1} \cdots a_{j_n}) \left| \det( v_{j_1}, \ldots, v_{j_n} ) \right|^2.
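Both propositions admit the same kind of brute-force numerical check; in the sketch below, the matrices and vectors are random test data, and the scalars a_j are taken real for convenience (the statement allows complex a_j).

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)
n = 4
A = rng.standard_normal((n, n))
z = 0.7

# det(zI + A) = sum_k z^(n-k) * (sum of principal k x k minors); k = 0 gives z^n
lhs = np.linalg.det(z * np.eye(n) + A)
rhs = z**n + sum(
    z ** (n - k)
    * sum(np.linalg.det(A[np.ix_(S, S)])
          for S in map(list, combinations(range(n), k)))
    for k in range(1, n + 1)
)
print(np.isclose(lhs, rhs))                     # expected: True

# det(sum_j a_j v_j v_j^*) = sum over n-subsets of (prod of a_j) * |det|^2
p = 6
a = rng.standard_normal(p)
V = rng.standard_normal((p, n)) + 1j * rng.standard_normal((p, n))  # row j is v_j
M = sum(a[j] * np.outer(V[j], V[j].conj()) for j in range(p))
rhs2 = sum(
    np.prod(a[list(S)]) * abs(np.linalg.det(V[list(S)].T)) ** 2
    for S in combinations(range(p), n)
)
print(np.isclose(np.linalg.det(M).real, rhs2))  # expected: True
```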

Determinant of a Cauchy matrix

Throughout this posting, the expressions \sum_{\mathit{vars} \mathop{:} \mathit{conds}} f(\mathit{vars}) \qquad\text{and}\qquad \prod_{\mathit{vars} \mathop{:} \mathit{conds}} f(\mathit{vars}) will refer to the sum and product of f(\mathit{vars}) over all possible values of \mathit{vars} subject to the constraints \mathit{conds}, respectively.

Theorem. (Determinant of a Cauchy matrix) Let A be an n\times n matrix with entries a_{ij} of the form a_{ij} = \frac{1}{x_i + y_j}. (Think of A as a matrix over the field of rational functions in the variables x_i and y_j.) Then \det A = \frac{\prod_{i, j \mathbin{:} i \lt j} (x_i - x_j)(y_i - y_j)}{\prod_{i, j} (x_i + y_j)}.

Proof. We prove the claim by induction on n. The base case is clearly true, so it suffices to establish the inductive step.

Suppose that the claim holds for all (n-1)\times(n-1) matrices. Also, recall that if p(z) is a polynomial of degree less than n and if \alpha_1, \ldots, \alpha_n are distinct, then we have the partial fraction decomposition \frac{p(z)}{\prod_{j} (z - \alpha_j)} = \sum_{k} \frac{p(\alpha_k)}{(z - \alpha_k) \prod_{j \mathbin{:} j \neq k} (\alpha_k - \alpha_j)}. So, by regarding x_n as the variable and applying the partial fraction decomposition, \begin{align*} &\frac{\prod_{i \mathbin{:} i \lt n} (x_i - x_n)}{\prod_{j} (x_n + y_j)} \\ &= \sum_{k} \frac{\prod_{i \mathbin{:} i \lt n} (x_i + y_k)}{(x_n + y_k) \prod_{j \mathbin{:} j \neq k} (y_j - y_k)} \\ &= \sum_{k} (-1)^{n-k} \frac{\prod_{i \mathbin{:} i \lt n} (x_i + y_k)}{(x_n + y_k) \bigl[ \prod_{j \mathbin{:} j \lt k} (y_j - y_k) \bigr]\bigl[ \prod_{j \mathbin{:} k \lt j} (y_k - y_j) \bigr]} \end{align*} From this, it follows that \begin{align*} &\frac{\prod_{i, j \mathbin{:} i \lt j} (x_i - x_j)(y_i - y_j)}{\prod_{i, j} (x_i + y_j)} \\ &= \frac{\bigl[ \prod_{i, j \mathbin{:} i \lt j \lt n} (x_i - x_j) \bigr]\bigl[ \prod_{i, j \mathbin{:} i \lt j} (y_i - y_j) \bigr]}{\prod_{i, j \mathbin{:} i \lt n} (x_i + y_j)} \cdot \frac{\prod_{i \mathbin{:} i \lt n} (x_i - x_n)}{\prod_{j} (x_n + y_j)} \\ &= \sum_{k} \frac{(-1)^{n-k}}{x_n + y_k} \cdot \frac{\bigl[ \prod_{i, j \mathbin{:} i \lt j \lt n} (x_i - x_j) \bigr]\bigl[ \prod_{i, j \mathbin{:} i \lt j; i \neq k; j \neq k} (y_i - y_j) \bigr]}{\prod_{i, j \mathbin{:} i \lt n; j\neq k} (x_i + y_j)} \end{align*} However, the last line is precisely the cofactor expansion of \det A along the nth row, where the determinants of the submatrices are computed using the inductive hypothesis. Therefore the inductive step is established and the proof is done. \square
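Finally, a numerical check of the formula on random real test data:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
x = rng.uniform(0, 1, n)
y = rng.uniform(1, 2, n)     # keeps every x_i + y_j away from 0

A = 1.0 / (x[:, None] + y[None, :])
num = np.prod([(x[i] - x[j]) * (y[i] - y[j])
               for i in range(n) for j in range(i + 1, n)])
den = np.prod(x[:, None] + y[None, :])
print(np.allclose(np.linalg.det(A), num / den))   # expected: True
```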

Sunday, January 2, 2022

Dominated Convergence Theorem Revisited

Theorem. Let (X, \mathcal{F}, \mu) be a measure space, and let f_n : X \to [0, \infty] be a sequence of measurable functions converging almost everywhere to a measurable function f in such a way that f_n \leq f for all n. Then \lim_{n\to\infty} \int_{X} f_n \, \mathrm{d}\mu = \int_{X} f \, \mathrm{d}\mu.

Note that f need not be integrable and (f_n) need not be monotone increasing.

Proof. By Fatou's lemma, \int_{X} f \, \mathrm{d}\mu = \int_{X} \varliminf_{n\to\infty} f_n \, \mathrm{d}\mu \leq \varliminf_{n\to\infty} \int_{X} f_n \, \mathrm{d}\mu \leq \varlimsup_{n\to\infty} \int_{X} f_n \, \mathrm{d}\mu \leq \int_{X} f \, \mathrm{d}\mu, where the first equality uses f_n \to f a.e. and the last inequality uses the assumption f_n \leq f. \square