Tuesday, July 26, 2022

Product of two beta distributions

Theorem. Let $p, q, r > 0$. If $X \sim \operatorname{Beta}(p, q)$ and $Y \sim \operatorname{Beta}(p+q, r)$, then $$ XY \sim \operatorname{Beta}(p, q+r). $$

Proof. Let $T_p, T_q, T_r$ be independent random variables such that $T_{\alpha} \sim \operatorname{Gamma}(\alpha)$ for each $\alpha \in \{p, q, r\}$. Then it is well-known that $$ \frac{T_p}{T_p + T_q} \sim \operatorname{Beta}(p, q) \quad\text{and}\quad T_p + T_q \sim \operatorname{Gamma}(p+q), $$ and moreover, these two random variables are independent. Applying this fact twice, first to the pair $(T_p, T_q)$ and then to the pair $(T_p + T_q, T_r)$, we may realize the joint distribution of $X$ and $Y$ via $$ (X, Y) \stackrel{\text{d}}= \left( \frac{T_p}{T_p + T_q}, \frac{T_p + T_q}{T_p + T_q + T_r} \right), $$ where the two coordinates are indeed independent because $T_p/(T_p+T_q)$ is independent of $(T_p + T_q, T_r)$. Therefore, since $T_q + T_r \sim \operatorname{Gamma}(q+r)$ is independent of $T_p$, we conclude $$ XY \stackrel{\text{d}}= \frac{T_p}{T_p + T_q + T_r} \sim \operatorname{Beta}(p, q+r) $$ as desired. $\square$
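For readers who like to double-check such identities numerically, here is a minimal Monte Carlo sanity check in Python; the parameter values $p, q, r$ below are arbitrary illustrative choices, not part of the proof.

```python
# Monte Carlo sanity check of the theorem; p, q, r are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
p, q, r, n = 2.5, 1.0, 3.0, 10**6

X = rng.beta(p, q, n)        # X ~ Beta(p, q)
Y = rng.beta(p + q, r, n)    # Y ~ Beta(p+q, r), independent of X

# Compare empirical moments of XY with the exact moments of Beta(p, q+r),
# namely E[B^k] = prod_{j<k} (p+j)/(p+q+r+j).
for k in (1, 2, 3):
    exact = np.prod([(p + j) / (p + q + r + j) for j in range(k)])
    print(k, np.mean((X * Y) ** k), exact)
```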

Corollary. Let $p, q > 0$. If $(X_n)_{n\geq 0}$ is a sequence of i.i.d. $\operatorname{Beta}(p, q)$ variables, then $$ \sum_{n=0}^{\infty} (-1)^n X_0 X_1 \cdots X_n \sim \operatorname{Beta}(p, p+q). $$ This is Problem 6524 of American Mathematical Monthly, Vol.93, No.7, Aug.–Sep. 1986.

Proof. Let $S$ denote the sum. By the alternating series estimation theorem, $0 \leq S \leq 1$ holds almost surely. Moreover, since $\sum_{n=0}^{\infty} (-1)^n X_1 \cdots X_{n+1}$ is independent of $X_0$ and has the same distribution as $S$, we obtain the distributional equation $$ S = X_0 \left(1 - \sum_{n=0}^{\infty} (-1)^n X_1 \cdots X_{n+1} \right) \stackrel{\text{d}}= X (1 - S), $$ where $X \sim \operatorname{Beta}(p, q)$ is independent of $S$ on the right-hand side. This equation uniquely determines all the moments of $S$: expanding $\mathbb{E}[S^m] = \mathbb{E}[X^m] \, \mathbb{E}[(1-S)^m]$ by the binomial theorem and noting $0 \lt \mathbb{E}[X^m] \lt 1$, we may solve for $\mathbb{E}[S^m]$ in terms of the lower-order moments. Since $S$ is supported on $[0, 1]$, the uniqueness of the Hausdorff moment problem then shows that the equation has a unique distributional solution.

Moreover, if $S' \sim \operatorname{Beta}(p, p+q)$ is independent of $X$, then $1-S' \sim \operatorname{Beta}(p+q, p)$. So by the theorem (applied with $r = p$), $$ X(1-S') \sim \operatorname{Beta}(p, p+q) \stackrel{\text{d}}= S'.$$ This shows that $S'$ is a distributional solution of the equation, hence by the uniqueness, $S \stackrel{\text{d}}= S'$. $\square$
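The corollary can likewise be checked by simulation. Below is a quick Python sketch that truncates the alternating series at $N$ terms and compares empirical moments with the exact moments of $\operatorname{Beta}(p, p+q)$; the values of $p$, $q$, and $N$ are arbitrary choices for illustration.

```python
# Simulation of the corollary; p, q and the truncation depth N are
# arbitrary choices.  The alternating-series bound gives a truncation
# error of at most X_0 ... X_N, whose mean is (p/(p+q))^(N+1).
import numpy as np

rng = np.random.default_rng(1)
p, q, n, N = 2.0, 3.0, 10**5, 60

X = rng.beta(p, q, (N, n))              # row k holds samples of X_k
prods = np.cumprod(X, axis=0)           # X_0 X_1 ... X_k
signs = (-1.0) ** np.arange(N)[:, None]
S = np.sum(signs * prods, axis=0)       # truncated alternating series

# Exact moments of Beta(p, p+q): E[B^k] = prod_{j<k} (p+j)/(2p+q+j).
for k in (1, 2, 3):
    exact = np.prod([(p + j) / (2 * p + q + j) for j in range(k)])
    print(k, np.mean(S ** k), exact)
```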

Thursday, July 21, 2022

A simple calculus question

Consider the problem of finding an antiderivative $$ \int \frac{1}{a + b \cos x} \, \mathrm{d}x, $$ where $|b| \lt a$. This is often done via the Weierstrass substitution $t = \tan(x/2)$. The upshot of this substitution is \begin{align*} \int \frac{1}{a + b \cos x} \, \mathrm{d}x &= \int \frac{1}{a + b\left(\frac{1-t^2}{1+t^2}\right)} \, \frac{2\,\mathrm{d}t}{1+t^2} \\ &= \int \frac{2}{(a+b) + (a-b)t^2} \, \mathrm{d}t \\ &= \frac{2}{\sqrt{a^2-b^2}} \arctan\left(\sqrt{\frac{a-b}{a+b}} \tan \frac{x}{2} \right) + \mathsf{C}. \tag{1} \end{align*} However, there is one caveat in this result. Suppose we want to compute the definite integral $$ \int_{0}^{2\pi} \frac{1}{a + b \cos x} \, \mathrm{d}x. $$ Can we utilize the antiderivative $\text{(1)}$ to find its value? Well, this is not possible, at least directly. This is because the substitution $t = \tan(x/2)$ is undefined at every point of the form $$ x = (2k+1)\pi, \qquad k \in \mathbb{Z}. $$ Consequently, the formula $\text{(1)}$ also suffers from discontinuity at those points. The usual trick to avoid this difficulty is to choose another $2\pi$-interval on which $\tan(x/2)$ is continuous. One such choice is the open interval $(-\pi, \pi)$. So, \begin{align*} \int_{0}^{2\pi} \frac{1}{a + b \cos x} \, \mathrm{d}x &= \int_{-\pi}^{\pi} \frac{1}{a + b \cos x} \, \mathrm{d}x \\ &= \lim_{\varepsilon \to 0^+} \int_{-\pi+\varepsilon}^{\pi-\varepsilon} \frac{1}{a + b \cos x} \, \mathrm{d}x \\ &= \lim_{\varepsilon \to 0^+} \left[ \frac{2}{\sqrt{a^2-b^2}} \arctan\left(\sqrt{\frac{a-b}{a+b}} \tan \frac{x}{2} \right) \right]_{-\pi+\varepsilon}^{\pi-\varepsilon} \\ &= \frac{2\pi}{\sqrt{a^2-b^2}}. \end{align*}
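To see the pitfall concretely, here is a short numerical experiment using SciPy; the values of $a$ and $b$ are arbitrary choices with $|b| \lt a$. Naive evaluation of $\text{(1)}$ at the endpoints $0$ and $2\pi$ returns $0$, while direct quadrature produces the correct value.

```python
# Illustration of the pitfall; a, b are arbitrary values with |b| < a.
import numpy as np
from scipy.integrate import quad

a, b = 3.0, 2.0

# The antiderivative (1).  Evaluating it naively at 0 and 2*pi gives ~0,
# because (1) jumps at x = pi.
def F(x):
    return 2 / np.sqrt(a**2 - b**2) * np.arctan(
        np.sqrt((a - b) / (a + b)) * np.tan(x / 2))

print(F(2 * np.pi) - F(0))   # ~0, not the correct value

# Direct quadrature recovers 2*pi/sqrt(a^2 - b^2).
val, _ = quad(lambda x: 1 / (a + b * np.cos(x)), 0, 2 * np.pi)
print(val, 2 * np.pi / np.sqrt(a**2 - b**2))
```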

So, is this the end of the story? In fact, it turns out that we can find an antiderivative that is valid on all of $\mathbb{R}$:

Theorem. Let $a, b \in \mathbb{R}$ satisfy $|b| \lt a$. Then \begin{align*} & \int \frac{1}{a + b\cos x} \, \mathrm{d}x \\ &= \frac{1}{\sqrt{a^2 - b^2}} \biggl[ x - 2 \arctan \biggl( \frac{b \sin x}{a + \sqrt{a^2-b^2} + b \cos x} \biggr) \biggr] + \mathsf{C}. \end{align*} This formula is valid on $\mathbb{R}$.

Proof. The Fourier series of the integrand is $$ \frac{1}{a + b\cos x} = \frac{1}{\sqrt{a^2 - b^2}} \left( 1 + 2 \sum_{n=1}^{\infty} r^n \cos(nx) \right), $$ where $r$ is given by $$ r = -\frac{b}{a + \sqrt{a^2 - b^2}} \in (-1, 1). $$ Integrating both sides termwise, which is justified by the uniform convergence of the series (note $|r| \lt 1$), we conclude \begin{align*} & \int \frac{1}{a + b\cos x} \, \mathrm{d}x \\ &= \frac{1}{\sqrt{a^2 - b^2}} \left( x + 2 \sum_{n=1}^{\infty} \frac{r^n}{n} \sin(nx) \right) + \mathsf{C} \\ &= \frac{1}{\sqrt{a^2 - b^2}} \left[ x - 2 \operatorname{Im}\left( \log(1 - re^{ix}) \right) \right] + \mathsf{C} \\ &= \frac{1}{\sqrt{a^2 - b^2}} \left[ x + 2 \arctan \left( \frac{r \sin x}{1 - r \cos x} \right) \right] + \mathsf{C}. \end{align*} Since $|r| \lt 1$, we have $1 - r\cos x \gt 0$ for all $x$, so the last expression is continuous on all of $\mathbb{R}$. Plugging the value of $r$ into this, we are done. $\square$
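As a sanity check, one can compare a numerical derivative of the closed form with the integrand, including at the points $x = (2k+1)\pi$ where the formula $\text{(1)}$ breaks down. Again, $a$ and $b$ are arbitrary choices with $|b| \lt a$.

```python
# Numerical check of the theorem; a, b are arbitrary with |b| < a.
import numpy as np

a, b = 3.0, 2.0
s = np.sqrt(a**2 - b**2)

def F(x):
    return (x - 2 * np.arctan(b * np.sin(x) / (a + s + b * np.cos(x)))) / s

# Central differences of F versus the integrand, sampled on a grid that
# includes the problematic points x = pi and x = 3*pi.
x = np.linspace(0.0, 4 * np.pi, 9)
h = 1e-6
dF = (F(x + h) - F(x - h)) / (2 * h)
print(np.max(np.abs(dF - 1 / (a + b * np.cos(x)))))   # tiny, ~1e-10
```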

Tuesday, July 19, 2022

A nasty integral

In this post, we discuss how certain integrals of rational functions can be computed.

Example 1. $$ \int_{0}^{\infty} \frac{x^{8} - 4x^{6} + 9x^{4} - 5x^{2} + 1}{x^{12} - 10x^{10} + 37x^{8} - 42x^{6} + 26x^{4} - 8x^{2} + 1} \, \mathrm{d}x = \frac{\pi}{2}. $$ This is Problem 11148 of American Mathematical Monthly, Vol.112, April 2005.

Proof. For each $f(z) \in \mathbb{C}[z]$, we define $f^*(z) = \overline{f(\bar{z})}$. This amounts to conjugating the coefficients of $f(z)$. Let \begin{align*} P(z) &= z^8-4 z^6+9 z^4-5 z^2+1, \\ Q(z) &= z^{12}-10 z^{10}+37 z^8-42 z^6+26 z^4-8 z^2+1. \end{align*} The key observation is that all the zeros of the polynomial $$ q(z) = z^3-(2+i) z^2-(1-i) z+1 $$ lie in the upper half-plane $\mathbb{H} = \{ z \in \mathbb{C} : \operatorname{Im}(z) \gt 0\}$, and $$ Q(z) = q(z)q^*(z)q(-z)q^*(-z) $$ is a factorization of $Q(z)$ into coprime factors. Now let $p(z)$ be the unique polynomial such that $\deg p \lt \deg q$ and $$ P(z) \equiv p(z) q^*(z)q(-z)q^*(-z) \pmod{q(z)}. $$ Such $p(z)$ exists because $q(z)$ and $q^*(z)q(-z)q^*(-z)$ are coprime, and it can be computed using the extended Euclidean algorithm. After a tedious computation, it follows that $$ p(z) = \frac{-i z^2-(1-2 i) z+1}{4}. $$ Then by the symmetry and the uniqueness of partial fraction decomposition, it follows that $$ \frac{P(z)}{Q(z)} = \frac{p(z)}{q(z)} + \frac{p^*(z)}{q^*(z)} + \frac{p(-z)}{q(-z)} + \frac{p^*(-z)}{q^*(-z)}. $$ Moreover, using the fact that all the zeros of $q(z)$ lie in $\mathbb{H}$, we can invoke Cauchy's integral theorem to write \begin{align*} \operatorname{PV}\!\! \int_{-\infty}^{\infty} \frac{x^n}{q(x)} \, \mathrm{d}x &= \lim_{R\to\infty} \int_{-R}^{R} \frac{x^n}{q(x)} \, \mathrm{d}x \\ &= \lim_{R\to\infty} \int_{-\pi}^{0} \frac{i (Re^{i\theta})^{n+1}}{q(Re^{i\theta})} \, \mathrm{d}\theta \\ &= \begin{cases} 0, & n \lt \deg q - 1, \\ \pi i, & n = \deg q - 1. \end{cases} \end{align*} Therefore we conclude that $$ \int_{-\infty}^{\infty} \frac{P(x)}{Q(x)} \, \mathrm{d}x = 4 \operatorname{Re}\left[ \operatorname{PV}\!\! \int_{-\infty}^{\infty} \frac{p(x)}{q(x)} \, \mathrm{d}x \right] = 4 \operatorname{Re}\left[ \pi i \, [z^2] p(z) \right] = \pi. $$ Since the integrand is even, the answer is half of the above integral, hence the claim follows. $\square$
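The value of the integral can also be confirmed numerically; here is a minimal SciPy check (purely a sanity check, not part of the proof).

```python
# Numerical confirmation of Example 1 with SciPy.
import numpy as np
from scipy.integrate import quad

P = np.poly1d([1, 0, -4, 0, 9, 0, -5, 0, 1])
Q = np.poly1d([1, 0, -10, 0, 37, 0, -42, 0, 26, 0, -8, 0, 1])

val, _ = quad(lambda x: P(x) / Q(x), 0, np.inf, limit=200)
print(val, np.pi / 2)   # both ~1.5707963
```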

Here is another example.

Example 2. $$ \int_{0}^{\infty} \frac{x^{14} - 15x^{12} + 82x^{10} - 190x^{8} + 184x^{6} - 60x^{4} + 16x^{2}}{x^{16} - 20x^{14} + 156x^{12} - 616x^{10} + 1388x^{8} - 1792x^{6} + 1152x^{4} - 224x^{2} + 16} \, \mathrm{d}x = \frac{\pi}{2}. $$

Proof. Apply the same argument as above with the choices $$ q(z) = z^4+(3-i) z^3-(1+3 i) z^2-(6+2 i) z-2 $$ and $$ p(z) = \frac{-i z^3-3 i z^2-2 i z}{4}. $$ We assure the reader that all the zeros of $q(z)$ lie in the upper half-plane, and that the denominator of the integrand admits the same form of factorization into coprime factors as before. $\square$
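The assertion about the zeros is easy to check numerically, for instance by inspecting the roots of both choices of $q(z)$ with NumPy; per the claims above, every imaginary part should come out positive.

```python
# Numerically checking that all zeros of q(z) lie in the upper half-plane,
# for the polynomials used in Examples 1 and 2.
import numpy as np

q1 = [1, -(2 + 1j), -(1 - 1j), 1]              # Example 1
q2 = [1, 3 - 1j, -(1 + 3j), -(6 + 2j), -2]     # Example 2

for q in (q1, q2):
    print(np.roots(q).imag)    # all entries should be positive
```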

Sunday, July 10, 2022

Orthogonal decomposition using Gram determinants

Theorem. Let $\mathcal{H}$ be a Hilbert space equipped with the inner product $\langle \cdot, \cdot \rangle$, and let $w, v_1, \ldots, v_n \in \mathcal{H}$ with $v_1, \ldots, v_n$ linearly independent. If $$ G(u_1, \ldots, u_n) = \det [\langle u_i, u_j \rangle]_{i,j=1}^{n} $$ denotes the Gram determinant, then the following hold:

  1. The vector $w_{\perp}$ defined by the (formal) determinant $$ w_{\perp} = \frac{1}{G(v_1,\ldots,v_n)}\begin{vmatrix} w & v_1 & \cdots & v_n \\ \langle v_1, w \rangle & \langle v_1, v_1 \rangle & \cdots & \langle v_1, v_n \rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle v_n, w \rangle & \langle v_n, v_1 \rangle & \cdots & \langle v_n, v_n \rangle \end{vmatrix} \tag{1} $$ is orthogonal to $V = \operatorname{span}\{v_1, \ldots, v_n\}$, and in fact, $$ w = w_{\perp} + (w - w_{\perp}) $$ is the orthogonal decomposition of $w$ (that is, $w - w_{\perp}$ is the orthogonal projection of $w$ onto $V$).
  2. The squared distance between $w$ and $V$ is given by $$ \|w_{\perp}\|^2 = \operatorname{dist}(w, V)^2 = \frac{G(w, v_1, \ldots, v_n)}{G(v_1, \ldots, v_n)}. \tag{2} $$

Proof. Extend the notation by letting $$ G(\{u_i\}_{i=1}^{n}, \{u'_i\}_{i=1}^{n}) = \det [\langle u_i, u'_j \rangle]_{i,j=1}^{n}, $$ and then define the linear functional $\ell$ on $\mathcal{H}$ by $$ \ell(u) = G(\{u, v_1, \ldots, v_n\}, \{w, v_1, \ldots, v_n\}). $$ By the Riesz representation theorem, there exists $h \in \mathcal{H}$ such that $\ell(u) = \langle h, u \rangle$ for all $u \in \mathcal{H}$. Since $\ell(v_i) = 0$ for all $i = 1, \ldots, n$ (the defining determinant then has two identical rows), it follows that $h$ is orthogonal to $V$. Moreover, expanding the determinant defining $\ell(u)$ along the first row, we see that \begin{align*} \ell(u) &= G(v_1, \ldots, v_n) \langle u, w \rangle \\ &\quad + \sum_{i=1}^{n} (-1)^i G(\{w, v_1, \ldots, \widehat{v_i}, \ldots, v_n\}, \{v_1, \ldots, v_n\}) \langle u, v_i \rangle, \end{align*} and so, \begin{align*} h &= G(v_1, \ldots, v_n) w \\ &\quad + \sum_{i=1}^{n} (-1)^i G(\{w, v_1, \ldots, \widehat{v_i}, \ldots, v_n\}, \{v_1, \ldots, v_n\}) v_i, \end{align*} which is precisely the formal determinant in $\text{(1)}$. This also implies that $h$ is a linear combination of $w, v_1, \ldots, v_n$ with the coefficient of $w$ given by $G(v_1, \ldots, v_n)$, hence $$ w_{\perp} = \frac{h}{G(v_1, \ldots, v_n)} \in w + V. $$ This and $w_{\perp} \perp V$ together prove the first item of the theorem. For the second item, the equality $\|w_{\perp}\| = \operatorname{dist}(w, V)$ is obvious by the orthogonality. Moreover, since $w - w_{\perp} \in V$, $$ \|w_{\perp}\|^2 = \langle w_{\perp}, w_{\perp} \rangle = \langle w_{\perp}, w \rangle = \frac{\ell(w)}{G(v_1, \ldots, v_n)} = \frac{G(w, v_1, \ldots, v_n)}{G(v_1, \ldots, v_n)}. $$ This completes the proof. $\square$
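As an illustration, the following NumPy snippet compares formula $\text{(2)}$ with a least-squares computation of the projection; the dimension and the random vectors are an arbitrary test case with the standard dot product on $\mathbb{R}^5$.

```python
# Numerical illustration in R^5 with the standard dot product; the
# dimensions and random vectors are an arbitrary test case.
import numpy as np

rng = np.random.default_rng(2)
V = rng.normal(size=(3, 5))    # rows are v_1, v_2, v_3
w = rng.normal(size=5)

def gram(*vecs):
    """Gram determinant det[<u_i, u_j>] of the given vectors."""
    A = np.array(vecs)
    return np.linalg.det(A @ A.T)

# Squared distance via formula (2) ...
d2 = gram(w, *V) / gram(*V)

# ... versus the least-squares projection of w onto span{v_1, v_2, v_3}.
coef, *_ = np.linalg.lstsq(V.T, w, rcond=None)
w_perp = w - V.T @ coef
print(d2, w_perp @ w_perp)         # the two agree
print(np.abs(V @ w_perp).max())    # w_perp is orthogonal to every v_i (~0)
```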