How to Apply the Central Limit Theorem to Constrained Data | by Ryan Burn | Dec, 2024



What can we say about the mean of data distributed in an interval [a, b]?

Let’s imagine that we’re measuring the approval rating of an unpopular politician. Suppose we sample ten polls and get the values shown in Table 1.

How can we construct a posterior distribution for our belief in the politician’s mean approval rating?

Let’s assume that the polls are independent and identically distributed random variables, X_1, …, X_n, with mean μ and variance σ². The central limit theorem tells us that the sample mean, (X_1 + … + X_n)/n, is asymptotically normal: for large n, its distribution is approximately normal with mean μ and variance σ²/n.

Figure 1: Plots of a normalized histogram of sample approval means for our unpopular politician together with the normal distribution approximation for n=1, n=3, n=5, n=7, n=10, and n=20. We can see that by n=10, the sample mean distribution is quite close to its normal approximation. Figure by author.
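
To get a feel for this kind of convergence, here is a minimal simulation sketch. It assumes, purely for illustration, that individual approval values follow a Beta(2, 8) distribution on [0, 1]; the constants `ALPHA` and `BETA` and the helper `max_cdf_gap` are hypothetical names, not taken from the article. For each n it measures the largest gap between the empirical CDF of simulated sample means and the CDF of the normal approximation with mean μ and variance σ²/n.

```python
import math
import random
from statistics import NormalDist, fmean

# Assumed approval distribution on [0, 1]: Beta(2, 8), chosen only for illustration.
ALPHA, BETA = 2.0, 8.0
MU = ALPHA / (ALPHA + BETA)
SIGMA2 = ALPHA * BETA / ((ALPHA + BETA) ** 2 * (ALPHA + BETA + 1))


def max_cdf_gap(n, trials=10000, seed=0):
    """Largest gap between the empirical CDF of sample means and N(MU, SIGMA2 / n)."""
    rng = random.Random(seed)
    means = sorted(
        fmean(rng.betavariate(ALPHA, BETA) for _ in range(n))
        for _ in range(trials)
    )
    approx = NormalDist(MU, math.sqrt(SIGMA2 / n))
    gap = 0.0
    for i, m in enumerate(means):
        # The empirical CDF jumps from i / trials to (i + 1) / trials at m.
        f = approx.cdf(m)
        gap = max(gap, abs(f - i / trials), abs(f - (i + 1) / trials))
    return gap


if __name__ == "__main__":
    for n in (1, 3, 5, 7, 10, 20):
        print(f"n={n:2d}  max CDF gap = {max_cdf_gap(n):.3f}")
```

The gap should shrink as n grows. At n=1 the normal approximation places some probability below 0, which data confined to [0, 1] can never produce.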

Motivated by this asymptotic limit, let’s approximate the likelihood of observed data y with

Using the objective prior

(more on this later) and integrating out σ² gives us a t distribution for the posterior, π(μ|y)

where

Let’s look at the posterior distribution for the data in Table 1.
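
As a rough sketch of this step, the snippet below assumes the normal likelihood is paired with the standard noninformative prior π(μ, σ²) ∝ 1/σ². This is an assumption, and the prior intended in the article may differ. Under it, the marginal posterior for μ is the textbook Student-t distribution with n − 1 degrees of freedom, location ȳ (the sample mean), and scale s/√n, where s is the sample standard deviation. The `polls` values are made-up placeholders, not the Table 1 data, and `t_pdf` and `posterior_for_mean` are illustrative names.

```python
import math
import statistics

# Placeholder poll results in [0, 1]; NOT the article's Table 1 values.
polls = [0.12, 0.08, 0.15, 0.05, 0.10, 0.20, 0.03, 0.09, 0.11, 0.07]


def t_pdf(mu, loc, scale, df):
    """Density of a location-scale Student-t distribution at mu."""
    z = (mu - loc) / scale
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi) - math.log(scale))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(z * z / df))


def posterior_for_mean(y, level=0.95, grid_size=20001, half_width=12.0):
    """Summarise the assumed t posterior for mu with an equal-tailed interval."""
    n = len(y)
    loc = statistics.fmean(y)
    scale = statistics.stdev(y) / math.sqrt(n)  # stdev uses the n - 1 divisor
    df = n - 1

    # Tabulate the density on a grid wide enough to hold almost all the mass.
    start = loc - half_width * scale
    step = 2 * half_width * scale / (grid_size - 1)
    grid = [start + i * step for i in range(grid_size)]
    pdf = [t_pdf(m, loc, scale, df) for m in grid]

    # Trapezoid-rule CDF, renormalised to absorb the tiny truncated tails.
    cdf = [0.0]
    for i in range(1, grid_size):
        cdf.append(cdf[-1] + 0.5 * (pdf[i - 1] + pdf[i]) * step)
    cdf = [c / cdf[-1] for c in cdf]

    tail = (1 - level) / 2
    lower = next(m for m, c in zip(grid, cdf) if c >= tail)
    upper = next(m for m, c in zip(grid, cdf) if c >= 1 - tail)
    return {"loc": loc, "scale": scale, "df": df, "interval": (lower, upper)}


if __name__ == "__main__":
    print(posterior_for_mean(polls))
```

The grid-based CDF keeps the example within the standard library. If SciPy is available, `scipy.stats.t(df, loc, scale)` would give the same interval more directly.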
