Bayesian updating with new data

17

How do we go about calculating a posterior with a prior N ~ (a, b) after observing n data points? I assume that we have to calculate the sample mean and variance of the data points and do some sort of calculation that combines the posterior with the prior, but I'm not quite sure what the combination formula looks like.

bayesian normal-distribution conjugate-prior

— statstudent

23

The basic idea of Bayesian updating is that, given some data $X$ and a prior over the parameter of interest $\theta$, where the relation between the data and the parameter is described by a likelihood function, you use Bayes' theorem to obtain the posterior

$p(\theta \mid X) \propto p(X \mid \theta) \, p(\theta)$

This can be done sequentially: after seeing the first data point $x_1$, the prior $\theta$ is updated to the posterior $\theta'$. Next you can take the second data point $x_2$ and use the posterior obtained before, $\theta'$, as your prior to update it once again, and so on.

Let me give you an example. Imagine that you want to estimate the mean $\mu$ of a normal distribution and that $\sigma^2$ is known to you. In such a case we can use the normal-normal model. We assume a normal prior for $\mu$ with hyperparameters $\mu_0, \sigma_0^2$:

$\begin{aligned} X\mid\mu &\sim \mathrm{Normal}(\mu,\ \sigma^2) \\ \mu &\sim \mathrm{Normal}(\mu_0,\ \sigma_0^2) \end{aligned}$

Because the normal distribution is a conjugate prior for the mean $\mu$ of a normal distribution, the prior can be updated in closed form after observing a data point $x$:

$\begin{aligned} E(\mu' \mid x) &= \frac{\sigma^2\mu_0 + \sigma^2_0 x}{\sigma^2 + \sigma^2_0} \\[7pt] \mathrm{Var}(\mu' \mid x) &= \frac{\sigma^2 \sigma^2_0}{\sigma^2 + \sigma^2_0} \end{aligned}$
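
As a rough sketch of how this update can be applied one data point at a time, the R snippet below wraps the one-point formula in a helper and compares the result with the equivalent all-at-once form, in which the posterior variance is $1/(1/\sigma_0^2 + n/\sigma^2)$ and the posterior mean is that variance times $(\mu_0/\sigma_0^2 + \sum_i x_i/\sigma^2)$. The helper name `update_normal`, the data values, the prior Normal(5, 4) (read as mean 5 and variance 4) and the known variance `sigma2 = 1` are all assumptions made only for illustration.

```r
# One conjugate update of a Normal(prior_mean, prior_var) prior on mu,
# given a single observation x with known data variance sigma2.
# (update_normal is a hypothetical helper, not part of any package.)
update_normal <- function(prior_mean, prior_var, x, sigma2) {
  post_mean <- (sigma2 * prior_mean + prior_var * x) / (sigma2 + prior_var)
  post_var  <- (sigma2 * prior_var) / (sigma2 + prior_var)
  list(mean = post_mean, var = post_var)
}

# Illustrative inputs (assumptions, not taken from a real data set).
x_obs  <- c(8, 9, 10, 8, 7)
sigma2 <- 1                        # assumed known data variance
state  <- list(mean = 5, var = 4)  # prior: mu ~ Normal(5, 4)

# Sequential updating: each posterior becomes the prior for the next point.
for (xi in x_obs) {
  state <- update_normal(state$mean, state$var, xi, sigma2)
}

# Batch update using all n points at once (precision-weighted form).
n          <- length(x_obs)
batch_var  <- 1 / (1 / 4 + n / sigma2)
batch_mean <- batch_var * (5 / 4 + sum(x_obs) / sigma2)

print(c(sequential = state$mean, batch = batch_mean))
print(c(sequential = state$var,  batch = batch_var))
```

Both routes should agree up to floating-point error, which is the sense in which the order of the updates does not matter in this model.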

Unfortunately, such simple closed-form solutions are not available for more sophisticated problems and you have to rely on optimization algorithms (for point estimates using maximum a posteriori approach), or MCMC simulation.
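
To give a feel for the optimization route, here is a minimal sketch of a maximum a posteriori estimate computed numerically with base R's `optimize()`. It is run on the same normal-normal model only because the closed-form answer is available for comparison. The seed, sample size, prior variance `tau2` and search interval are assumptions chosen for illustration.

```r
# Simulated data; seed, sample size and parameter values are illustrative.
set.seed(1)
sigma2 <- 2.7^2                      # assumed known data variance
x      <- rnorm(50, mean = 1.4, sd = sqrt(sigma2))
mu0    <- 0                          # prior mean
tau2   <- 100                        # prior variance (sigma_0^2 in the text)

# Negative log of likelihood times prior, i.e. the negative log-posterior
# up to an additive constant.
neg_log_post <- function(mu) {
  -sum(dnorm(x, mean = mu, sd = sqrt(sigma2), log = TRUE)) -
    dnorm(mu, mean = mu0, sd = sqrt(tau2), log = TRUE)
}

# One-dimensional numerical minimisation over a bracketing interval.
fit    <- optimize(neg_log_post, interval = c(-10, 10), tol = 1e-10)
mu_map <- fit$minimum

# Closed-form posterior mean for comparison (equals the mode for a normal).
post_var  <- 1 / (1 / tau2 + length(x) / sigma2)
post_mean <- post_var * (mu0 / tau2 + sum(x) / sigma2)

print(c(map = mu_map, closed_form = post_mean))
```

For a model without a conjugate prior, only `neg_log_post` would change; the comparison with a closed form would no longer be possible.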

Below is an example with simulated data:

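n <- 1000
set.seed(123)
x     <- rnorm(n, 1.4, 2.7)   # simulated data: true mean 1.4, known sd 2.7
mu    <- numeric(n)           # posterior mean after each observation
sigma <- numeric(n)           # posterior variance (not sd) after each observation

# First update, starting from the prior Normal(0, 10000)
mu[1]    <- (10000*x[1] + (2.7^2)*0)/(10000+2.7^2)
sigma[1] <- (10000*2.7^2)/(10000+2.7^2)
# Every later update uses the previous posterior as its prior
for (i in 2:n) {
  mu[i]    <- ( sigma[i-1]*x[i] + (2.7^2)*mu[i-1] )/(sigma[i-1]+2.7^2)
  sigma[i] <- ( sigma[i-1]*2.7^2                  )/(sigma[i-1]+2.7^2)
}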

If you plot the results, you'll see how the posterior mean approaches the value being estimated (its true value is marked by a red line) as new data accumulate.
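
One possible way to draw that plot is sketched below. It is self-contained and recomputes the posterior after each observation with cumulative sums rather than a loop, which should give the same values for this model. The variable names `tau2`, `post_mean` and `post_var`, the axis labels and the plotting choices are assumptions, not the original author's code.

```r
set.seed(123)
n      <- 1000
sigma2 <- 2.7^2                  # known data variance
x      <- rnorm(n, 1.4, 2.7)     # simulated data with true mean 1.4
mu0    <- 0                      # prior mean
tau2   <- 10000                  # prior variance

# Posterior after the first k observations, for k = 1, ..., n.
k         <- seq_len(n)
post_var  <- 1 / (1 / tau2 + k / sigma2)
post_mean <- post_var * (mu0 / tau2 + cumsum(x) / sigma2)

# Posterior mean against the number of observations seen so far.
plot(k, post_mean, type = "l",
     xlab = "number of observations", ylab = "posterior mean of mu")
abline(h = 1.4, col = "red")     # true value used in the simulation
```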

To learn more, see those slides and the paper Conjugate Bayesian analysis of the Gaussian distribution by Kevin P. Murphy. See also Do Bayesian priors become irrelevant with large sample size? Those notes and this blog entry also give an accessible step-by-step introduction to Bayesian inference.

— Tim

Thank you, this is very helpful. How would we go about solving this simple example (unknown variance, unlike your example)? Suppose we have a prior distribution of N~(5, 4) and then we observe 5 data points (8, 9, 10, 8, 7). What would be the posterior after these observations? Thank you in advance. Much appreciated.

— statstudent

@Kelly you can find examples for the cases where the variance is unknown and the mean is known, or where both are unknown, in the Wikipedia entry on conjugate priors and in the links I provided at the end of my answer. If both the mean and the variance are unknown, it becomes slightly more complicated.

— Tim

@Kelly btw, you can check here for an example of estimating both $\mu$ and $\sigma^2$.

— Tim

4

If you have a prior $P(\theta)$ and a likelihood function $P(x \mid \theta)$ you can calculate the posterior with:

$P(\theta \mid x) = \frac{P(x \mid \theta) P(\theta)}{P(x)}$

Since $P(x) = \sum_\theta P(x \mid \theta) P(\theta)$ is just a normalization constant that makes the probabilities sum to one, you could write:

$P(\theta \mid x) \propto P(x \mid \theta)P(\theta)$

where $\propto$ means "is proportional to."
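
As a small numerical illustration of "multiply prior by likelihood, then normalize", the sketch below evaluates both on a grid of candidate values for a normal mean. The data, the grid, the prior Normal(5, sd = 2) and the known standard deviation of 1 are assumptions picked only for the example.

```r
# Illustrative data and settings (assumptions, not from a real study).
x     <- c(8, 9, 10, 8, 7)
sigma <- 1                              # assumed known data sd
theta <- seq(0, 15, by = 0.01)          # grid of candidate values for the mean

# Discretised prior over the grid, normalised to sum to one.
prior <- dnorm(theta, mean = 5, sd = 2)
prior <- prior / sum(prior)

# Likelihood of the whole sample at each grid value.
lik <- sapply(theta, function(th) prod(dnorm(x, mean = th, sd = sigma)))

# Posterior: likelihood times prior, divided by the normalising constant.
unnorm    <- lik * prior
posterior <- unnorm / sum(unnorm)

grid_mean   <- sum(theta * posterior)
closed_form <- (1 / (1 / 4 + 5 / 1)) * (5 / 4 + sum(x) / 1)
print(c(grid = grid_mean, closed_form = closed_form))
```

The grid answer should sit very close to the closed-form posterior mean for this conjugate case. The same recipe still works when no closed form exists, as long as the parameter has few dimensions.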

The case of conjugate priors (where you often get nice closed form formulas)

This Wikipedia article on conjugate priors may be informative. Let $\boldsymbol{\theta}$ be a vector of your parameters. Let $P(\boldsymbol{\theta})$ be a prior over your parameters. Let $P(\mathbf{x} \mid \boldsymbol{\theta})$ be the likelihood function, the probability of the data given the parameters. The prior is a conjugate prior for the likelihood function if the prior $P(\boldsymbol{\theta})$ and the posterior $P(\boldsymbol{\theta} \mid \mathbf{x})$ are in the same family (e.g. both Gaussian).

The table of conjugate distributions may help build some intuition (and also give some instructive examples to work through yourself).
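
One standard entry from such a table, sketched here in R, is the Beta prior for a binomial success probability: a Beta(a, b) prior combined with s successes and f failures gives a Beta(a + s, b + f) posterior. The prior parameters and counts below are made up for illustration, and the grid calculation is included only as a numerical sanity check.

```r
# Hypothetical prior and data, chosen only for illustration.
a <- 2; b <- 2              # Beta(a, b) prior on the success probability
successes <- 7
failures  <- 3

# Conjugate update: the posterior is again a Beta distribution.
a_post    <- a + successes
b_post    <- b + failures
post_mean <- a_post / (a_post + b_post)

# Numerical check: likelihood times prior on a grid, then normalise.
theta     <- seq(0.0005, 0.9995, by = 0.001)
unnorm    <- dbinom(successes, successes + failures, theta) * dbeta(theta, a, b)
grid_post <- unnorm / sum(unnorm)
grid_mean <- sum(theta * grid_post)

print(c(conjugate = post_mean, grid = grid_mean))
```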

— Matthew Gunn

1

This is the central computational issue in Bayesian data analysis. It really depends on the data and distributions involved. For simple cases where everything can be expressed in closed form (e.g., with conjugate priors), you can use Bayes's theorem directly. The most popular family of techniques for more complex cases is Markov chain Monte Carlo. For details, see any introductory textbook on Bayesian data analysis.

— Kodiologist

Thank you so much! Sorry if this is a really stupid follow-up question, but in the simple cases that you mentioned, how exactly would we use Bayes's theorem directly? Would the distribution created by the sample mean and variance of the data points become the likelihood function? Thank you very much.

— statstudent

@Kelly Again, it depends on the distribution. See e.g. en.wikipedia.org/wiki/Conjugate_prior#Example . (If I answered your question, don't forget to accept my answer by clicking on the check mark under the voting arrows.)

— Kodiologist