The meaning of Bayesian update
I'm new to Bayesian inference and I can't find the answer to this:
In real-life scenarios, people use MCMC to compute the posterior distribution given the likelihood and the prior; analytical solutions are usually not possible. Bayesians often say "we update our prior belief given some data to obtain the posterior". But something is not OK to me here: the posterior is never of the same form as the prior, right? Unless you have a conjugate prior, which is really rare.
So what does it mean? Your prior is a gamma distribution, and you end up with a posterior of a completely different shape. Did you really update the prior distribution? Certainly not. We can't compare apples and oranges.
Does it mean that we had a prior belief with a certain shape (a gamma distribution), and then we updated this belief so that we have a new shape (not even described analytically) as the output of the MCMC?
I'm very confused by this idea of a "Bayesian update", because in practice, if you end up with a completely new kind of distribution for the posterior, you can't reuse it as a new prior for the next batch of data, right? So it means that this is just a "one-shot update" of the prior belief.
It seems to me that a Bayesian update means you update your belief in the sense of changing the prior distribution to something else. It's like saying "I've changed my mind; it is no longer a gamma distribution".
On the other hand, when I follow lectures, they never say that. They talk about Bayesian updating in connection with conjugate priors. In that case the math is nice, so the posterior can be used as a prior. But this never happens in real life, right? I mean, you don't use MCMC if you know that the posterior will be of the same family as the prior?
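To fix ideas, here is a minimal sketch (with made-up numbers) of the conjugate case I mean: a Gamma prior on a Poisson rate, where the posterior is again a Gamma and can be reused directly as the prior for the next batch.

```python
# Minimal sketch of a conjugate update (made-up numbers):
# Gamma(shape=a, rate=b) prior on a Poisson rate.
a, b = 2.0, 1.0
batch1 = [3, 1, 4, 2]  # observed counts

# Conjugacy: the posterior is Gamma(a + sum(counts), b + n).
a, b = a + sum(batch1), b + len(batch1)
print(f"after batch 1: Gamma(shape={a}, rate={b})")

# Because the posterior is again a Gamma, the same update rule
# applies verbatim when the next batch of counts arrives.
batch2 = [2, 2, 5]
a, b = a + sum(batch2), b + len(batch2)
print(f"after batch 2: Gamma(shape={a}, rate={b})")
```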
bayesian mcmc conjugate-prior
asked Nov 10 at 10:58 by Mike Dudley
2 Answers
Posteriors and priors are all distributions on the parameter space, so they can be compared as such, even if they are not of the same shape. If you are interested in performing multiple updates, i.e. going from a posterior $p(\theta \mid Y_1)$ to a second posterior $p(\theta \mid Y_1, Y_2)$, and so on, then you can absolutely use $p(\theta \mid Y_1)$ as a prior and $p(Y_2 \mid Y_1, \theta)$ as a likelihood. In fact, a family of methods called sequential Monte Carlo can be used to recursively approximate a sequence of such posteriors; see for instance Chopin (2002), "A sequential particle filter method for static models", Biometrika 89(3), https://academic.oup.com/biomet/article-abstract/89/3/539/251804?redirectedFrom=PDF
answered Nov 10 at 13:41 by Pierrot (accepted)
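To make the posterior-as-prior step concrete, here is a minimal sketch of the reweighting idea the answer points to, under assumed toy choices (a normal-mean model with made-up data; the exact first-stage posterior stands in for MCMC output). A full sequential Monte Carlo method would also add resampling and move steps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up toy model: y ~ N(theta, 1) with a N(0, 10^2) prior on theta.
# The first-stage posterior p(theta | y1) is available in closed form
# here, so we sample it directly as a stand-in for MCMC draws.
y1 = rng.normal(2.0, 1.0, size=50)
post_var = 1.0 / (1.0 / 100.0 + len(y1))
post_mean = post_var * y1.sum()
draws = rng.normal(post_mean, np.sqrt(post_var), size=5000)  # ~ p(theta | y1)

# A second batch y2 arrives. Reweighting the existing draws by the new
# likelihood p(y2 | theta) turns them into a weighted approximation of
# p(theta | y1, y2): the basic move behind sequential Monte Carlo.
y2 = rng.normal(2.0, 1.0, size=50)
log_w = -0.5 * ((y2[None, :] - draws[:, None]) ** 2).sum(axis=1)
w = np.exp(log_w - log_w.max())
w /= w.sum()

print("updated posterior mean:", np.sum(w * draws))

# Resampling restores equally weighted draws; full SMC methods add a
# move (MCMC) step here to fight weight degeneracy over many batches.
draws = rng.choice(draws, size=5000, p=w)
```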
It seems to me that sequential Bayesian updating, especially sequential Monte Carlo, is not an easy topic; not mainstream, I would say. Right? What I'm trying to evaluate here is whether Bayesian updating is something practical for beginners, let's say engineers who know the basics of machine learning. It seems to me that the simplest thing to do is to acquire more data but keep using the initial prior. It's like restarting from the beginning, running MCMC from scratch. At each step the data set is bigger (so the prior washes out).
– Mike Dudley
Nov 10 at 15:13
But I don't like that, because all the previous computation goes unused. I'm afraid that if I want to reuse the previous posterior as a new prior, I will get into very sophisticated topics that are way beyond my level (let's say the level of an engineer who practices ML, rather than the level of a PhD data scientist).
– Mike Dudley
Nov 10 at 15:31
If you want to know how to update Monte Carlo approximations of posterior distributions sequentially, you will definitely benefit from reading the paper. Understanding the algorithm is not that difficult. You can also read Del Moral, Pierre, Arnaud Doucet, and Ajay Jasra, "Sequential Monte Carlo samplers", Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68.3 (2006): 411-436, cited more than 1000 times, but that one is perhaps harder to read.
– Pierrot
Nov 10 at 16:21
Thanks Pierrot, I'll take that as the answer. But as I said, I'm somewhat disappointed about Bayesian updating: it is a special case, and I need to read papers and implement it myself, unfortunately.
– Mike Dudley
Nov 10 at 16:37
It's a tough world!
– Pierrot
Nov 10 at 16:47
Pierrot's answer is correct, but as this seems to be a question about intuition, I wanted to give what might be a more intuitive approach to thinking about the question.
I (+1)'d this question because it is in a way insightful; you are taking seriously that you need to understand what a method really means. However, when you take a higher-level view of the meaning behind the MCMC method, you also need to take a higher-level view of what it means to update the prior. If your prior belief can be described as a gamma distribution and your posterior belief can't, you have certainly updated your prior beliefs, and this does not bring any problems of "compar[ing] apples and oranges" (with respect to saying whether or not you've updated your prior). If you assign different probabilities to various events in your posterior belief than you assign in your prior belief, you have updated your beliefs, and this has nothing to do with whether your prior and posterior are the same type of probability distribution with different parameters, or are represented by different types of distribution entirely.
In other words, a Bayesian update does not require the posterior distribution to be of the same form as the prior. Nor does this mean that only a "one-shot" update is possible.
Let me give a simple example. Suppose your prior belief is that it is just as likely on any particular day that it will rain (call this event $B$) as that it will not. Then your prior is a discrete uniform distribution assigning probability $0.5$ to $B$ and $0.5$ to $\neg B$. However, suppose you also observe that it is April (call this event $A$), and you know both that the probability that a day is in April is $\frac{1}{12}$ and that the joint probability that it will rain on a day and that the day is in April is $\frac{3}{48}$. Then
$$
\Pr(B \mid A) = \frac{\Pr(A \cap B)}{\Pr(A)} = \frac{3/48}{1/12} = 0.75
$$
Now your posterior belief assigns probability $0.75$ to $B$ and $0.25$ to $\neg B$, which is not a uniform distribution, but we would still validly call this a Bayesian update. Moreover, this need not be a one-shot update, because you could easily update your belief about $B$ further if, for example, you observed dark clouds in the sky.
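For instance, assuming purely for illustration that dark clouds (event $C$) appear with probability $0.8$ on rainy days and $0.2$ on dry days (and carry no information beyond rain), the $0.75$/$0.25$ posterior serves as the prior for a second application of Bayes' rule:
$$
\Pr(B \mid A, C) = \frac{\Pr(C \mid B)\,\Pr(B \mid A)}{\Pr(C \mid B)\,\Pr(B \mid A) + \Pr(C \mid \neg B)\,\Pr(\neg B \mid A)} = \frac{0.8 \times 0.75}{0.8 \times 0.75 + 0.2 \times 0.25} \approx 0.92
$$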
This answer does not address another question that might be lurking within the one you asked: how to use the posterior approximation from MCMC as a prior in another update. Perhaps Pierrot's answer can give you some insight there.
answered Nov 10 at 13:43 by duckmayr (edited Nov 10 at 13:55)
Your example is not a real-life example. In real life you run an MCMC that gives you an approximation of Pr(B|A); you don't have Pr(B|A) in fact, you have an expectation of it (correct me if I'm wrong, I'm not really confident here). This is exactly what worries me: how can you possibly reuse this expectation as a prior, both theoretically and technically? Here your posterior is not an expectation; you computed it exactly. Maybe what I'm looking for is an implementation of MCMC that is able to define a prior from a previous run.
– Mike Dudley
Nov 10 at 15:28
It seems to me that such a thing does not exist yet: we have to define a prior in closed form in a probabilistic language (Stan or whatever...). So this inevitably leads us to a "one-shot" run. This is a pity, because we end up doing what frequentists do: a one-shot optimization, whereas the Bayesian framework is intended to implement a kind of feedback loop where the prior gets better and better over time.
– Mike Dudley
Nov 10 at 15:29
@MikeDudley To be precise, what we do in MCMC sampling is approximate a function proportional to Pr(B|A). While it's true that we don't have Pr(B|A), we have a very close (assuming enough draws) approximation of it up to a normalizing constant. To me, it seemed your question was more about how we can justify conceptualizing MCMC output as the update of a prior, which is what I addressed. If your main objective is not that, but a technical implementation of using MCMC draws to define the prior in a second MCMC run, a good start in the right direction might be the resource identified by Pierrot.
– duckmayr
Nov 10 at 15:39
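A minimal sketch of that technical direction, assuming made-up draws from a first run: fit a parametric density to the draws by moment matching and write it down as the next run's prior. This is deliberately crude, and only as good as the parametric fit to the true posterior shape.

```python
import numpy as np

rng = np.random.default_rng(1)

# Pretend these are draws of a positive parameter from a first MCMC run.
draws = rng.gamma(shape=7.0, scale=0.5, size=4000)

# Moment-match a Gamma(a, b) to the draws: mean = a/b, var = a/b^2.
m, v = draws.mean(), draws.var()
b = m / v
a = m * b
print(f"next-run prior: Gamma(shape={a:.2f}, rate={b:.2f})")
# In Stan or similar, this closed-form Gamma(a, b) can now be written
# down as the prior for the next batch of data.
```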