11/30/2011

Why Gibbs Sampling?

Finally figured out why LDA uses Gibbs sampling. When computing the posterior distribution, the denominator p(w_{1:D}) requires summing the joint distribution over every possible instantiation of the hidden topic structure, and the number of such instantiations is exponentially large (David Blei et al., "A Correlated Topic Model of Science"). The posterior therefore has to be approximated, either with sampling-based algorithms such as Gibbs sampling or with variational algorithms.
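As a concrete illustration (my own sketch, not the method of the Blei paper), here is a minimal collapsed Gibbs sampler for LDA on a made-up toy corpus; the documents, K, alpha, and beta below are all illustrative assumptions.

import numpy as np

# Toy corpus: each document is a list of word ids over a vocabulary of size V.
# The documents, K, alpha and beta here are made-up illustration values.
docs = [[0, 1, 2, 1], [2, 3, 3, 4], [0, 4, 4, 1]]
V, K = 5, 2              # vocabulary size, number of topics
alpha, beta = 0.1, 0.01  # symmetric Dirichlet hyperparameters

rng = np.random.default_rng(0)

# Random topic assignment for every token, plus the count tables the sampler needs.
z = [[int(rng.integers(K)) for _ in doc] for doc in docs]
ndk = np.zeros((len(docs), K))  # document-topic counts
nkw = np.zeros((K, V))          # topic-word counts
nk = np.zeros(K)                # tokens assigned to each topic
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        k = z[d][i]
        ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

for sweep in range(200):
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            # Remove this token's current assignment from the counts ...
            ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
            # ... sample from the full conditional, which is proportional to
            # (n_dk + alpha) * (n_kw + beta) / (n_k + V*beta) ...
            p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
            k = int(rng.choice(K, p=p / p.sum()))
            # ... and add it back under the newly sampled topic.
            z[d][i] = k
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

# Point estimate of the topic-word distributions after sampling.
phi = (nkw + beta) / (nkw.sum(axis=1, keepdims=True) + V * beta)
print(np.round(phi, 2))

The point is that each update only needs counts over the other tokens' topic assignments, so the intractable denominator p(w_{1:D}) never has to be computed.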

There is a good article for beginners:
"Gibbs Sampling for the Uninitiated", by P. Resnik and E. Hardisty.

A related construction is the Chinese restaurant process (CRP). One fancifully imagines a Chinese restaurant with an infinite number of circular tables, each with infinite capacity. Customer 1 is seated at an unoccupied table with probability 1. At time n + 1, a new customer chooses uniformly at random to sit at one of the following n + 1 places: directly to the left of one of the n customers already sitting at an occupied table, or at a new, unoccupied circular table. Each table then corresponds to a block of a random partition, and the probability assigned to any particular partition (ignoring the order in which customers sit around any particular table) is

\Pr(B_n = B) = \dfrac{\prod_{b\in B} (|b| -1)!}{n!}
where b is a block in the partition B and |b| is the size (i.e. number of elements) of b. The restaurant analogy is due to Pitman and Dubins, and first appears in print in [1].
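To sanity-check that formula, here is a small simulation I sketched (it is not from the original post): it seats n customers by the process described above and compares the empirical frequency of each partition with Pr(B_n = B).

import random
from collections import Counter
from math import factorial

def crp_partition(n, rng):
    """Seat n customers by the Chinese restaurant process (concentration 1)."""
    tables = []                      # each table is a list of customer indices
    for customer in range(n):
        # 'customer' people are already seated, so there are customer + 1
        # equally likely seats: left of each seated customer, or a new table.
        seat = rng.randrange(customer + 1)
        if seat < sum(len(t) for t in tables):
            # Sit next to an existing customer: find which table they occupy.
            for t in tables:
                if seat < len(t):
                    t.append(customer)
                    break
                seat -= len(t)
        else:
            tables.append([customer])
    return tables

def partition_prob(tables, n):
    """Pr(B_n = B) = prod over blocks b of (|b| - 1)! divided by n!."""
    num = 1
    for t in tables:
        num *= factorial(len(t) - 1)
    return num / factorial(n)

rng = random.Random(0)
n, trials = 4, 100_000
counts = Counter(
    tuple(sorted(tuple(sorted(t)) for t in crp_partition(n, rng)))
    for _ in range(trials)
)
for part, c in sorted(counts.items()):
    tables = [list(t) for t in part]
    print(part, round(c / trials, 3), round(partition_prob(tables, n), 3))

For small n the empirical frequencies line up with the closed-form probabilities, which is a quick way to convince oneself the formula is right.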
