UAI 9th Bayesian Modeling Applications Workshop

— Ann Nicholson (Workshop co-chair)
When: Saturday, August 18, 2012
Where: Catalina Island, California

This single-day workshop is an excellent forum for presenting and hearing about real-world applications of Bayesian networks. It follows the 28th Int. Conference on Uncertainty in AI (Aug 15-17th), the premier conference for presentation of research on Bayesian technology. The call for papers is now out, with a submission deadline of May 5th (with a week’s extension very likely!).

The aim of the workshop is to foster discussion and interchange about novel contributions that can speak to both the academic and the larger research community. Accordingly, we seek submissions from practitioners and tool developers as well as researchers. We welcome submissions describing real-world applications, whether as stand-alone BNs or where the BNs are embedded in a larger software system. We encourage authors to address the practical issues involved in developing real-world applications, such as knowledge engineering methodologies, elicitation techniques, defining and meeting client needs, validation processes and integration methods, as well as software tools, including visualization and user interaction techniques, that support these activities.

We particularly encourage the submission of papers that address the workshop theme of temporal modeling. Recently, communities building dynamic Bayes networks (DBNs) and partially observable MDPs (POMDPs) have come to realize that they are applying their methods to identical applications. Similarly, POMDPs and other probabilistic methods are now established in the field of Automated Planning. Stochastic process models such as continuous time Bayes networks (CTBNs) should also be considered part of this trend. Adaptive and on-line learning models also fit into this focus.


Minimum Message Length: A Computational Bayesianism

—Lloyd Allison

Minimum message length (MML) inference is a computational implementation of Bayesian inference: an information-theoretic means of finding hypotheses of high posterior probability, devised by Chris Wallace and David Boulton around 1968 (see Wallace's history of MML). MML seeks to minimise a two-part message length I(h,e) = I(e|h) + I(h), where h encodes a hypothesis and e some relevant evidence (data). Provided the coding follows the principles developed by Claude Shannon, so that an efficient code gives message lengths I(h) = - \log P(h), minimising the MML message length is trivially equivalent to maximising posterior probability:

  • I(h,e) = I(e|h) + I(h)
  • - \log P(h,e) = - \log P(e|h) - \log P(h)
  • \log P(h,e) = \log [P(e|h)P(h)]
  • P(h,e) = P(e|h)P(h)

Since we multiplied by -1 during this sequence, we have also switched from minimising a message length to maximising a probability. And at the end, since P(h|e) = P(h,e)/P(e) (see Bayes' theorem) and P(e) is a positive constant that does not depend on h, maximising one is the same as maximising the other.
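The equivalence can be checked numerically with a small sketch. Everything specific here is an assumption made for illustration, not part of the MML literature: the three candidate coin biases, their prior, the evidence of 7 heads and 3 tails, and the helper names `message_length` and `likelihood`.

```python
import math

def message_length(prior, lik):
    """Two-part message length in bits: I(h) + I(e|h)."""
    return -math.log2(prior) - math.log2(lik)

# Hypothetical hypotheses h: three candidate biases for a coin,
# each mapped to an assumed prior probability P(h).
priors = {0.25: 0.25, 0.5: 0.5, 0.75: 0.25}

# Assumed evidence e: a particular sequence with 7 heads and 3 tails.
heads, tails = 7, 3

def likelihood(bias):
    """P(e|h) for the observed sequence under a coin with the given bias."""
    return bias ** heads * (1 - bias) ** tails

# Message length I(h,e) for every hypothesis.
lengths = {h: message_length(p, likelihood(h)) for h, p in priors.items()}

# Posterior P(h|e) via Bayes' theorem, normalising by P(e).
joint = {h: p * likelihood(h) for h, p in priors.items()}
evidence = sum(joint.values())
posterior = {h: j / evidence for h, j in joint.items()}

best_by_length = min(lengths, key=lengths.get)
best_by_posterior = max(posterior, key=posterior.get)

for h in priors:
    print(f"h={h}: I(h,e)={lengths[h]:.3f} bits, P(h|e)={posterior[h]:.3f}")
print("shortest message:", best_by_length)
print("highest posterior:", best_by_posterior)
```

Both selections pick the same hypothesis, because each message length is just -log2 of the corresponding joint probability and the normaliser P(e) is shared by all hypotheses.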

This foundation for minimum message length inference is quite elementary, so the fact that it was not in use before 1968 may be a little surprising. It is probably partly due to limits on computational capacity inhibiting Bayesian statistics and the related dominance of frequentist methods. That there remains any debate about computational Bayesianism, however, is even more surprising.

The application of Bayes' theorem is straightforward for discrete (multinomial) variables governed by a probability function. But consider a problem in which one or more variables are continuous, rather than discrete. Can Bayes' theorem apply?

  • Any continuous attribute (variable) can only be measured to some limited accuracy, \pm \epsilon /2, \epsilon > 0.
  • So, every datum that is possible under a model (theory, hypothesis) has a probability that is strictly greater than zero, and not just a probability density.
  • Any continuous parameter of a model can only be inferred (estimated) to some limited precision, \pm \delta /2, \delta > 0.
  • So, every parameter estimate that is possible under a prior has a probability that is strictly greater than zero, and not just a probability density.
  • So, in continuous empirical domains both the data and the model spaces have a natural discretisation and Bayes' theorem can always be applied.
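As a rough illustration of the first two points, the sketch below turns a probability density into a genuine probability by integrating it over the measurement interval. The normal model, the recorded value 1.3, the accuracy \epsilon = 0.1 and the function names are all assumptions chosen for the example.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal model."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_pdf(x, mu, sigma):
    """Probability density of a normal model."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def datum_probability(x, eps, mu, sigma):
    """Probability that a value recorded as x lies in [x - eps/2, x + eps/2]."""
    return normal_cdf(x + eps / 2, mu, sigma) - normal_cdf(x - eps / 2, mu, sigma)

# Assumed datum and measurement accuracy.
x, eps = 1.3, 0.1
# Assumed model (hypothesis): a standard normal.
mu, sigma = 0.0, 1.0

p = datum_probability(x, eps, mu, sigma)   # a real probability, strictly > 0
approx = normal_pdf(x, mu, sigma) * eps    # density times interval width
bits = -math.log2(p)                       # length of the datum's code word

print(f"P(datum) = {p:.6f}, density * eps = {approx:.6f}, I = {bits:.3f} bits")
```

For a small \epsilon the exact interval probability is close to density times \epsilon, which is why a finite message length for a continuous datum is well defined once the measurement accuracy is stated. The same reasoning applies to a parameter stated to precision \delta under a prior.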

However, this is not to say that it is easy to make MML work in any given application; in fact it can be quite difficult. After the self-evident observations above, a lot of hard work remained to be done on efficient encodings, search algorithms, code books, invariance, Fisher information, fast approximations, robust heuristics, adaptations to specific problems, and all the rest. Fortunately, MML has been made to work in many general and useful applications, including, but not limited to:

For further information on MML you can peruse my MML web pages. For Chris Wallace's own account of MML see his book Inductive Inference by Minimum Message Length.