The Paper FeedA feed of Bayesian network related papers, articles, books and research that we happen across and find of interest An extension to the noisyOR function to resolve the ‘explaining away’ deficiency for practical Bayesian network problems2019
The “Leaky noisyOR” is a common method used to simplify the elicitation of complex conditional probability tables in Bayesian networks involving Boolean variables. It has proven useful for approximating the required relationship in many realworld situations where there are two or more variables that are potential causes of a single effect variable. However, one of the properties of leaky noisyOR is Conditional Intercausal Independence (CII). This property means that ‘explaining away‘ behaviourone of the most powerful benefits of BN inference is not present when the effect variable is observed as false. Yet, for many realworld problems where the leaky noisyOR has been considered, this behaviour would be expected, meaning that leaky noisyOR is deficient as an approximation of the required relationship in such cases. There have been previous attempts to adapt noisyOR to resolve this problem. However, they require too many additional parameters to be elicited. We describe a simple but powerful extension to leaky noisyOR that requires only a single additional parameter. While it does not solve the CII problem in all cases, it resolves most of the explaining away deficiencies that occur in practice. The problem and solution is illustrated using an example from intelligence analysis.
Bayesian networks with a logistic regression model for the conditional probabilities2008
Logistic regression techniques can be used to restrict the conditional probabilities of a Bayesian network for discrete variables. More specifically, each variable of the network can be modeled through a logistic regression model, in which the parents of the variable define the covariates. When all main effects and interactions between the parent variables are incorporated as covariates, the conditional probabilities are estimated without restrictions, as in a traditional Bayesian network. By incorporating interaction terms up to a specific order only, the number of parameters can be drastically reduced. Furthermore, ordered logistic regression can be used when the categories of a variable are ordered, resulting in even more parsimonious models. Parameters are estimated by a modified junction tree algorithm. The approach is illustrated with the Alarm network.
Bayesian networks for static and temporal data fusion2018
Prediction and inference on temporal data is very frequently performed using time
series data alone. We believe that these tasks could benefit from leveraging the contex
tual metadata associated to time series  such as location, type, etc. Conversely, tasks
involving prediction and inference on metadata could benefit from information held
within time series. However, there exists no standard way of jointly modeling both
time series data and descriptive metadata. Moreover, metadata frequently contains
highly correlated or redundant information, and may contain errors and missing values.
We first consider the problem of learning the inherent probabilistic graphical structure
of metadata as a Bayesian Network. This has two main benefits: (i) once structured
as a graphical model, metadata is easier to use in order to improve tasks on temporal
data and (ii) the learned model enables inference tasks on metadata alone, such
as missing data imputation. However, Bayesian network structure learning is a
tremendous mathematical challenge, that involves a NPHard optimization problem.
We present a tailormade structure learning algorithm, inspired from novel theoretical
results, that exploits (quasi)determinist dependencies that are typically present in
descriptive metadata. This algorithm is tested on numerous benchmark datasets
and some industrial metadatasets containing deterministic relationships. In both
cases it proved to be significantly faster than state of the art, and even found more
performant structures on industrial data. Moreover, learned Bayesian networks are
consistently sparser and therefore more readable.
We then focus on designing a model that includes both static (meta)data and dynamic
data. Taking inspiration from state of the art probabilistic graphical models for tem
poral data (Dynamic Bayesian Networks) and from our previously described approach
for metadata modeling, we present a general methodology to jointly model metadata
and temporal data as a hybrid staticdynamic Bayesian network. We propose two
main algorithms associated to this representation: (i) a learning algorithm, which
while being optimized for industrial data, still generalizes to any task of static and
dynamic data fusion, and (ii) an inference algorithm, enabling both usual tasks on
temporal or static data alone, and tasks using the two types of data.
Finally, we discuss some of the notions introduced during the thesis, including ways
to measure the generalization performance of a Bayesian network by a score inspired
from the crossvalidation procedure from supervised machine learning. We also
propose various extensions to the algorithms and theoretical results presented in the
previous chapters, and formulate some research perspectives.
Hidden Node Detection between Observable Nodes Based on Bayesian Clustering2019
Structure learning is one of the main concerns in studies of Bayesian networks. In the present paper, we consider networks consisting of both observable and hidden nodes, and propose a method to investigate the existence of a hidden node between observable nodes, where all nodes are discrete. This corresponds to the model selection problem between the networks with and without the middle hidden node. When the network includes a hidden node, it has been known that there are singularities in the parameter space, and the Fisher information matrix is not positive definite. Then, the many conventional criteria for structure learning based on the Laplace approximation do not work. The proposed method is based on Bayesian clustering, and its asymptotic property justifies the result; the redundant labels are eliminated and the simplest structure is detected even if there are singularities.
Parsimonious graphical dependence models constructed from vines2018
Multivariate models with parsimonious dependence have been used for a large number of variables, and have mainly been developed for multivariate Gaussian. Graphical dependence model representations include Bayesian networks, conditional independence graphs, and truncated vines. The class of Gaussian truncated vines is a subset of Gaussian Bayesian networks and Gaussian conditional independence graphs, but has an extension to non‐Gaussian dependence with (i) combinations of continuous and discrete random variables with arbitrary univariate margins, and (ii) accommodation of latent variables. To illustrate the importance of graphical models with latent variables that do not rely on the Gaussian assumption, the combined factor‐vine structure is presented and applied to a data set of stock returns.
copulafactor modellatent variablesMarkov treemixture of conditional distributionsparsimonious dependenceinverse correlation matrixtheory
Posted 9 Jan 2019 ·
Link
Visual Analysis of Bayesian Networks for Electronic Health Records2018
Worldwide the amount of data generated by the medical community is staggering, and increasing dramatically. Using this data to improve patient care using analytics and machine learning is a huge and largely untapped opportunity. The most important medical data captured exist in patients' electronic health records (EHRs) which are maintained and utilized by health care providers. EHRs consist of rich and comprehensive patientspecific information from a large number of sources in different formats with heterogeneous data types. There are numerous challenges in attempting to apply existing analytic tools and methodologies to this data. Many features extracted from EHRs have dependent relationships  for example, “flu” and “high body temperature”. Bayesian networks, as one of the few modeling methodologies which capture feature dependence rather than assuming independence, provide a flexible foundation for modeling EHRs. However, existing Bayesian network learning methodologies produce models whose complexity makes them difficult for clinicians to utilize or even interpret. Therefore, better model visualization methodologies, as well as learning methods which produce models more amenable to simplification and summarization, are critical to making them interpretable and useful to clinicians, and therefore to improving patient care.
In this dissertation, I present a framework for predictive analysis of patient clinical data, from feature extraction to model analysis. I first study straightforward machine learning approaches on extracted EHR features and find that incorporating diagnosis features improves area under ROC curve (AUC) by 10% compared to a baseline. Because of the many dependencies between features extracted from EHRs, I next investigate Bayesian network models, in which my clinician collaborators have identified known and suspected high pressure ulcer risk factors. The models also substantially increase sensitivity of the prediction  nearly three times higher comparing to logistical regression models  without sacrificing overall accuracy. However, interpreting these models involves a significant cognitive burden, motivating my investigation of visual analytic techniques. To this end, I develop an interactive tool for visualizing Bayesian networks to improve clinicians’ insight and interpretation of models. I perform a user study to assess the impact of the tool and its features. The results show quantitatively that users complete tasks more efficiently when using the tool, and qualitatively that they found it useful. Bayesian networks containing natural groupings or “clusters” are better suited to visualization and summarization. Since existing Bayesian network learning methods do not naturally yield such groupings, I alter the Bayesian network learning process to learn structures which optimize not just for representing dependency relationships, but additionally and simultaneously, for clusterability measures. My results show that the augmented Bayesian network process can find structures with much larger clusterability measures, with only a small decrease in their standard scoring measure. Visualizations of learned clustered Bayesian networks show that the algorithm cohesively groups related features, making the networks easier to interpret.
Using Bayesian networks to guide the assessment of new evidence in an appeal case2016
When new forensic evidence becomes available after a conviction there is no systematic framework to help lawyers to determine whether it raises sufficient questions about the verdict in order to launch an appeal. This paper presents such a framework driven by a recent case, in which a defendant was convicted primarily on the basis of audio evidence, but where subsequent analysis of the evidence revealed additional sounds that were not considered during the trial. The framework is intended to overcome the gap between what is generally known from scientific analyses and what is hypothesized in a legal setting. It is based on Bayesian networks (BNs) which have the potential to be a structured and understandable way to evaluate the evidence in a specific case context. However, BN methods suffered a setback with regards to the use in court due to the confusing way they have been used in some legal cases in the past. To address this concern, we show the extent to which the reasoning and decisions within the particular case can be made explicit and transparent. The BN approach enables us to clearly define the relevant propositions and evidence, and uses sensitivity analysis to assess the impact of the evidence under different assumptions. The results show that such a framework is suitable to identify information that is currently missing, yet clearly crucial for a valid and complete reasoning process. Furthermore, a method is provided whereby BNs can serve as a guide to not only reason with incomplete evidence in forensic cases, but also identify very specific research questions that should be addressed to extend the evidence base and solve similar issues in the future.
Posted 3 Jan 2019 ·
Link
Evaluating the Weighted Sum Algorithm for Estimating Conditional Probabilities in Bayesian Networks2010
The primary challenge in constructing a Bayesian Network (BN) is acquiring its Conditional Probability Tables (CPTs). CPTs can be elicited from domain experts; however, they scale exponentially in size, thus making their elicitation very time consuming and costly. Das [1] proposed a solution to this problem using the weighted sum algorithm (WSA). In this paper we present two empirical studies that evaluates the WSA's efficiency and accuracy, we also describe an extension for the algorithm to deal with one of its shortcomings. Our results show that the estimates obtained using the WSA were highly accurate and make significant reductions in elicitation.
Improving the analysis of dependable systems by mapping fault trees into Bayesian networks2001
Bayesian Networks (BN) provide a robust probabilistic method of reasoning under uncertainty. They have been successfully applied in a
variety of realworld tasks but they have received little attention in the area of dependability. The present paper is aimed at exploring the
capabilities of the BN formalism in the analysis of dependable systems. To this end, the paper compares BN with one of the most popular
techniques for dependability analysis of large, safety critical systems, namely Fault Trees (FT). The paper shows that any FT can be directly
mapped into a BN and that basic inference techniques on the latter may be used to obtain classical parameters computed from the former (i.e.
reliability of the Top Event or of any subsystem, criticality of components, etc). Moreover, by using BN, some additional power can be
obtained, both at the modeling and at the analysis level. At the modeling level, several restrictive assumptions implicit in the FT methodology
can be removed and various kinds of dependencies among components can be accommodated. At the analysis level, a general diagnostic
analysis can be performed. The comparison of the two methodologies is carried out by means of a running example, taken from the literature,
that consists of a redundant multiprocessor system.
An explication of uncertain evidence in Bayesian networks: likelihood evidence and probabilistic evidence2015
This paper proposes a systematized presentation and a terminology for observations in a Bayesian network. It focuses on the three main concepts of uncertain evidence, namely likelihood evidence and fixed and notfixed probabilistic evidence, using a review of previous literature. A probabilistic finding on a variable is specified by a local probability distribution and replaces any former belief in that variable. It is said to be fixed or not fixed regarding whether it has to be kept unchanged or not after the arrival of observation on other variables. Fixed probabilistic evidence is defined by Valtorta et al. (J Approx Reason 29(1):71–106 2002) under the name soft evidence, whereas the concept of notfixed probabilistic evidence has been discussed by Chan and Darwiche (Artif Intell 163(1):67–90 2005). Both concepts have to be clearly distinguished from likelihood evidence defined by Pearl (1988), also called virtual evidence, for which evidence is specified as a likelihood ratio, that often represents the unreliability of the evidence. Since these three concepts of uncertain evidence are not widely understood, and the terms used to describe these concepts are not well established, most Bayesian networks engines do not offer well defined propagation functions to handle them. Firstly, we present a review of uncertain evidence and the proposed terminology, definitions and concepts related to the use of uncertain evidence in Bayesian networks. Then we describe updating algorithms for the propagation of uncertain evidence. Finally, we propose several results where the use of fixed or notfixed probabilistic evidence is required.
