The Paper Feed
A feed of Bayesian network related papers, articles, books and research that we happen across and find of interest
An Application of Dynamic Bayesian Networks to Condition Monitoring and Fault Prediction in a Sensored System: a Case Study
Bayesian networks have been widely used for classification problems. These models, that is, the structure of the network and/or its parameters (probability distributions), are usually built from a data set. Sometimes we do not have information about all the possible values of the class variable, e.g. data about a reactor failure in a nuclear power station. This problem is usually framed as an anomaly detection problem. Based on this idea, we have designed a general-purpose decision support tool.
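The anomaly-detection framing above can be sketched as follows. This is a minimal illustration and not the tool described in the abstract: it assumes a deliberately simplified network with no edges (every variable independent), fits it to normal-operation data only, and flags observations that are improbable under that model. The variable values, the smoothing constant `ALPHA`, and the threshold rule are all hypothetical.

```python
import math
from collections import Counter

ALPHA = 1.0  # Laplace smoothing constant (assumed value)


def fit_marginals(rows):
    """Fit one smoothed categorical distribution per column from normal-operation rows."""
    models = []
    for column in zip(*rows):
        counts = Counter(column)
        # One extra slot in the denominator reserves mass for values never seen in training.
        denom = len(column) + ALPHA * (len(counts) + 1)
        models.append((counts, denom))
    return models


def log_likelihood(row, models):
    """Log-probability of a row under the independent (edgeless) model."""
    return sum(
        math.log((counts.get(value, 0) + ALPHA) / denom)
        for value, (counts, denom) in zip(row, models)
    )


# Hypothetical sensor readings recorded during normal operation only.
normal_rows = [
    ("normal", "closed"),
    ("normal", "closed"),
    ("normal", "open"),
    ("high", "closed"),
]
models = fit_marginals(normal_rows)

# Assumed decision rule: flag anything less likely than the least likely training row.
threshold = min(log_likelihood(row, models) for row in normal_rows)


def is_anomaly(row):
    """True if the row is less probable than every row seen during normal operation."""
    return log_likelihood(row, models) < threshold
```

A real network would add edges between variables so that unusual combinations, and not only unusual individual values, lower the likelihood; the scoring and thresholding steps would stay the same.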
Factors influencing sedentary behaviour: A system based analysis using Bayesian networks within DEDIPAC
Decreasing sedentary behaviour (SB) has emerged as a public health priority since prolonged sitting increases the risk of non-communicable diseases. Mostly, the independent association of factors with SB has been investigated, although lifestyle behaviours are conditioned by interdependent factors. Within the DEDIPAC Knowledge Hub, a system of sedentary behaviours (SOS)-framework was created to take interdependency among multiple factors into account. The SOS framework is based on a system approach and was developed by combining evidence synthesis and expert consensus. The present study conducted a Bayesian network analysis to investigate and map the interdependencies between factors associated with SB through the life-course from large scale empirical data.
A template for constructing Bayesian networks in forensic biology cases when considering activity level propositions
The hierarchy of propositions has been accepted amongst the forensic science community for some time. It is also accepted that the higher up the hierarchy the propositions are, against which scientists are competent to evaluate their results, the more directly useful the testimony will be to the court. Because each case represents a unique set of circumstances and findings, it is difficult to come up with a standard structure for evaluation. One common tool that assists in this task is the Bayesian network (BN). There is much diversity in the way that BNs can be constructed. In this work, we develop a template for BN construction that allows sufficient flexibility to address most cases, but enough commonality and structure that the flow of information in the BN is readily recognised at a glance. We provide seven steps that can be used to construct BNs within this structure and demonstrate how they can be applied, using a case example.
Bayesian networks for the interpretation of biological evidence
In court, it is typical for biological evidence to be reported at a level that only addresses how likely the DNA evidence is if it originated from a particular individual, or individuals. However, there are other questions that could be considered that would be of value in enabling the court, including the jury, to make better informed decisions. For example, although answers to specific questions such as: “Which type of bodily fluid has the DNA originated from?” or, “How was the DNA deposited at the scene?” would be probabilistic in nature, they can be crucial to the outcome of a case. The relationship between the DNA evidence, the source of the DNA and the activity that took place is described in a term called the “hierarchy of propositions.” Currently, such questions are usually answered by scientists subjectively with little to no logical framework to assist them. Bayesian networks have proven to be beneficial in providing logical reasoning by way of a likelihood ratio to help combine subjective, yet, experience‐based, opinions of experts with experimental data when answering questions which can be both complex and uncertain. These networks offer a framework that provides balance, transparency, and robustness in the evaluation of evidence. A current limitation of the use of Bayesian networks includes a lack of understanding of the underlying concepts from both forensic scientists and the courts and consequently a reduced recognition of the potential strengths.
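The likelihood-ratio reasoning mentioned above can be sketched with the smallest possible network, a proposition node pointing to a single finding. This is only an illustration of the arithmetic, not the approach of the paper: the function names and every probability below are hypothetical.

```python
def likelihood_ratio(p_e_given_hp, p_e_given_hd):
    """LR = P(E | Hp) / P(E | Hd) for a finding E under two competing propositions."""
    if p_e_given_hd <= 0:
        raise ValueError("P(E | Hd) must be positive")
    return p_e_given_hp / p_e_given_hd


def posterior_probability(prior_hp, lr):
    """Update a prior probability for Hp using Bayes' theorem in odds form."""
    if not 0 < prior_hp < 1:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior_hp / (1.0 - prior_hp)
    posterior_odds = prior_odds * lr
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical values, for illustration only.
lr = likelihood_ratio(p_e_given_hp=0.6, p_e_given_hd=0.03)
posterior = posterior_probability(prior_hp=0.5, lr=lr)
```

The two functions are kept separate on purpose: the likelihood ratio depends only on how probable the finding is under each proposition, while the prior enters solely in the second step.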
Exploring the utility of Bayesian Networks for modelling cultural ecosystem services: A canoeing case study
Modelling cultural ecosystem services is challenging as they often involve subjective and intangible concepts. As a consequence they have been neglected in ecosystem service studies, something that needs remedying if environmental decision making is to be truly holistic. We suggest Bayesian Networks (BNs) have a number of qualities that may make them well-suited for dealing with cultural services. For example, they define relationships between variables probabilistically, enabling conceptual and physical variables to be linked, and therefore the numerical representation of stakeholder opinions. We assess whether BNs are a good method for modelling cultural services by building one collaboratively with canoeists to predict how the subjective concepts of fun and danger are impacted on by weir modification. The BN successfully captured the relationships between the variables, with model output being broadly consistent with verbal descriptions by the canoeists. There were however a number of discrepancies indicating imperfect knowledge capture. This is likely due to the structure of the network and the abstract and laborious nature of the probability elicitation stage. New techniques should be developed to increase the intuitiveness and efficiency of probability elicitation. The limitations we identified with BNs are avoided if their structure can be kept simple, and it is in such circumstances that BNs can offer a good method for modelling cultural ecosystem services.
Probabilistic Glycemic Control Decision Support In ICU: Proof Of Concept Using Bayesian Network
Glycemic control in intensive care patients is complex in terms of patients’ response to care and treatment. The variability and the search for improved insulin therapy outcomes have led to the use of human physiology models based on per-patient metabolic condition to provide personalized automated recommendations. One of the most promising solutions for this is the STAR protocol, which is based on a clinically validated insulin-nutrition-glucose physiological model. However, this approach does not consider demographic background such as age, weight, height, and ethnicity. This article presents an extension to the intensive care personalized solution that integrates per-patient demographic and admission information with intensive care conditions to automate decision support for clinical staff. In this context, a virtual study was conducted on retrospective data from 210 intensive care patients. The integration concept is presented in outline, and the details are given as a proof of concept using a Bayesian network linking the admission background and the performance of the STAR control. The proof of concept shows 71.43% and 73.90% overall inference precision and reliability, respectively, on the test dataset. With more data, an improved Bayesian network is believed to be achievable. These results nevertheless point to the feasibility of the network acting as an effective classifier using intensive care unit data, with glycemic control performance as the basis of probabilistic, personalized, and automated decision support in intensive care units.
A comparison between discrete and continuous time Bayesian networks in learning from clinical time series data with irregularity
Background Recently, mobile devices, such as smartphones, have been introduced into healthcare research to substitute paper diaries as data-collection tools in the home environment. Such devices support collecting patient data at different time points over a long period, resulting in clinical time-series data with high temporal complexity, such as time irregularities. Analysis of such time series poses new challenges for machine-learning techniques. The clinical context for the research discussed in this paper is home monitoring in chronic obstructive pulmonary disease (COPD). Objective The goal of the present research is to find out which properties of temporal Bayesian network models cope best with irregularly spaced multivariate clinical time-series data. Methods Two mainstream temporal Bayesian network models of multivariate clinical time series are studied: dynamic Bayesian networks, where the system is described as a snapshot at discrete time points, and continuous time Bayesian networks, where transitions between states are modeled in continuous time. Their capability of learning from clinical time series that vary in nature is extensively studied. In order to compare the two temporal Bayesian network types for regularly and irregularly spaced time-series data, three typical ways of observing time-series data were investigated: (1) regularly spaced in time with a fixed rate; (2) irregularly spaced and missing completely at random at discrete time points; (3) irregularly spaced and missing at random at discrete time points. In addition, similar experiments were carried out using real-world COPD patient data where observations are unevenly spaced. Results For regularly spaced time series, the dynamic Bayesian network models outperform the continuous time Bayesian networks. Similarly, if the data is missing completely at random, discrete-time models outperform continuous time models in most situations.
For more realistic settings where data is not missing completely at random, the situation is more complicated. In simulation experiments, both models perform similarly if there is strong prior knowledge available about the missing data distribution. Otherwise, continuous time Bayesian networks perform better. In experiments with unevenly spaced real-world data, we surprisingly found that a dynamic Bayesian network where time is ignored performs similarly to a continuous time Bayesian network. Conclusion The results confirm conventional wisdom that discrete-time Bayesian networks are appropriate when learning from regularly spaced clinical time series. Similarly, we found that for time series where the missingness occurs completely at random, dynamic Bayesian networks are an appropriate choice. However, for the complex clinical time-series data that motivated this research, the continuous-time models are at least competitive with, and sometimes better than, their discrete-time counterparts. Furthermore, continuous-time models provide the additional benefit of more fine-grained predictions than discrete-time models, which will be of practical relevance in clinical applications.
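The contrast between discrete and continuous time drawn above can be illustrated with a single two-state variable. This is a sketch under simplifying assumptions, not the models compared in the study: a continuous-time node with no parents behaves as a continuous-time Markov process, and for two states its transition probabilities over any interval have a closed form. The rates and observation gaps below are hypothetical.

```python
import math


def ctmc_transition(rate_01, rate_10, dt):
    """Transition matrix of a two-state continuous-time Markov process over an interval dt."""
    total = rate_01 + rate_10
    if total <= 0:
        raise ValueError("at least one rate must be positive")
    decay = math.exp(-total * dt)
    p01 = rate_01 / total * (1.0 - decay)  # probability of ending in state 1 from state 0
    p10 = rate_10 / total * (1.0 - decay)  # probability of ending in state 0 from state 1
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def propagate(belief, matrix):
    """Push a belief over the two states through one transition matrix."""
    return [
        belief[0] * matrix[0][0] + belief[1] * matrix[1][0],
        belief[0] * matrix[0][1] + belief[1] * matrix[1][1],
    ]


# Hypothetical rates (per day) and irregular gaps between observations (days).
RATE_01, RATE_10 = 0.2, 0.5
gaps = [0.5, 3.0, 1.25]

belief = [1.0, 0.0]  # start in state 0 with certainty
for dt in gaps:
    belief = propagate(belief, ctmc_transition(RATE_01, RATE_10, dt))

# A discrete-time model with a fixed one-day step corresponds to the dt = 1 matrix;
# without further approximation it only covers gaps that are whole multiples of that step.
one_day_step = ctmc_transition(RATE_01, RATE_10, 1.0)
```

Because the transition matrix is recomputed for each elapsed interval, unevenly spaced observations need no resampling or imputation onto a fixed grid, which is the practical appeal of the continuous-time formulation.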
Using the Bayesian Network to Map Large-Scale Cropping Intensity by Fusing Multi-Source Data
Global food demand will increase over the next few decades, and sustainable agricultural intensification on current cropland may be a preferred option to meet this demand. Mapping cropping intensity with remote sensing data is of great importance for agricultural production, food security, and agricultural sustainability in the context of global climate change. However, there are some challenges in large-scale cropping intensity mapping. First, existing indicators are too coarse, and fine indicators for measuring cropping intensity are lacking. Second, the regional, intra-class variations detected in time-series remote sensing data across vast areas represent environment-related clusters for each cropping intensity level. However, few existing studies have taken into account the intra-class variations caused by varied crop patterns, crop phenology, and geographical differentiation. In this research, we first presented a new definition, a normalized cropping intensity index (CII), to quantify cropping intensity precisely. We then proposed a Bayesian network model fusing prior knowledge (BNPK) to address the issue of intra-class variations when mapping CII over large areas. This method can fuse regional differentiation factors as prior knowledge into the model to reduce the uncertainty. Experiments on five sample areas covering the main grain-producing areas of mainland China proved the effectiveness of the model. Our research proposes a framework for obtaining a CII map with both a finer spatial resolution and a fine temporal resolution at a national scale.
An Object-Oriented Bayesian Framework for the Detection of Market Drivers
We use Object Oriented Bayesian Networks (OOBNs) to analyze complex ties in the equity market and to detect drivers for the Standard & Poor’s 500 (S&P 500) index. To this end, we consider a vast number of indicators drawn from various investment areas (Value, Growth, Sentiment, Momentum, and Technical Analysis), and, with the aid of OOBNs, we study the role they played over time in influencing the dynamics of the S&P 500. Our results highlight that the centrality of the indicators varies over time, and offer a starting point for further inquiries devoted to combining OOBNs with trading platforms.
Experiential avoidance and excessive smartphone use: a Bayesian approach
The smartphone is a common tool in our everyday lives. However, recent research suggests that using the smartphone has both positive and negative consequences. Although there is no agreement on the concept or the term to label it, researchers and clinical practitioners are worried about the negative consequences derived from excessive smartphone usage. This study aims to analyse the relationship between smartphone addiction and experiential avoidance. A sample of 1176 participants (828 women) with ages ranging from 16 to 82 (M = 30.97; SD = 12.05) was used. The SAS-SV scale was used to measure smartphone addiction and the AAQ-II to assess experiential avoidance. To model the relationship between variables, Bayesian inference and Bayesian networks were used. The results show that experiential avoidance and social networks usage are directly related to smartphone addiction. Additionally, the data suggests that sex is playing a mediating role in the observed relationship between these variables. These results are useful for understanding healthy and pathological interaction with smartphones and could be helpful in orienting or planning future psychological interventions to treat smartphone addiction.
Revealing the structure of the associations between housing system, facilities, management and welfare of commercial laying hens using Additive Bayesian Networks
After the ban of battery cages in 1988, a welfare control programme for laying hens was developed in Sweden. Its goal was to monitor and ensure that animal welfare was not negatively affected by the new housing systems. The present observational study provides an overview of the current welfare status of commercial layer flocks in Sweden and explores the complexity of welfare aspects by investigating and interpreting the inter-relationships between housing system, production type (i.e. organic or conventional), facilities, management and animal welfare indicators. For this purpose, a machine learning procedure referred to as structure discovery was applied to data collected through the welfare programme during 2010–2014 in 397 flocks housed in 193 different farms. Seventeen variables were fitted to an Additive Bayesian Network model. The optimal model was identified by an exhaustive search of the data iterated across incremental parent limits, accounting for prior knowledge about causality, potential over-dispersion and clustering. The resulting Directed Acyclic Graph shows the inter-relationships among the variables. The animal-based welfare indicators included in this study – flock mortality, feather condition and mite infestation – were indirectly associated with each other. Of these, severe mite infestations were rare (4% of inspected flocks) and mortality was below the acceptable threshold (< 0.6%). Feather condition scored unsatisfactory in 21% of the inspected flocks; however, it seemed to be associated only with the age of the flock, ruling out any direct connection with managerial and housing variables. The environment-based welfare indicators – lighting and air quality – were an issue in 5 and 8% of the flocks, respectively, and showed a complex inter-relationship with several managerial and housing variables, leaving room for several options for intervention.
Additive Bayesian Network modelling outlined graphically the underlying process that generated the observed data. In contrast to ordinary regression, it aimed at accounting for conditional independency among variables, facilitating causal interpretation.
First use of participatory Bayesian modeling to study habitat management at multiple scales for biological pest control
Habitat management is increasingly considered as a promising approach to favor the ecosystem service of biological control by enhancing natural enemies. However, habitat management, whether at local or landscape scale, remains very uncertain for farmers. Interactions between ecological processes and agricultural practices are indeed uncertain and site-specific, which makes implementation difficult. Thus, prospecting innovations based on habitat management may benefit from integrating local stakeholders and their knowledge. Our objective is to explore with both local and scientific stakeholders how they perceive agricultural practices, ecological processes, and services related to biological pest control and habitat management. We conducted a participatory Bayesian Network modeling approach with five stakeholders in Southwest France around apple orchard cultivation. We co-constructed such Bayesian Networks based on participants’ knowledge. We explored scenarios favoring natural enemies and habitat manipulation with each participant’s Bayesian Network. We compared how different stakeholders perceive the impact of each scenario on the biological control ecosystem service. Our results indicate that a landscape with a high proportion of semi-natural habitats does not translate into significant biological control for most participants even though some stakeholders perceive a significant impact on generalist predators’ activity within orchards. For these local stakeholders, habitat management at the orchard level such as inter-row vegetation seems currently more promising than at the landscape scale. Here, we show for the first time that the use of Bayesian modeling in a participatory manner can give precious insights into the most promising perspectives on habitat management at different scales. 
These different local perspectives suggest in particular that further dialogue between ecologists and local stakeholders should be sought about inter-row habitat management as the most promising practice to foster biological pest control and other ecosystem services.
A Regional Application of Bayesian Modeling for Coastal Erosion and Sand Nourishment Management
This paper presents an application of the Bayesian belief network for coastal erosion management at the regional scale. A “Bayesian ERosion Management Network” (BERM-N) is developed and trained based on yearly cross-shore profile data available along the Holland coast. Profiles collected for over 50 years and at 604 locations were combined with information on different sand nourishment types (i.e., beach, dune, and shoreface) and volumes implemented during the analyzed time period. The network was used to assess the effectiveness of nourishments in mitigating coastal erosion. The effectiveness of nourishments was verified using two coastal state indicators, namely the momentary coastline position and the dune foot position. The network shows how the current nourishment policy is effective in mitigating the past erosive trends. While the effect of beach nourishment was immediately visible after implementation, the effect of shoreface nourishment reached its maximum only 5–10 years after implementation of the nourishments. The network can also be used as a predictive tool to estimate the required nourishment volume in order to achieve a predefined coastal erosion management objective. The network is interactive and flexible and can be trained with any data type derived from measurements as well as numerical models.
A Bayesian network based learning system for modelling faults in large-scale manufacturing
Manufacturing companies can benefit from the early prediction and detection of failures to improve their product yield and reduce system faults through advanced data analytics. Whilst an abundance of data on their processing systems exists, they face difficulties in using it to gain insights to improve their systems. Bayesian networks (BNs) are considered here for diagnosing and predicting faults in a large manufacturing dataset from Bosch. Whilst BN structure learning has traditionally been performed on smaller datasets, this work demonstrates for the first time the ability to learn an appropriate BN structure for a large dataset with little information on the variables. This paper also demonstrates a new framework for creating an appropriate probabilistic model for the Bosch dataset through the selection of variables that are statistically important to the response; this is then used to create a BN which can be used to answer probabilistic queries and classify products based on changes in the sensor values in the production process.
Partial Least Squares Discriminant Analysis and Bayesian Networks for Metabolomic Prediction of Childhood Asthma
To explore novel methods for the analysis of metabolomics data, we compared the ability of Partial Least Squares Discriminant Analysis (PLS-DA) and Bayesian networks (BN) to build predictive plasma metabolite models of age three asthma status in 411 three year olds (n = 59 cases and 352 controls) from the Vitamin D Antenatal Asthma Reduction Trial (VDAART) study. The standard PLS-DA approach had impressive accuracy for the prediction of age three asthma with an Area Under the Curve Convex Hull (AUCCH) of 81%. However, a permutation test indicated the possibility of overfitting. In contrast, a predictive Bayesian network including 42 metabolites had a significantly higher AUCCH of 92.1% (p for difference < 0.001), with no evidence that this accuracy was due to overfitting. Both models provided biologically informative insights into asthma; in particular, a role for dysregulated arginine metabolism and several exogenous metabolites that deserve further investigation as potential causative agents. As the BN model outperformed the PLS-DA model in both accuracy and decreased risk of overfitting, it may therefore represent a viable alternative to typical analytical approaches for the investigation of metabolomics data.
Risk Assessment of Underground Subway Stations to Fire Disasters Using Bayesian Network
Subway station fires often have serious consequences because of the high density of people and limited number of exits in a relatively enclosed space. In this study, a comprehensive model based on Bayesian network (BN) and the Delphi method is established for the rapid and dynamic assessment of the fire evolution process, and consequences, in underground subway stations. Based on the case studies of typical subway station fire accidents, 28 BN nodes are proposed to represent the evolution process of subway station fires, from causes to consequences. Based on expert knowledge and consistency processing by the Delphi method, the conditional probabilities of child BN nodes are determined. The BN model can quantitatively evaluate the factors influencing fire causes, fire proof/intervention measures, and fire consequences. The results show that the framework, combined with Bayesian network and the Delphi method, is a reliable tool for dynamic assessment of subway station fires. This study could offer insights to a more realistic analysis for emergency decision-making on fire disaster reduction, since the proposed approach could take into account the conditional dependency in the fire propagation process and incorporate fire proof/intervention measures, which is helpful for resilience and sustainability promotion of underground facilities.
Modeling interrelationships between health behaviors in overweight breast cancer survivors: Applying Bayesian networks
Obesity and its impact on health is a multifaceted phenomenon encompassing many factors, including demographics, environment, lifestyle, and psychosocial functioning. A systems science approach, investigating these many influences, is needed to capture the complexity and multidimensionality of obesity prevention to improve health. Leveraging baseline data from a unique clinical cohort comprising 333 postmenopausal overweight or obese breast cancer survivors participating in a weight-loss trial, we applied Bayesian networks, a machine learning approach, to infer interrelationships between lifestyle factors (e.g., sleep, physical activity), body mass index (BMI), and health outcomes (biomarkers and self-reported quality of life metrics). We used bootstrap resampling to assess network stability and accuracy, and Bayesian information criteria (BIC) to compare networks. Our results identified important behavioral subnetworks. BMI was the primary pathway linking behavioral factors to glucose regulation and inflammatory markers; the BMI-biomarker link was reproduced in 100% of resampled networks. Sleep quality was a hub impacting mental quality of life and physical health with > 95% resampling reproducibility. Omission of the BMI or sleep links significantly degraded the fit of the networks. Our findings suggest potential mechanistic pathways and useful intervention targets for future trials. Using our models, we can make quantitative predictions about health impacts that would result from targeted, weight loss and/or sleep improvement interventions. Importantly, this work highlights the utility of Bayesian networks in health behaviors research.
Impact of drivers of change, including climatic factors, on the occurrence of chemical food safety hazards in fruits and vegetables: a Bayesian Network approach
The presence and development of many food safety risks are driven by factors within and outside the food supply chain, such as climate, economy and human behaviour. The interactions between these factors and the supply chain are complex and a system or holistic approach is needed to reveal cause-effect relationships and to be able to perform effective mitigation actions to minimise food safety risks. In this study, we demonstrate the potential of the Bayesian Network (BN) approach to identify and quantify the strength of relationships and interactions between the presence of food safety hazards as reported in the Rapid Alert System for Food and Feed (RASFF) for fruits and vegetables on the one hand, and climatic factors, economic and agronomic data on the other. To this end, all food safety notifications in RASFF (i.e. 3,781 notifications) on fruits and vegetables originating from India, Turkey and the Netherlands were collected for the period 2005-2015. In addition, climatic factors (e.g. temperature, precipitation), agricultural factors (e.g. pesticide use, fertilizer use) and economic factors (e.g. price, production volumes) were collected for the countries of origin of the product concurrent with the period of food safety notification in RASFF. A BN was constructed with 80% of the collected data using a machine-learning algorithm and optimised for each specific hazard category. The performance of the developed BN was determined in terms of accuracy of prediction of the hazard category in the evaluation set comprising 20% of the total data. The accuracy was high (95%) and the following factors contributed most: product category, notifying country, yearly production, number of notifications, maximal residue level (MRL) ratio, country of origin, and the annual agricultural budget of a country.
The assessment of the impact of interactions within the BN showed a significant interaction between the presence and level of a hazard as reported in RASFF and several drivers of change but at present, no definite conclusions can be drawn regarding the climatic factors and food safety hazards.
Modelling Electronic Trust Using Bayesian Networks
This paper discusses the importance of trust in the context of the digital economy. Even though electronic commerce continues to grow worldwide due to its many advantages, it has not been fully adopted yet. Some of the barriers to adopting e-commerce lie with potential customers who still perceive the online setting as quite risky. Customers who have concerns about the resilience of sellers’ IT infrastructure, and about the security and safety of their personal data, will hardly ever engage in e-transactions. The nature of trust is very subjective, complex and multi-faceted. Trust issues are present not only between buyers and sellers, but also between suppliers and sellers, in recommendations and references on certain products, etc. In this paper the authors propose modelling trust using Bayesian networks and provide an illustrative example which is typical of online transactions.
Probabilistic Age Classification with Bayesian Networks
In the past few decades, the rise of criminal, civil and asylum cases involving young people lacking valid identification documents has generated an increase in the demand for age estimation. The chronological age, or the probability that an individual is older or younger than a given age threshold, is generally estimated by means of statistical methods based on observations performed on specific physical attributes. Among these statistical methods, those developed in the Bayesian framework allow the user to provide coherent and transparent assignments which fulfill forensic and medico-legal purposes. The application of the Bayesian approach is facilitated by using probabilistic graphical tools, such as Bayesian networks. The aim of this work is to test the performance of the Bayesian network for age estimation recently presented in scientific literature in classifying individuals as older or younger than 18 years of age. For these exploratory analyses, a sample related to the ossification status of the medial clavicular epiphysis available in scientific literature was used. Results obtained in the classification are extremely promising: in the criminal context, the Bayesian network achieved, on average, a rate of correct classifications of approximately 97%, whilst in the civil context, the rate is, on average, close to 88%. These results encourage the continuation of the development and the testing of the method in order to support its practical application in casework.