What problem do Bayesian Networks aim to address?
Agricultural development practitioners face the unique challenge of making scientifically-based decisions where data is scarce, various factors are difficult to measure, multiple interdependent uncertainties exist, and many unique and sometimes conflicting stakeholder perspectives are equally legitimate and valuable. In order to objectively and accurately analyse such complex scenarios, new approaches are needed that can account for scant data, uncertainties, and a multitude of driving forces. Bayesian Networks are powerful modelling tools that accomplish just that.
How are Bayesian Networks useful in agricultural interventions?
Bayesian Networks (BNs) excel at predicting outcomes and identifying causes within systems that feature great uncertainty, nonlinearity, feedbacks between components, and data scarcity. They are capable of integrating data of various types (e.g., qualitative and quantitative) and from various sources (e.g., meta-analysis and primary data), estimating parameters with few data points, and uncovering previously hidden relationships. Importantly, their causal flexibility enables an inclusive participatory model development process and the capturing of stakeholders’ priorities, knowledge, and perspectives within the model itself. BNs also set the foundation for iterative improvement in the system by incorporating and making inferences with new data as it becomes available.
BNs are machine learning-based graphical structures that model information about causal and associational relations for the problem at hand under uncertain conditions. Parameters representing the strength of these relations are encoded based on data, statistics, and expert knowledge. Powerful algorithms then compute probabilistic inference based on the structure and parameters.
Structurally, a BN is a directed acyclic graph composed of nodes and arcs. The nodes represent variables, and the arcs represent relationships between those variables. Conditional independence between the nodes is encoded within the graph. For example, when two nodes are directly connected (as in A → B), they are called the parent and the child, respectively. The strength of each node’s relations to its parent nodes is defined by conditional probability distributions. Hence, the BN as a whole represents the joint probability distribution of its nodes.
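As a minimal sketch of how the conditional probability distributions combine into a joint distribution, consider a two-node network A → B. The probabilities below are illustrative assumptions, not values from any study, and the variable names are hypothetical.

```python
# Hypothetical two-node network A -> B (A is the parent, B is the child).
# All probabilities are illustrative assumptions.
from itertools import product

# Prior distribution of the parent node A.
p_a = {True: 0.2, False: 0.8}

# Conditional probability distribution of the child B given its parent A:
# p_b_given_a[a][b] = P(B = b | A = a)
p_b_given_a = {
    True: {True: 0.7, False: 0.3},
    False: {True: 0.1, False: 0.9},
}


def joint(a: bool, b: bool) -> float:
    """Joint probability from the graph factorization P(A, B) = P(A) * P(B | A)."""
    return p_a[a] * p_b_given_a[a][b]


# Enumerate the full joint distribution over both nodes.
joint_table = {(a, b): joint(a, b) for a, b in product([True, False], repeat=2)}

# The joint distribution sums to one, and marginals follow by summation.
total = sum(joint_table.values())
p_b_true = sum(p for (a, b), p in joint_table.items() if b)

print(f"Sum of joint probabilities: {total:.2f}")
print(f"P(B = True): {p_b_true:.2f}")
```

With more nodes the same pattern applies: the joint distribution is the product of each node's distribution conditional on its parents.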
Why Bayesian Networks instead of other simulation models, like Monte Carlo?
BNs that contain both discrete and continuous variables are called hybrid BNs. Early BN algorithms were unable to model continuous data. Such data had to be manually discretized, which led to inaccuracies. Introduction of the Dynamic Discretization algorithm has eliminated this limitation. It can compute hybrid BNs with virtually any statistical distribution and deterministic function, and has been successfully implemented in a wide variety of domains. As a result, BNs now offer important advantages over other modelling tools.
For the past 60 years, Monte Carlo (MC) simulation models have been the go-to choice for probabilistic risk assessment, and particularly for project risk analysis. They have recently gained popularity in evaluating agricultural development investments. For example, Wafula et al. (2018) used MC to evaluate investment options in Kenyan honey value chains, and Lanzanova et al. (2019) used MC to prioritize reservoir protection investments in Burkina Faso.
MC models are spreadsheet-based and very easy to implement. They work by repeatedly generating samples for the random variables in the model and statistically analysing those samples. The main limitation of this approach is that the probability distributions of conditional relationships between variables cannot be revised within backward inferences. Rather, the structure and parameters of the model must be altered in order to make such inferences. Consequently, MC simulation models are often computed using intermediate variables. Another limitation of MC is that, although the modelling assumptions are encoded in the spreadsheet, interpretation of these assumptions in large models is often difficult.
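The sampling procedure can be sketched in a few lines of Python rather than a spreadsheet. This is a minimal illustration under assumed inputs: the project, its distributions, and all numbers are hypothetical and do not come from the studies cited above.

```python
# Minimal Monte Carlo sketch for a hypothetical project.
# Distributions and parameters are illustrative assumptions.
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable
n_samples = 10_000

net_benefits = []
for _ in range(n_samples):
    # Draw one sample for each random variable in the model.
    benefit = rng.gauss(120.0, 30.0)   # assumed benefit: Normal(mean=120, sd=30)
    cost = rng.uniform(80.0, 110.0)    # assumed cost: Uniform(80, 110)
    net_benefits.append(benefit - cost)

# Statistical analysis of the generated samples.
mean_net = statistics.mean(net_benefits)
sd_net = statistics.stdev(net_benefits)
p_loss = sum(1 for x in net_benefits if x < 0) / n_samples

print(f"Mean net benefit: {mean_net:.1f}")
print(f"Std. dev. of net benefit: {sd_net:.1f}")
print(f"Estimated probability of a loss: {p_loss:.2f}")
```

Note that the simulation only runs forward, from the input distributions to the outcome.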
Bayesian Networks overcome these limitations. BNs use algorithms that compute backward inference without needing to change the model structure and parameters, and the graphical structure of BN models explicitly shows modelling assumptions, thus facilitating interpretation.
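To illustrate what backward inference means, here is a minimal sketch assuming a hypothetical network ExtremeWeather → ProjectFailure with made-up probabilities. It uses plain Bayes' rule, not the algorithm of any particular BN tool, and shows that the same structure and parameters answer both a forward query (cause to effect) and a backward query (effect to cause).

```python
# Hypothetical network: ExtremeWeather -> ProjectFailure.
# All probabilities are illustrative assumptions.

p_weather = 0.2  # P(ExtremeWeather = True)

# P(ProjectFailure = True | ExtremeWeather = w)
p_failure_given_weather = {True: 0.7, False: 0.1}


def forward_query() -> float:
    """Forward inference: P(ProjectFailure = True), marginalizing over the cause."""
    return (
        p_weather * p_failure_given_weather[True]
        + (1.0 - p_weather) * p_failure_given_weather[False]
    )


def backward_query() -> float:
    """Backward inference via Bayes' rule: P(ExtremeWeather = True | ProjectFailure = True)."""
    return p_weather * p_failure_given_weather[True] / forward_query()


p_failure = forward_query()
p_weather_given_failure = backward_query()

print(f"P(failure): {p_failure:.2f}")
print(f"P(extreme weather | failure observed): {p_weather_given_failure:.3f}")
```

Observing the failure raises the probability of extreme weather from the prior of 0.2, and neither the structure nor the parameters had to change to obtain that result.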
What are typical applications of Bayesian Networks?
BNs have been applied to a wide variety of real-world problems in fields ranging from biomedicine to petrophysics. Within the agricultural domain, BNs have been used to analyse crop management, yield prediction, greenhouse gas emissions, farm profitability, and policy evaluation. For example, Raggi et al. (2010) used BNs to predict whether farmers in rural northern Italy intended to exit the agricultural sector as a result of new policies, and Richards et al. (2013) used BNs to develop and evaluate strategies for adapting to climate change. Whitney et al. (2018) used BNs to evaluate the impact of Uganda’s agricultural development policies on nutrition. Agena Ltd’s AgenaRisk model is based on BNs.
How has P4S used Bayesian Networks?
Despite their advantages in agricultural development scenarios, the use of BNs in decision making for agricultural development has been limited. In particular, BNs have not been widely implemented for evaluating agricultural investment options. We believe there is significant opportunity for using BNs to objectively analyse agricultural investment choices. As such, Partnerships for Scaling has developed an innovative methodology for using BNs in planning and comparing climate-smart agriculture investments.
Our model offers a data-driven solution for data-scarce problems, explicit incorporation of risk, and an inclusive participatory process. It systematically guides the practitioner in defining impact, cost, and risk factors (such as extreme weather, socio-politics, logistics, and pests) of a project using established financial metrics, such as Return on Investment, as well as the knowledge of domain experts. The causal pathways are developed via an inclusive participatory stakeholder process, creating a level of ownership and control that is unique among modelling approaches. Expected return and uncertainty of project outcomes are then computed under different risk scenarios. The BN evaluates the effect of each risk scenario on project adoption, costs, and impact value for each year of the project. What-if analyses can also be conducted.
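The idea of computing expected return and uncertainty across risk scenarios can be sketched as follows. This is a simplified illustration, not the P4S model: the scenario names, probabilities, benefits, and costs are all hypothetical, and Return on Investment is taken as (benefit - cost) / cost.

```python
# Simplified sketch of expected Return on Investment across risk scenarios.
# Scenario names, probabilities, benefits, and costs are hypothetical.
import math

# Each scenario: (probability, benefit, cost)
scenarios = {
    "no_shock": (0.6, 150.0, 100.0),
    "drought": (0.3, 90.0, 100.0),
    "pest_outbreak": (0.1, 60.0, 110.0),
}


def roi(benefit: float, cost: float) -> float:
    """Return on Investment as a fraction of cost."""
    return (benefit - cost) / cost


# Scenario probabilities must form a valid distribution.
total_probability = sum(p for p, _, _ in scenarios.values())

# Expected ROI is the probability-weighted average over scenarios.
expected_roi = sum(p * roi(b, c) for p, b, c in scenarios.values())

# Uncertainty expressed as the standard deviation of ROI across scenarios.
variance = sum(p * (roi(b, c) - expected_roi) ** 2 for p, b, c in scenarios.values())
sd_roi = math.sqrt(variance)

for name, (p, b, c) in scenarios.items():
    print(f"{name}: probability {p:.1f}, ROI {roi(b, c):+.2f}")
print(f"Expected ROI: {expected_roi:+.3f}")
print(f"Std. dev. of ROI: {sd_roi:.3f}")
```

A what-if analysis in this sketch amounts to changing a scenario's probability or its inputs and recomputing the two summary figures.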
We have illustrated the power of this approach through planning large-scale agricultural investments under climate change for African countries. We piloted the model by prioritizing investments in Tanzania’s Agriculture and Food Security Investment Plan (TAFSIP) with government, research, and civil society actors. The TAFSIP BN model was built on a one-of-a-kind database compiled from nearly 1,500 agricultural studies on 50 indicators of productivity (e.g., net returns, resource use efficiency), resilience (e.g., soil health, labor), and sustainability (e.g., carbon dioxide emissions). For factors that are hard to measure or for which data is not typically available (e.g., gender and youth, adoption rates), we derived ranges of potential values from stakeholder and expert opinions. Progress and impacts were conditioned on the frequency and severity of financial, climatic, logistical, and political risks, offering improved cost-benefit and what-if analyses.
Photo credit: C Schubert.