Review
A methodology for performing global uncertainty and sensitivity analysis in systems biology
Introduction
Systems biology is the study of the interactions between the components of a biological system, and how these interactions give rise to the function and behavior of the system as a whole. The systems biology approach often involves the development of mathematical or computer models, based on reconstruction of a dynamic biological system from the quantitative properties of its elementary building blocks. Building mathematical and computational models is necessary to help decipher the massive amount of data experimentalists are uncovering today. The goal of the systems biologist or modeler is to represent, abstract, and ultimately understand the biological world using these mathematical and computational tools. Experimental data that are available for each system should guide, support, and shape the model building process. This can be a daunting task, especially when the components of a system form a very complex and intricate network.
Paraphrasing Albert Einstein, models should be as simple as possible, but not simpler. A parsimonious approach must be followed. Otherwise, if every mechanism and interaction is included, the resulting mathematical model will be composed of a large number of variables, parameters, and constraints, most of them uncertain because they are difficult to measure experimentally, or are even completely unknown in many cases. Even when a parsimonious approach is followed during model building, available knowledge of phenomena is often incomplete, and experimental measures are lacking, ambiguous, or contradictory. The question of how to address uncertainties therefore arises naturally as part of the process. Uncertainty and sensitivity (US) analysis techniques help to assess and control these uncertainties.
Uncertainty analysis (UA) is performed to investigate the uncertainty in the model output that is generated from uncertainty in parameter inputs. Sensitivity analysis (SA) naturally follows UA as it assesses how variations in model outputs can be apportioned, qualitatively or quantitatively, to different input sources (Saltelli et al., 2000). In this work we review US analysis techniques in the context of deterministic dynamical models in biology, and propose a novel procedure to deal with a particular stochastic, discrete type of dynamical model (i.e., an agent-based model, or ABM).
By deterministic model, we mean that the output of the model is completely determined by the input parameters and structure of the model. The same input will produce the same output no matter how many times the model is simulated. Therefore, the only uncertainty affecting the output is generated by input variation. This type of uncertainty is termed epistemic (or subjective, reducible, type B uncertainty; see Helton et al., 2006). Epistemic uncertainty derives from a lack of knowledge about the adequate value for a parameter/input/quantity that is assumed to be constant throughout model analysis. In contrast, a stochastic model will not produce the same output when repeated with the same inputs because of inherent randomness in the behavior of the system. This type of uncertainty is termed aleatory (or stochastic, irreducible, type A; see Helton et al., 2006). This distinction has been and still is an area of interest and study in the engineering and risk assessment community (see Apostolakis, 1990; Helton, 1997; Helton et al., 2007; Parry and Winter, 1981; Paté-Cornell, 1996).
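The distinction between epistemic and aleatory uncertainty can be illustrated with a minimal Python sketch. The two toy growth models, their parameter values, and the noise level are assumptions made purely for illustration; they do not come from the paper.

```python
import random


def deterministic_model(rate, steps=10, x0=1.0):
    """Discrete exponential growth: the output is fixed by the inputs."""
    x = x0
    for _ in range(steps):
        x *= 1.0 + rate
    return x


def stochastic_model(rate, steps=10, x0=1.0, rng=None):
    """Same growth rule, but every step is perturbed by random noise."""
    rng = rng or random.Random()
    x = x0
    for _ in range(steps):
        x *= 1.0 + rate + rng.gauss(0.0, 0.05)  # hypothetical noise level
    return x


# Epistemic uncertainty: the true rate is unknown, so the input is varied.
candidate_rates = [0.05, 0.10, 0.15]
epistemic_outputs = [deterministic_model(r) for r in candidate_rates]

# Aleatory uncertainty: the input is fixed, yet repeated runs differ.
rng = random.Random(0)
aleatory_outputs = [stochastic_model(0.10, rng=rng) for _ in range(5)]

print(epistemic_outputs)
print(aleatory_outputs)
```

In the deterministic case, all spread in the output comes from the spread in `candidate_rates`; in the stochastic case, spread remains even when the rate is held at a single value.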
Many techniques have been developed to address US analysis: differential analysis, response surface methodology, Monte Carlo (MC) analysis, and variance decomposition methods. See Iman and Helton (1988) and Saltelli et al. (2000) for details on each of these approaches and Cacuci and Ionescu-Bujor (2004), Draper (1995), Helton (1993) and Saltelli et al. (2005) for more general reviews on US analysis. Here we briefly illustrate the most popular, reliable, and efficient UA techniques and SA indexes. In Section 2, we describe two UA techniques: an MC approach and Latin hypercube sampling (LHS). In Section 3, we describe two SA indexes: partial rank correlation coefficient (PRCC) and extended Fourier amplitude sensitivity test (eFAST). PRCC is a sampling-based method, while eFAST is a variance-based method. In Section 4, we perform US analysis on both new and familiar deterministic dynamical models (quantifying epistemic uncertainty) from epidemiology and immunology, and discuss results. Section 5 presents an ABM, where we suggest a method to deal with the aleatory uncertainty that results from the stochasticity embedded in the model structure, to facilitate the use of PRCC and eFAST techniques. We use Matlab (Copyright 1984–2006 The MathWorks, Inc., Version 7.3.0.298 R2006b) to solve all the differential equation systems of Section 4 and to implement most of the US analysis functions described throughout the manuscript (available on our website, http://malthus.micro.med.umich.edu/lab/usanalysis.html).
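As a rough illustration of the LHS idea named above, the following Python sketch stratifies each parameter range into equal-probability intervals, draws once from each interval, and pairs the draws at random across parameters. It assumes independent, uniformly distributed parameters; the function name `latin_hypercube` and the example bounds are hypothetical and are not taken from the authors' Matlab functions.

```python
import random


def latin_hypercube(n_samples, bounds, rng=None):
    """Draw n_samples points; bounds is a list of (low, high), one per parameter.

    Assumes each parameter is uniform on its interval.
    """
    rng = rng or random.Random()
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of n_samples equal-width strata.
        column = [
            low + (high - low) * (i + rng.random()) / n_samples
            for i in range(n_samples)
        ]
        # Shuffle so strata are paired at random across parameters.
        rng.shuffle(column)
        columns.append(column)
    # Transpose: one row per sample point, one entry per parameter.
    return [list(point) for point in zip(*columns)]


samples = latin_hypercube(5, [(0.0, 1.0), (10.0, 20.0)], rng=random.Random(1))
for point in samples:
    print(point)
```

Unlike plain MC sampling, every stratum of every parameter is visited exactly once, so the full range of each input is covered even with few model runs.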
Uncertainty analysis
Input factors for most mathematical models consist of parameters and initial conditions for independent and dependent model variables. As mentioned, these are not always known with a sufficient degree of certainty because of natural variation, error in measurements, or simply a lack of current techniques to measure them. The purpose of UA is to quantify the degree of confidence in the existing experimental data and parameter estimates. In this section we describe the most popular sampling-based
Sensitivity analysis
SA is a method for quantifying uncertainty in any type of complex model. The objective of SA is to identify critical inputs (parameters and initial conditions) of a model and to quantify how input uncertainty impacts model outcome(s). When input factors such as parameters or initial conditions are known with little uncertainty, we can examine the partial derivative of the output function with respect to the input factors. This sensitivity measure can easily be computed numerically by performing
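One standard way to approximate such a partial derivative numerically is a central finite difference around a baseline parameter set. The Python sketch below shows that idea only; whether it matches the procedure the authors go on to describe is an assumption, and `toy_output` is a hypothetical stand-in for a model output.

```python
def local_sensitivity(model, params, index, rel_step=1e-4):
    """Central-difference estimate of d(output)/d(params[index])."""
    base = params[index]
    # Scale the step to the parameter; fall back to rel_step if base is zero.
    h = abs(base) * rel_step or rel_step
    up = list(params)
    up[index] = base + h
    down = list(params)
    down[index] = base - h
    return (model(up) - model(down)) / (2.0 * h)


def toy_output(p):
    """Hypothetical model output: a^2 + 3b."""
    a, b = p
    return a * a + 3.0 * b


baseline = [2.0, 1.0]
s_a = local_sensitivity(toy_output, baseline, 0)  # analytic value: 2a = 4
s_b = local_sensitivity(toy_output, baseline, 1)  # analytic value: 3
print(s_a, s_b)
```

This is a local measure: it describes the output only near `baseline`, which is why it suits inputs known with little uncertainty.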
Uncertainty and sensitivity analysis examples
Since the relationship (including monotonicity) between parameters and outputs is not typically known a priori, in principle using both PRCC and eFAST methods is ideal. The drawback is that issues related to accuracy of results and computational costs may arise. To illustrate the differences between these methods, we implement both PRCC and eFAST and compare the results for different types of mathematical models in biology: three different ODE systems (Lotka–Volterra, cell population
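To make the PRCC index more concrete, here is a minimal pure-Python sketch. It rank-transforms the sampled inputs and the output, then reads partial correlations off the inverse of the rank correlation matrix, which is mathematically equivalent to correlating regression residuals. The toy output function, sample size, and all function names are assumptions for illustration; this is not the authors' Matlab implementation, and ties in the ranks are not handled.

```python
import random


def ranks(values):
    """Rank-transform to 1..n (no tie handling; fine for continuous samples)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = float(rank)
    return r


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def prcc(inputs, output):
    """inputs: list of columns (one per parameter); output: model outputs."""
    cols = [ranks(c) for c in inputs] + [ranks(output)]
    k = len(cols)
    corr = [[pearson(cols[i], cols[j]) for j in range(k)] for i in range(k)]
    prec = invert(corr)
    y = k - 1  # index of the output column
    # Partial correlation of parameter j with the output, given the others.
    return [-prec[j][y] / (prec[j][j] * prec[y][y]) ** 0.5 for j in range(y)]


# Hypothetical monotonic toy model: y = a^3 - b, with c having no effect.
rng = random.Random(42)
n = 200
a = [rng.random() for _ in range(n)]
b = [rng.random() for _ in range(n)]
c = [rng.random() for _ in range(n)]
y = [ai ** 3 - bi for ai, bi in zip(a, b)]

result = prcc([a, b, c], y)
print(result)
```

In this toy case the index for `a` should be strongly positive, the index for `b` strongly negative, and the index for `c` near zero. The rank transform is what lets PRCC handle nonlinear relationships, provided they are monotonic.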
Uncertainty and sensitivity analysis in agent-based models
ABMs (also called “individual-based models”) are a formalism evolved from early research in cellular automata and artificial life. The defining feature of ABMs is that elements of the system are represented as discrete agents that move and interact according to defined rules, in an explicitly defined spatial environment. Stochasticity enters the model as some decision-making rules can be based on random chance, such as a random walk movement of cells. In an ABM, the individual, possibly
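The role of stochasticity in an ABM can be sketched with a toy random-walk model in Python. One common way to separate aleatory from epistemic variation, assumed here and not confirmed by this excerpt as the authors' procedure, is to run several replicates at the same parameter values and summarize them before computing sensitivity indexes. The model, its parameters, and the function names are hypothetical, and the agents in this sketch do not interact.

```python
import random


def random_walk_abm(n_agents, n_steps, move_prob, rng):
    """Toy ABM: agents on a 2D lattice step to a random neighbor site.

    Returns the mean squared displacement from the origin.
    """
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    total = 0
    for _ in range(n_agents):
        x = y = 0
        for _ in range(n_steps):
            if rng.random() < move_prob:  # stochastic decision rule
                dx, dy = rng.choice(moves)
                x += dx
                y += dy
        total += x * x + y * y
    return total / n_agents


def replicate_summary(n_replicates, seed, **params):
    """Run the model repeatedly with identical parameters and summarize."""
    rng = random.Random(seed)
    runs = [random_walk_abm(rng=rng, **params) for _ in range(n_replicates)]
    mean = sum(runs) / len(runs)
    variance = sum((r - mean) ** 2 for r in runs) / (len(runs) - 1)
    return mean, variance, runs


mean, variance, runs = replicate_summary(
    20, seed=7, n_agents=50, n_steps=100, move_prob=0.5
)
print(mean, variance)
```

The replicate variance estimates the aleatory contribution at this parameter setting, while the replicate mean gives a more stable output value to pair with each sampled parameter set.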
Discussion and conclusion
Uncertainty and sensitivity (US) analyses offer a way to assess the adequacy of models and establish what factors affect model outputs. We reviewed and compared two specific types of global sensitivity analysis (SA) indexes that have proven to be among the most reliable and efficient, namely a sampling-based method (partial rank correlation coefficient—PRCC) and a variance-based method (extended Fourier amplitude sensitivity test—eFAST). All functions used throughout the paper are available on
Acknowledgments
This work was supported by NIH Grants HL68526, LM 009027, and DAIT-BAA-05-10. We are grateful to Jennifer Linderman, Rick Riolo, and to the members of the Kirschner laboratory for helpful discussions. Also, comments by two anonymous reviewers were invaluable in making this manuscript as comprehensive and clear as possible.
References (59)
- et al. Use of probabilistic expert judgment in uncertainty analysis of carcinogenic potency. Regul. Toxicol. Pharmacol. (1994)
- Uncertainty and sensitivity analysis techniques for use in performance assessment for radioactive-waste disposal. Reliab. Eng. Syst. Saf. (1993)
- Uncertainty and sensitivity analysis in performance assessment for the waste isolation pilot plant. Comput. Phys. Commun. (1999)
- et al. Calculation of reactor accident safety goals. Reliab. Eng. Syst. Saf. (1993)
- et al. Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliab. Eng. Syst. Saf. (2003)
- et al. Uncertainty and sensitivity analysis of early exposure results with the Maccs reactor accident consequence model. Reliab. Eng. Syst. Saf. (1995)
- et al. Characterization of subjective uncertainty in the 1996 performance assessment for the waste isolation pilot plant. Reliab. Eng. Syst. Saf. (2000)
- et al. Survey of sampling-based methods for uncertainty and sensitivity analysis. Reliab. Eng. Syst. Saf. (2006)
- et al. A sampling-based computational strategy for the representation of epistemic uncertainty in model predictions with evidence theory. Comput. Methods Appl. Mech. Eng. (2007)
- et al. A distribution-free test for the relationship between model input and output when using Latin hypercube sampling. Reliab. Eng. Syst. Saf. (2003)