Harvard Reports in Physical/Interdisciplinary Ocean Science
#67
A generalized prognostic model
of marine biogeochemicalecosystem dynamics: Structure, parameterization and adaptive modeling
May 2004
Rucheng C. Tian, Pierre F.J. Lermusiaux,
James J. McCarthy and Allan R. Robinson
Harvard University
Division of Engineering and Applied Sciences
Department of Earth and Planetary Sciences
29 Oxford Street, Cambridge MA 02138
Capturing Uncertainty in the Common Tactical Environmental Picture
4 March 2004
Edited by A.R. Robinson
Contents
Abstract

Introduction

Review of biogeochemicalecosystem modeling up to date

Structure

Parameterization

Phytoplankton

Light forcing on phytoplankton growth rate

Temperature forcing on phytoplankton growth rate

Nutrient limitation on phytoplankton growth rate

DOM exudation and mortality of phytoplankton

Zooplankton

Grazing functions

Temperature forcing on zooplankton growth

Zooplankton mortality, excretion and respiration

Zooplankton diel vertical migration

Bacteria

Dissolved organic matter (DOM)

Biogenic detritus

Nutrients

Auxiliary state variables
3.7.1. Chlorophyll a
3.7.2. Bioluminescence

Adaptive modeling and realtime ecosystem forecasting

Conclusions
References
ABSRACT
In this report, we present a generalized and flexible prognostic model for marine biogeochemicalecosystem processes. This generalized model currently focuses on pelagic dynamics and consists of 7 functional groups of state variables: nutrients, phytoplankton, zooplankton, detritus, dissolved organic matter, bacteria and auxiliary state variables. The number of components in each functional group is flexible varying from 1 to n and their corresponding biological, chemical or detrital features are defined by users. Primarily, categories of dissolved organic matter are classified by their bioavailability, bacteria by their ability to respond to environmental variations, nutrients by their chemical composition, and phytoplankton, zooplankton and detritus by their size. A different classification can be used, however, such as species or stages of zooplankton. Auxiliary state variables represent oceanic properties that can depend directly on other biological features and are thus not always independent prognostic components in food web structure and trophic dynamics. Such auxiliary variables can include for example chlorophyll, bioluminescence, dissolved oxygen, CO_{2} and phycotoxins. The model was constructed so as to allow easy identification and adaptation, i.e., the state variables, model structures and parameter values can be changed in response to field measurements, ecosystem functions and different scientific objectives. The model is currently coupled with the Harvard Ocean Prediction System (HOPS) for realtime ocean forecasting.
1. INTRODUCTION
Marine ecosystems are usually represented by compartment models, each compartment representing a trophic level or taxonomic group such as phytoplankton, zooplankton or nutrient. In the real ocean, however, there are organisms of unnumbered species, at various stages and of different weights at each trophic level. The structural design of an ecological model is thus critical in determining its adequacy and capability to approach real ecosystems.
Since the first marine ecological model of phytoplankton (Riley, 1946), the variety of ecological dynamics being modeled has widen and there has been a trend towards increasing complexity in model structure. For example, Andersen et al. (1987) found it necessary to partition phytoplankton into diatoms and flagellates, and zooplankton into copepods and appendicularia. Moloney and Field (1991) partitioned the autotrophic level into pico, nano and netphytoplankton, and the heterotrophic trophic level into bacteria, heterotrophic flagellates, microzooplankton and mesozooplankton. Moisan and Hofmann (1996) simulated gelatinous zooplankton, copepods and euphausiids whereas Pace et al. (1984) considered bacteria, protozoa, grazing zooplankton, carnivorous zooplankton, mucous net feeders and pelagic fishes at the heterotrophic level. Tian et al. (2000) coupled the microbial food web with the traditional mesoplankton food chain. Wroblewski (1982) distinguished five life stages of copepods: eggs, nauplii, early copepodites (CIIII), copepodites (CIVCV) and adult. Armstrong (1994) proposed a model with n parallel food chains, each consisting of a phytoplankton species P_{i} and its dedicated zooplankton predator Zi classified by size. Nihoul and Djenidi (1998) presented a conceptual food web model in which phytoplankton are divided into cynobacteria, ultraphytoflagellates, phytoflagellates, diatoms and autotrophic dinoflagellates, and zooplankton are divided into heterotrophic flagellates, ciliates, heterotrophic dinoflagellates, arthropods and gelatinous zooplankton. Although many of the biological models are nitrogen driven, Lancelot et al. (2000) considered five types of nutrients (Si, PO_{4}^{}, NO_{3}^{}, NH_{4}^{+} and Fe). Cousins (1980) suggested a conceptual trophic continuum model in which an ecosystem is divided into three basic components (autotrophs, heterotrophs and detritus), with each component representing a size continuum from small to large organisms or particles. Armstrong (1999) argued that biological models of pelagic ecosystems should reflect both the taxonomic and size structure of the plankton community. Considering all of these advances as a whole, the main modeling challenge relates to the complexity of the physiological, trophic and ecological dynamics. This complexity prohibits the development of a single model with fixed structure that can be applied to various oceanic ecosystems and be used for various scientific objectives. Ideally, generalized and flexible models can be more efficient and useful. Developing such a model is the main objective of the present research.
Another challenge relates to the lack of a common set of equations that represent the source and sink terms of the biological state variables (e.g., Robinson and Lermusiaux, 2002). There are currently no standard parameterizations in marine ecological modeling. A wide variety of mathematical formulations have been developed to describe fundamental biological processes and forcing functions, such as those of light, nutrient and temperature on phytoplankton growth rate and on zooplankton grazing and predation. These formulations are usually based on empirical relationships, for example expressing a correlation with measurable variables. The real ecological or physiological processes underlying the observed correlation are not often explicit in these relationships. Sound statistical or physiological bases to reject one or another parameterization are rare (Sakshaug et al., 1997), but the choice among them can be critical with respect to the model functionality (Franks et al., 1986; Smith et al., 1989; Steele and Hendersen, 1992; Gao et al., 2000; Gentleman et al., 2003). Although we can infinitely divide an ecosystem model, our knowledge of biological dynamics and rates is limited, but evolving. Ecological modeling largely relies on field data to validate simulations. An ideal and intelligent modeling system should thus learn from data and model misfits, and quantitatively identify the most adequate formulations. The present model development is simply a prerequisite to such automated system identification (e.g. Duda et al, 2000).
Importantly, oceanic states evolve and go through transitions. In the coastal environment, properties can be highly variable and intermittent on multiple scales. The biogeochemicalecosystem variables, processes and interactions that matter vary with time and space. For efficient forecasting, prognostic models should thus have the same behavior and adapt to the everchanging dynamics (Lermusiaux et al., 2004). This is especially important for marine ecosystems. With adaptive modeling, model switches and improvements are dynamic and aim to continuously select the best model structures and parameters among different physical or biogeochemical parameterizations. It is a system identification that is dynamic and sustained in time and space.
In summary, the properties of the modeling system that cope with various ecosystems, multiple scientific objectives, unknown equations and everchanging dynamics are generality, flexibility, identification and adaptation, respectively. This report is mostly on generality and flexibility of the biogeochemicalecosystem model. System identification and adaptive modeling, and the corresponding automations, are only touched upon. Of course, even though the model formulation and its computational structure aim to be general, to initiate the functional implementations and survey the relevant literature, some focus is necessary. First, the oceanic region for which the model is being developed is the pelagic (littoral, neritic and openoceanic) midlatitude ocean, including coastaldeep sea interactions. Benthic processes and some specifics of low and high latitudes are for example not yet explicitly implemented (even though the new model structure could accommodate the corresponding variables and functionalities). Second, the scales considered are from hundreds of meters to thousands of kilometers, and hours to decades. The Eulerian formulation is thus preferred to the Lagrangian formulation and to the Individual Based Models (IBMs, e.g., DeAngelis and Gross, 1992) and ordinary population dynamics formalisms. All prognostic equations are of the partial differential, advectivediffusivereactive type.
Our generalized model has been scientifically constructed based on a study of all possible functional groups and parameterizations relevant for midlatitude coastal ecosystems such as Massachusetts Bay and Monterey Bay. A general set of state variables and of mathematical structures representing the interactions of these variables was selected, based on importance, completeness, efficiency and accuracy. This led to a generalized model with the following functional groups of state variables: nutrients, phytoplankton, zooplankton, detritus, dissolved organic matter, bacteria and auxiliary variables. Firstly, the generalized model is flexible on three levels: (i) the functional groups that are relevant can be determined and selected; (ii) within each functional group, the number of state variables is not fixed but is defined by the user; and (iii) the interactions among state variables within and across functional groups are also variable. Secondly, the generalized model is modular: one can select the model structures and parameterizations that are most adequate for the application of interest. In some cases, changes at each trophic level can also result in automatic changes in specific parameterizations. With this flexibility, the generalized biogeochemical model will be able to adapt to different ecosystems, scientific objectives and availability of field measurements.
Such a generalized model has several advantages. It can simulate various ecosystems by using a subset of its generalized state variables. It can adapt to the changing ecosystem dynamics by changing its state variables, structures and parameters. Importantly, it provides an efficient numerical tool for realtime ecosystem forecasting. This is because one can downsize the model configuration to the simplest one that is sufficiently accurate to achieve the specific forecasting goals. In general, we expect the utilization of the generalized model for a given application to proceed as follows. First, an a priori model, i.e. the state variables, their interactions, the parameterizations and the parameter values, is chosen. As new data are collected, as events occur, as the dynamics changes or as the scientific objectives are modified, the configuration and specificities of the a priori model are updated, tuned and evolved. The result of this identification and adaptation is a fitted a posteriori model. This process can be manual or automated. The latter case is an advanced application of data assimilation (Robinson and Lermusiaux, 2002; Gregoire et al, 2003). By data assimilation, (non)observed state variables and parameters are classically constrained or estimated from the observed data, respectively by field estimation and parameter estimation. Similarly, to obtain the a posteriori model, specific parameterizations and model structures are selected among a set of possible choices, based on quantitative criteria that are a function of modeldata misfits. Importantly, just as some variables or parameters cannot always be estimated unambiguously by data assimilation (e.g. Matear, 1995; Vallino, 2000), some parameterizations and model structures may appear equivalent to the available data (referred to as equifinality, Duda et al, 2000). The complete automation of the adaptation requires substantial research.
In what follows, biogeochemicalecosystem modeling efforts up to date are first briefly reviewed (Sect. 2). The dynamical structures, equations and parameterizations of our generalized prognostic biogeochemicalecosystem model are then defined and described (Sects. 3 and 4). The presentation aims to be comprehensive conceptually and mathematically, but the technical details of the computations and software are not discussed. Finally, the utilization of the generalized and flexible model in an adaptive datadriven modeling mode is briefly discussed (Sect. 5). Our applications of the generalized model and the research towards the automation of model identification and adaptation are not reported.
2. REVIEW OF BIOGEOCHEMICALECOSYSTEM MODELS UP TO DATE
Hofmann and Lascara (1998) and Hofmann and Friedrichs (2002) recently conducted overall reviews of interdisciplinary modeling for marine ecosystems. They provide a detailed description of the progress in marine ecological modeling. Riley (1946, 1947a,b) first used simple mathematical equations to simulate observed seasonal variations in phytoplankton and zooplankton abundance. Parameterizations of biological dynamics were not developed at that time and mathematical formulations were based on linear assumptions that describe a single state variable. Following his earlier efforts, Riley (1949) developed a trophic dynamics model that included nutrient, phytoplankton and zooplankton, i.e., the socalled NPZ model. This model structure has been widely used since then, as NPZ only (Steele, 1974; Evans and Parslow, 1985; Franks et al., 1986) or as NPZ and a detritus pool (e.g., Ishizaka 1990; Nihoul et al, 1993; McGillicuddy et al., 1995a, b). Riley’s earlier efforts have triggered advances in the parameterization of biological dynamics such as the relationships between photosynthesis and light (e.g., Steele, 1962; Vollenweider, 1965; Webb, 1974; Parker, 1974; Bannister, 1979) and nutrient uptake kinetics (Caperon, 1967; Dugdale 1967, Droop, 1968; 1983; Platt et al., 1977; Goldman and McCarthy, 1978; McCarthy, 1981).
About a quarter century later, Walsh (1975) developed an ecological model applied to the Peru upwelling ecosystem. His model included nitrate, phosphate, silicate, phytoplankton, zooplankton and planktonic fish (anchoveta) and detritus. Light forcing, ammonium inhibition on nitrate uptake, grazing and current advection (upwelling) were explicitly implemented. This represented a major advance in marine ecological modeling although some of the parameterizations were based on idealized assumptions, such as formulations of phytoplankton growth rate and zooplankton grazing as a sine or cosine function of time.
Fasham et al (1990) further considered bacteria and dissolved organic nitrogen in their biogeochemical model, using more advanced parameterizations such as switching grazing and predation on multiple types of prey. The inclusion of the bacterial pool allowed the explicit simulation of new production versus regenerated production, which is one of the bases of carbon biogeochemical cycles and sequestration. This model of biological dynamics in the surface mixedlayer has been utilized substantially in the past decade and its application range has been extended, for example to study biogeochemical cycles over basin scales by coupling with 3D general circulation models (e.g. Sarmiento et al., 1995).
Tian et al. (2000) recently coupled the microbial food web (small phytoplankton or picophytoplankton, microzooplankton, bacteria, small suspended detritus, DOM and ammonium) with the traditional mesoplankton food chain (large phytoplankton or diatoms, mesozooplankton, large sinking detritus and nitrate). The model can estimate several secondary quantities, including new production versus regenerated production, DOC versus POC export and active export through zooplankton vertical migration (Tian et al., 2001, 2004). The explicit coupling of the microbial food web with the mesozooplankton food wed allows the simulation of seasonal food web succession and variability that occurs due to climate changes (Tian et al., 2003a, b).
The European Regional Sea Ecosystem Model (ERSEM) is likely the most comprehensive ecosystem simulation model up to date (Vichi et al., 2003). ERSEM consists of 3 submodels (circulation, pelagic and benthic submodels). Each submodel contains several modules (BarettaBekker et al., 1995). The pelagic submodel is composed of phytoplankton, zooplankton, microzooplankton, nutrient, bacteria and fish modules (Baretta et al., 1995). Phytoplankton includes diatoms, flagellates, picophytoplankton and large phytoplankton. Zooplankton includes carnivorous and omnivorous mesozooplankton. Microzooplankton includes heterotrophic flagellates and other microzooplankton. In addition, 5??? nutrients (nitrate, ammonium, phosphate, ??? and silicate), dissolved oxygen, CO_{2}, DOM and POM are all included in the pelagic submodel (Vichi et al., 2003; Baretta et al., 1995).
Several research efforts have focused on different life stages of zooplankton, such as eggs, nauplii, copepodites and adults (Wroblewski, 1982; Hofmann and Ambler, 1988). Individualbased models (IBMs) have also been developed to track the lifestage cohorts of organisms. For example, for the life history of copepods, state variables of IBMs can include eggs, 6 stages of nauplii, 5 stages of copepodites and adults (Carlotti, and Sviandra 1989; Carlotti and Nival, 1992). Importantly, IBMs can be stage, age, size, or weightstructured (Bryant et al., 1997; Heath et al., 1998). For ecological and biogeochemical studies, IBMs are often coupled with population dynamic models in order to quantified energy flows between different trophic levels (Carlotti et al., 1996).
3. STRUCTURE
The structure of the generalized biological model is illustrated in Fig. 1. There are 7 functional groups of state variables: N_{i} (nutrients), P_{i} (phytoplankton), Z_{i} (zooplankton), D_{i} (detritus), DOM_{i} (dissolved organic matter, e.g., DOC or DON), B_{i} (bacterioplankton) and A_{i} (auxiliary state variables). The total number of components is flexible and varies for each functional group, i.e., nn components for nutrients, np for phytoplankton, nz for zooplankton, nd for biogenic detritus, ns for dissolved organic matter, nb for bacteria and na for auxiliary state variables (Fig. 1). The number of components (state variables) in each functional group can be freely assigned from 0 to an any large number. In practice however, we expect that these numbers will range from 0 to an order of 10.
The biological and biogeochemical pools corresponding to the symbolic state variables are also allowed to be freely designed. From the practical point of view, we expect that the biological and biogeochemical pools which can be quantified by measurements will be often more useful because the model can then be effectively compared to and constrained by field data. Nutrient (1) and nutrient (2) designate nitrate and ammonium, respectively. Additional nutrients and micronutrients can be assigned to number 3 or higher, e.g., silicate, phosphorus and iron. All other functional groups are designed to be classified by users for a specific application. For example, phytoplankton (P_{i}) can range from picophytoplankton through nanophytoplankton and microphytoplankton to macrophytes. P_{i} can also represent autotrophic cyanobacteria, dinoflagellates and diatoms, or small and large phytoplankton. Zooplankton can range from protozoa through ciliates and copepods to net mucous feeders, or represent different stages and different weights of zooplankton. Detritus is usually classified by size, but it can also be grouped otherwise, e.g., according to chemical composition. DOM is usually divided by chemical composition (DOC, DON, …). For example, in modeling studies, Vallino (2000) and Spitz et al. (2001) have used DOC and DON. Alternatively, DOM classification can be done at the molecule level (e.g., high and low molecular weight DOM; Amon et al., 1994; Gardner et al., 1996). For biogeochemical cycle studies, the bioavailability of DOM is important. Theoretically, DOM can then be considered as a continuum from labile to refractory dissolved organic matter and the number of DOM categories (ns) can be large. Recently, DOM has been separated into two or three categories, labile, semilabile and refractory DOM, both in data analyses (Carlson and Ducklow, 1995) and 1D/mesoscom modeling studies ( Anderson and Williams, 1999; Vallino, 2000; Tian et al., 2004). However, these categories reflect bioavailability and as such have limited use in a generalized model since they cannot be either routinely quantified with existing analytical methods (data collection) or approximated with confidence. Bacterial compartments are designed to reflect the response to environmental factors such as temperature tolerance (Delile and Perret, 1989). Bacteria are assumed to have the same role in trophic dynamics, i.e., acquiring energy from DOM and being consumed by small or mucousnetfeeding zooplankton. Auxiliary state variables represent oceanic properties that are not independent components in food web structure and trophic dynamics, and their field depends on other biological pools (e.g. chlorophyll, bioluminescence, optics, acoustic properties, dissolved oxygen, CO_{2}, etc).
In order that the model be flexible, energy flows between food web components are not a priori determined. All zooplankton can feed on all phytoplankton and all zooplankton. The specific energy flow between two biological pools will be determined by users by assigning a specific value to the corresponding preference coefficient (Fig. 2). For example, omnivorous and carnivorous copepods may have raptorial feeding with a narrowsize window of prey whereas mucousnet feeders, such as appendicularia, salps, and doliolids, can collect particles ranging from bacteria (0.25 m in diameter) to large diatoms (8x100 m; Madin and Deibel, 1997). Energy flows are quite different between these two types of zooplankton although they can be represented by the same symbol in different applications.
The basic unit (or currency) of the model varies with regard to specific applications. The commonly used unit in ecological and biogeochemical models is carbon either nitrogen. Conversion factors from one unit (e.g., carbon) to another (e.g., nitrogen) are provided in this generalized model by a specific elemental ratio for each trophic component.
