GENERATION OF NEAR-TERM G. ECHINULATA DENSITY HINDCASTS
Detailed methods can be found in Lofton et al. 20XX. Briefly, fourteen Bayesian
state-space models of varying complexity and including different
environmental covariates were calibrated using environmental driver
data and G. echinulata density data at a nearshore site (South Herrick
Cove) in Lake Sunapee from May-October in 2009-2014, and then
validated by generating one-week-ahead to four-week-ahead hindcasts of
G. echinulata density from May-October in 2015-2016. For hindcasts, G.
echinulata and environmental driver data were assimilated weekly by
re-running the model calibration each week to obtain updated estimates
of model parameters and latent states. These updated posteriors were
then used to run the model forward in time four weeks to generate G.
echinulata density hindcasts. Environmental driver data were
hindcasted using draws from an ensemble of historical values during
2009-2014 for the 2015 hindcasts, and from 2009-2015 for the 2016
hindcasts.
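The driver ensemble step can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not say how historical values were matched to hindcast dates, so the sketch assumes each ensemble member reuses one randomly chosen earlier year at the same weeks of the season. The array `historical`, the function `draw_driver_ensemble`, and all values are hypothetical.

```python
import numpy as np

# Hypothetical historical driver values: rows = years, columns = weeks of the season.
years = np.array([2009, 2010, 2011, 2012, 2013, 2014, 2015])
rng = np.random.default_rng(0)
historical = rng.normal(20.0, 2.0, size=(years.size, 20))

def draw_driver_ensemble(hindcast_year, first_week, n_draws, rng, horizon=4):
    """Draw driver trajectories from years before the hindcast year.

    Assumption (not stated in the source): draws are matched by week of season.
    """
    prior = historical[years < hindcast_year]          # e.g. 2009-2014 for 2015
    picks = rng.integers(0, prior.shape[0], n_draws)   # one past year per ensemble member
    return prior[picks, first_week:first_week + horizon]

ens_2015 = draw_driver_ensemble(2015, first_week=3, n_draws=500, rng=rng)
ens_2016 = draw_driver_ensemble(2016, first_week=3, n_draws=500, rng=rng)
```

Each row of `ens_2015` is a four-week driver trajectory taken from 2009-2014 only, while `ens_2016` can also draw on 2015.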
Hindcasts were generated under several different conditions to allow
for subsequent uncertainty partitioning of the total hindcast variance
and calculation of credible and predictive intervals. The following
sources of uncertainty were considered: initial conditions
uncertainty, parameter uncertainty, driver data uncertainty, process
uncertainty, and observation uncertainty. First, hindcasts were
generated including only initial conditions uncertainty, or
uncertainty in the latent state of G. echinulata. Next, we added in
parameter uncertainty, or uncertainty in the value of model
parameters. After that, we added in driver uncertainty, or uncertainty
in the hindcasted value of environmental covariates. Finally, we added
in process uncertainty, or uncertainty due to stochasticity, error in
model structure, or numerical rounding error during the course of a
model run. Together, these four sources of uncertainty (initial
conditions, parameter, driver, and process) constitute the hindcast
credible interval. We also generated hindcasts that included all of
the aforementioned uncertainty sources plus observation error to be
able to generate predictive intervals for use when comparing model
hindcasts to observational data.
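The stepwise addition of uncertainty sources can be illustrated with a minimal Python sketch. It uses a toy AR(1) model, not any of the fourteen models in Lofton et al. 20XX, and every parameter value, the ensemble size, and the function name `hindcast` are hypothetical. Because an AR model has no environmental covariates, driver uncertainty is omitted here.

```python
import numpy as np

def hindcast(rng, n_draws=5000, horizon=4, ic=False, pa=False, p=False, o=False):
    """Toy AR(1) hindcast; each flag switches one uncertainty source on."""
    # Hypothetical posterior summaries (illustrative values only).
    x0_mean, x0_sd = 1.0, 0.3      # latent initial state
    b0_mean, b0_sd = 0.2, 0.1      # intercept
    b1_mean, b1_sd = 0.8, 0.1      # AR coefficient
    sd_proc, sd_obs = 0.4, 0.5     # process and observation SDs

    x = rng.normal(x0_mean, x0_sd, n_draws) if ic else np.full(n_draws, x0_mean)
    b0 = rng.normal(b0_mean, b0_sd, n_draws) if pa else np.full(n_draws, b0_mean)
    b1 = rng.normal(b1_mean, b1_sd, n_draws) if pa else np.full(n_draws, b1_mean)

    out = np.empty((n_draws, horizon))
    for h in range(horizon):
        x = b0 + b1 * x
        if p:
            x = x + rng.normal(0.0, sd_proc, n_draws)   # process error propagates forward
        # Observation error is added to the output only, never to the latent state.
        out[:, h] = x + rng.normal(0.0, sd_obs, n_draws) if o else x
    return out

rng = np.random.default_rng(1)
steps = {
    "IC": dict(ic=True),
    "IC.Pa": dict(ic=True, pa=True),
    "IC.Pa.P": dict(ic=True, pa=True, p=True),
    "IC.Pa.P.O": dict(ic=True, pa=True, p=True, o=True),
}
variances = {name: hindcast(rng, **kw).var(axis=0) for name, kw in steps.items()}

# Relative contribution of each source to the credible-interval variance.
total = variances["IC.Pa.P"]
share_ic = variances["IC"] / total
share_pa = (variances["IC.Pa"] - variances["IC"]) / total
share_p = (variances["IC.Pa.P"] - variances["IC.Pa"]) / total

# 95% predictive interval at each horizon (includes observation error).
pred_lo, pred_hi = np.percentile(hindcast(rng, **steps["IC.Pa.P.O"]), [2.5, 97.5], axis=0)
```

Differencing the variances of successive runs is one common way to attribute variance to each added source; with a finite ensemble the differences carry Monte Carlo noise.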
NAMING CONVENTION FOR HINDCAST FILES
Within the provided .zip file
(Gechinulata_hindcasts.zip), there are 2680 .csv files, each of which
corresponds to a hindcast generated by one model, initiated during a
particular week of the 2015-2016 sampling season, and including a
specified subset of uncertainty sources. An example hindcast file with
associated table metadata is also provided separately
(AR_IC.Pa.P.O_2015-05-14_example.csv). The following naming convention
was used:
ModelName_uncertainty.sources_YYYY-MM-DD.csv
ModelName indicates one of the following models used to generate the
hindcast:
RW, AR, MinWaterTemp, MinWaterTempLag, WaterTempMA, DeltaSchmidt,
SchmidtLag, WindDir, Precip, GDD, SchmidtAndTemp, TempAndPrecip,
SchmidtAndPrecip, PrecipAndGDD.
Descriptions of the structure for each model can be found in Table 1
of Lofton et al. 20XX.
uncertainty.sources indicates the combination of uncertainty sources
that are incorporated in the hindcast file, according to the following
codes:
IC = initial conditions; Pa = parameter; D = driver; P = process; O =
observation
Note that not all model structures contain all sources of uncertainty. For
example, a random walk or intercept model does not have driver
uncertainty because it does not include any environmental covariates.
YYYY-MM-DD indicates the date for which the hindcast was generated.
Hindcasts run from one to four weeks into the future from the date for
which they were generated. For example, a hindcast file generated for
the week of 2015-05-25 will include a one-week-ahead hindcast for the
week of 2015-06-01, a two-week-ahead hindcast for the week of
2015-06-08, a three-week-ahead hindcast for the week of 2015-06-15,
and a four-week-ahead hindcast for the week of 2015-06-22.
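As an illustration of the naming convention, the following Python sketch splits a file name into its three parts and derives the four target weeks. The function name `parse_hindcast_filename` is hypothetical, and the file name passed to it is constructed from the convention rather than confirmed to exist in the archive.

```python
from datetime import date, timedelta
from pathlib import Path

# Uncertainty source codes as listed in the naming convention.
CODES = {
    "IC": "initial conditions",
    "Pa": "parameter",
    "D": "driver",
    "P": "process",
    "O": "observation",
}

def parse_hindcast_filename(filename):
    """Split ModelName_uncertainty.sources_YYYY-MM-DD.csv into its parts."""
    # Keep only the first three underscore-separated fields so that a trailing
    # suffix such as "_example" is ignored.
    model, sources, start = Path(filename).stem.split("_")[:3]
    start_date = date.fromisoformat(start)
    return {
        "model": model,
        "sources": [CODES[code] for code in sources.split(".")],
        "start_date": start_date,
        # One- to four-week-ahead target dates.
        "target_dates": [start_date + timedelta(weeks=k) for k in range(1, 5)],
    }

# File name built from the convention for illustration only.
info = parse_hindcast_filename("SchmidtLag_IC.Pa.D.P_2015-05-25.csv")
example = parse_hindcast_filename("AR_IC.Pa.P.O_2015-05-14_example.csv")
```

For a start date of 2015-05-25, `info["target_dates"]` holds 2015-06-01, 2015-06-08, 2015-06-15, and 2015-06-22.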
DOWNLOAD AND PROCESSING OF NLDAS-2 DATA
The NLDAS-2 database
(https://ldas.gsfc.nasa.gov/nldas/v2/forcing) was accessed in February
2017 and data were downloaded from January 1, 1979 through December
31, 2016 at the hourly scale, including shortwave radiation. Detailed
definitions and descriptions of the NLDAS-2 forcing variables can be
found at https://ldas.gsfc.nasa.gov/nldas/v2/forcing. Lake Sunapee
spans four 1/8th-degree grid cells within the NLDAS grid system and
these grid cells were queried for download using a Lake Sunapee
shapefile. Observations for each meteorological variable were
subsequently averaged across grid cells to provide a single value for
the lake at each hourly timestep.
Hourly solar radiation values were subsequently summarized to daily
mean, median, maximum, minimum, standard deviation, and sum for each
day of G. echinulata sampling from 2009-2016.
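The grid-cell averaging and daily summary steps could look roughly like the pandas sketch below. The data are synthetic, the column names (`cell_1` to `cell_4`) and the sampling date are hypothetical, and days are assumed to follow the calendar date of the hourly timestamps, which the text does not specify.

```python
import numpy as np
import pandas as pd

# Synthetic hourly shortwave radiation for four grid cells over two days.
hours = pd.to_datetime("2015-06-01") + pd.to_timedelta(np.arange(48), unit="h")
cells = pd.DataFrame(
    {f"cell_{i}": np.arange(48, dtype=float) + i for i in range(1, 5)},
    index=hours,
)

# Average across grid cells to get a single lake-wide value per hour.
lake_hourly = cells.mean(axis=1)

# Summarize each day to mean, median, maximum, minimum, standard deviation, and sum.
daily = lake_hourly.resample("D").agg(["mean", "median", "max", "min", "std", "sum"])

# Keep only the days on which sampling occurred (hypothetical date).
sampling_days = pd.to_datetime(["2015-06-02"])
on_sampling_days = daily.loc[sampling_days]
```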
DOWNLOAD AND PROCESSING OF PRISM DATA
The PRISM database
(http://www.prism.oregonstate.edu/documents/PRISM_datasets.pdf) was
accessed on October 4, 2018, and data from the AN81d dataset were
downloaded from January 1, 1981 through December 31, 2017, including
precipitation, with grid cell interpolation on. Detailed definitions
and descriptions of PRISM datasets can be found at
http://www.prism.oregonstate.edu/documents/PRISM_datasets.pdf. Data
were downloaded for a location corresponding to a Gloeotrichia
echinulata monitoring site on Lake Sunapee (Lat: 43.4098, Lon:
-72.0367, Elev: 361 m). Data were downloaded at the daily timestep,
which PRISM defines as the 24-hour period ending at 1200 UTC on the
day entered in the Date column of the dataframe.
Precipitation data were subsequently summarized to include daily sum
of precipitation on each day of G. echinulata sampling from
2009-2016, as well as daily sum of precipitation on the day prior to
the day of sampling (precip_mm_1daylag) and daily sum of precipitation
on the previous G. echinulata sampling day (precip_mm_1weeklag).
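A minimal pandas sketch of the two lagged precipitation variables follows. The precipitation values and sampling dates are made up, and the column name `precip_mm` is an assumption; `precip_mm_1daylag` and `precip_mm_1weeklag` are the names given above.

```python
import pandas as pd

# Synthetic daily precipitation totals (mm) for ten consecutive days.
precip = pd.Series(
    [0.0, 2.5, 0.0, 10.2, 0.0, 0.0, 1.1, 4.0, 0.0, 0.3],
    index=pd.date_range("2015-06-01", periods=10, freq="D"),
    name="precip_mm",
)

# Hypothetical weekly sampling days.
sampling_days = pd.to_datetime(["2015-06-02", "2015-06-09"])

out = pd.DataFrame({"precip_mm": precip.loc[sampling_days]})
# Precipitation on the calendar day before each sampling day.
out["precip_mm_1daylag"] = precip.shift(1).loc[sampling_days]
# Precipitation on the previous sampling day (missing for the first one).
out["precip_mm_1weeklag"] = out["precip_mm"].shift(1)
```

The one-day lag is taken from the full daily series before subsetting, whereas the one-week lag is a shift within the sampling-day rows only.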
CITATIONS
Lofton, M.E., Brentrup, J.A., Beck, W.S., Zwart, J.A.,
Bhattacharya, R., Brighenti, L.S., Burnet, S.H., McCullough, I.M.,
Steele, B.G., Carey, C.C., Cottingham, K.L., Dietze, M.C., Ewing,
H.A., Weathers, K.D., LaDeau, S.L. 20XX. Using near-term forecasts and
uncertainty partitioning to prioritize research for understanding
cyanobacterial dynamics. Journal, Volume, Issue, Pages.