1. Introduction
To create an integrated database of discrete water quality
measurements in the San Francisco Estuary, we used the R statistical
programming language (R Core Team 2020) to combine data from 15
boat-based surveys. The data integration code was packaged into the R
package discretewq v2.3.1: https://github.com/sbashevkin/discretewq
(Bashevkin et al. 2022).
The surveys included in the integrated database are long-term
monitoring surveys managed by federal agencies, state agencies, and
the University of California, Davis. Ten surveys focus primarily on
collecting fish abundance data but also collect water quality data
alongside fish samples. These include the California Department
of Fish and Wildlife (CDFW) Fall Midwater Trawl (FMWT), CDFW Summer
Townet Survey (STN), CDFW Spring Kodiak Trawl (SKT), CDFW 20-mm
Survey (20mm), CDFW San Francisco Bay Study (Baystudy), CDFW Smelt
Larva Survey (SLS), California Department of Water Resources (DWR)
Yolo Bypass Fish Monitoring Program (YBFMP), United States Fish and
Wildlife Service (USFWS) Enhanced Delta Smelt Monitoring (EDSM),
USFWS Delta Juvenile Fish Monitoring Program (DJFMP), and University
of California, Davis Suisun Marsh Fish Study (Suisun). An additional
5 surveys focus primarily on water quality data: the DWR
Environmental Monitoring Program (EMP), DWR Stockton Dissolved
Oxygen Survey (SDO), United States Bureau of Reclamation Sacramento
Deepwater Shipping Channel Survey (USBR), the United States
Geological Survey (USGS) San Francisco Bay Survey (USGS_SFBS), and
the USGS California Water Science Center monitoring (USGS_CAWSC)
(see Delta_Integrated_WQ_metadata.csv).
The primary aim of this data integration was to combine datasets to
facilitate analyses of water quality trends in the upper San
Francisco Estuary. The focal water quality variables included water
temperature, conductivity (or salinity), Secchi depth, Microcystis
concentration, and chlorophyll concentration. Key nutrient variables
were retained from the
USGS_SFBS, USGS_CAWSC, and EMP surveys. These variables were all
collected from the surface of the water column. In addition, water
temperature from the bottom of the water column was retained when
available. Not all surveys measured all focal variables. Some
surveys (particularly the water quality surveys) measured more water
quality variables than were retained in this integrated dataset.
While we describe some of the methods here, we strongly recommend
consulting the documentation of the component surveys (see
provenance and the citations below) for more information on their
methods.
2. Survey methods
Methods for measuring water quality variables were generally
consistent among the component surveys, but there were slight
differences. All surface water samples were collected within the
upper 1 m, but the exact depth differed slightly among studies.
USGS_SFBS collected some surface temperatures at depths of 2 m, but
we only retained samples collected at 1 m or shallower for
compatibility with the other studies. The only exception is nutrient
data collected by the USGS_SFBS survey, for which nutrient samples
were sometimes collected deeper than the surface water quality data.
In these cases, we selected the shallowest nutrient data available.
The maximum depth of surface nutrient data is 4 m, and these depths
are available in the dataset. Bottom temperature
samples were collected within 1 m of the bottom (see
Delta_Integrated_WQ_metadata.csv). More detailed methods and
protocols for most component surveys can be found in the data source
links in Delta_Integrated_WQ_metadata.csv or the provenance
citations.
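The depth rules above can be illustrated with a minimal Python sketch. It is not the discretewq code (which is written in R), and the field names, the grouping of records into sampling events by station and time, and the example values are all hypothetical and made up for illustration.

```python
from collections import defaultdict

SURFACE_MAX_DEPTH_M = 1.0  # surface cutoff stated in the text


def surface_records(records):
    """Keep only records sampled at 1 m or shallower."""
    return [r for r in records if r["depth_m"] <= SURFACE_MAX_DEPTH_M]


def shallowest_nutrients(records):
    """For each sampling event, keep the shallowest nutrient record."""
    by_event = defaultdict(list)
    for r in records:
        # Assumption: one event = one station visit (hypothetical key).
        by_event[(r["station"], r["datetime"])].append(r)
    return [min(group, key=lambda r: r["depth_m"]) for group in by_event.values()]


# Made-up example records, not values from the dataset.
temps = [
    {"station": "A", "datetime": "2015-06-01 09:00", "depth_m": 0.5, "temperature": 18.2},
    {"station": "A", "datetime": "2015-06-01 09:00", "depth_m": 2.0, "temperature": 17.9},
]
nutrients = [
    {"station": "A", "datetime": "2015-06-01 09:00", "depth_m": 2.0, "nitrate": 0.31},
    {"station": "A", "datetime": "2015-06-01 09:00", "depth_m": 4.0, "nitrate": 0.35},
]

kept_temps = surface_records(temps)              # drops the 2 m temperature
kept_nutrients = shallowest_nutrients(nutrients)  # keeps the 2 m nutrient sample
```

In this sketch the 2 m nutrient sample is kept even though it is deeper than 1 m, mirroring the exception described for USGS_SFBS nutrient data.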
2.1. Water temperature
While all surveys now measure water temperature with digital
sensors, older surveys used less precise handheld thermometers in
earlier years. More precise sensors were first used by FMWT in 1995,
STN in 1994, and DJFMP in 2014. All other surveys have used more
precise methods to measure temperature since their inception. Some SKT
temperature records carried notes indicating that they had been
transcribed from a different monitoring program (CDEC), so these
values were all removed.
2.2. Conductivity/Salinity
Most surveys reported specific conductivity; the exception was
USGS_SFBS, which reported salinity. DJFMP and EDSM could not verify
their conductivity metric for data collected before June 2019, so
conductivity values collected before that date were removed from the
integrated dataset.
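As a rough illustration of this rule, the following Python sketch blanks conductivity for the two affected surveys. It is not the discretewq code. The text gives only "June 2019", so the exact cutoff date of 1 June 2019 is an assumption, as are the field names, the example values, and the choice to keep the other variables in each record.

```python
from datetime import date

# Assumption: the text says only "before June 2019"; 1 June 2019 is a guess.
CUTOFF = date(2019, 6, 1)
UNVERIFIED_SOURCES = {"DJFMP", "EDSM"}


def mask_unverified_conductivity(record):
    """Return a copy of the record with unverifiable conductivity blanked."""
    cleaned = dict(record)
    if cleaned["source"] in UNVERIFIED_SOURCES and cleaned["date"] < CUTOFF:
        cleaned["conductivity"] = None  # other variables are left untouched
    return cleaned


# Made-up example records with hypothetical field names.
early_edsm = mask_unverified_conductivity(
    {"source": "EDSM", "date": date(2018, 3, 5), "conductivity": 850.0, "temperature": 12.1}
)
early_emp = mask_unverified_conductivity(
    {"source": "EMP", "date": date(2018, 3, 5), "conductivity": 850.0, "temperature": 12.1}
)
```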
2.3. Secchi depth
Secchi depth was measured on the shady side of the boat (when
possible) in all surveys that measured this variable. It is
important to note that the Secchi data are right-censored: in some
cases the disk was still visible at the maximum depth to which it
could be lowered. In these cases, the maximum extension depth was
usually recorded.
2.4. Microcystis
Concentration of the toxic microalga Microcystis was measured on the
same 5-point qualitative scale (absent, low, medium, high, very high)
by the 3 surveys that measured this variable. For a short period
(2012-2015), FMWT added a 6th level to the Microcystis scale to
represent Microcystis presence in zooplankton net cod-ends. Outside
this period, such presence was recorded as “low” on the 5-point
scale, so all records of the 6th level were converted to “low” for
consistency with other surveys and time periods.
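The recoding of the temporary 6th level can be sketched in Python as follows. This is an illustration rather than the discretewq code, and the label used for the 6th level is hypothetical, since the text does not say how FMWT recorded it.

```python
# The 5-point qualitative scale named in the text.
SCALE = ("absent", "low", "medium", "high", "very high")

# Hypothetical label for the temporary 6th FMWT level.
COD_END_LEVEL = "present in cod-end"


def harmonize_microcystis(level):
    """Map the temporary 6th level to "low"; pass the 5 standard levels through."""
    if level == COD_END_LEVEL:
        return "low"
    if level not in SCALE:
        raise ValueError(f"unexpected Microcystis level: {level!r}")
    return level


harmonized = [harmonize_microcystis(x) for x in ("absent", COD_END_LEVEL, "high")]
```

Raising an error on unknown labels, rather than passing them through silently, is a design choice of this sketch so that unexpected codes are noticed during integration.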
2.5. Chlorophyll
Chlorophyll-a methods differed slightly among surveys. EMP filtered
water samples through a 1 µm glass fiber filter and measured
chlorophyll concentrations in the lab. USBR and USGS_SFBS used sonde
probes to measure chlorophyll in the field, but USGS_SFBS calibrated
these field measurements with filtered water samples collected and
analyzed in a manner similar to EMP.
2.6. Nutrients
EMP collected and preserved nutrient samples in accordance with
standard protocols (Interagency Ecological Program et al. 2021d),
after which they were processed in a lab. Nutrients were filtered in
the field when applicable. USGS_SFBS collected, preserved, and
processed dissolved inorganic nutrients in a similar manner to EMP.
Both surveys collected water using a fixed flow-through pump.
3. Data integration methods
From each dataset, we selected columns corresponding to the water
quality variables of interest as well as important accessory
information (date, time, station, latitude, longitude, depth, tide,
and any notes). We then renamed variables for consistency and
converted all variables to consistent units. Salinity was calculated
from specific conductivity using the ec2pss function from the wql R
package (Jassby et al. 2017). This function uses the Practical
Salinity Scale 1978 for salinities between 2 and 42 (Fofonoff and
Millard Jr 1983) and the extension of the Practical Salinity Scale
(Hill et al. 1986) for salinities below 2. Conductivity data were
also retained in the integrated dataset. In most cases, latitude and
longitude coordinates of the fixed sampling stations were retained.
When these coordinates were not available (e.g. for non-fixed
stations), we retained any coordinates that were recorded during the
field sampling. To remove duplicate values from the dataset, only
one set of values was retained for each recorded date, time, and
location. All data integration code can be found in the discretewq R
package v2.3.1 (https://github.com/sbashevkin/discretewq; Bashevkin
et al. 2022).
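Two of the steps above, the salinity conversion and the removal of duplicates, can be sketched in Python. This is an illustration and not the source code of wql or discretewq. The salinity function follows the published PSS-78 equations at surface pressure, with the Hill et al. (1986) correction below a salinity of 2. The input units (mS/cm) and the default temperature of 25 °C (the reference temperature of specific conductivity) are assumptions about how the conversion was applied. The field names and the choice to keep the first record among duplicates are also assumptions.

```python
# PSS-78 coefficients (Fofonoff and Millard Jr 1983).
A = (0.0080, -0.1692, 25.3851, 14.0941, -7.0261, 2.7081)
B = (0.0005, -0.0056, -0.0066, -0.0375, 0.0636, -0.0144)
C = (0.6766097, 2.00564e-2, 1.104259e-4, -6.9698e-7, 1.0031e-9)
K = 0.0162
C_35_15_0 = 42.914  # conductivity (mS/cm) at salinity 35, 15 deg C, surface pressure


def ec_to_salinity(ec_ms_cm, temp_c=25.0):
    """Practical salinity from conductivity (mS/cm) at surface pressure."""
    r_t = sum(c * temp_c**i for i, c in enumerate(C))  # temperature correction
    ratio = ec_ms_cm / C_35_15_0 / r_t
    f_t = (temp_c - 15.0) / (1.0 + K * (temp_c - 15.0))
    salinity = sum(
        (a + f_t * b) * ratio ** (i / 2) for i, (a, b) in enumerate(zip(A, B))
    )
    if salinity < 2.0:
        # Low-salinity extension (Hill et al. 1986).
        x, y = 400.0 * ratio, 100.0 * ratio
        salinity -= A[0] / (1.0 + 1.5 * x + x**2)
        salinity -= B[0] * f_t / (1.0 + y**0.5 + y + y**1.5)
    return salinity


def drop_duplicates(records):
    """Keep one record per date, time, and location (here: the first seen)."""
    seen = set()
    unique = []
    for r in records:
        key = (r["date"], r["time"], r["station"])  # hypothetical field names
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


# Made-up example records; the first two share date, time, and station.
records = [
    {"date": "2020-01-15", "time": "10:30", "station": "X1", "conductivity": 1.2},
    {"date": "2020-01-15", "time": "10:30", "station": "X1", "conductivity": 1.2},
    {"date": "2020-01-15", "time": "11:00", "station": "X2", "conductivity": 5.4},
]
unique_records = drop_duplicates(records)
```

By construction, a conductivity of 42.914 mS/cm at 15 °C corresponds to a salinity of 35, which gives a quick sanity check on the coefficients. Conductivity stored in µS/cm would need to be divided by 1000 before being passed to this function.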
4. Literature cited
Bashevkin, S. M., S. E. Perry, and E. B. Stumpner. 2022. discretewq:
An Integrated Dataset of Discrete Water Quality in the San Francisco
Estuary v2.3.1. Zenodo. doi:10.5281/zenodo.6335814
Fofonoff, N. P., and R. C. Millard Jr. 1983. Algorithms for the
computation of fundamental properties of seawater. UNESCO Technical
Papers in Marine Science 44.
Hill, K., T. Dauphinee, and D. Woods. 1986. The extension of the
Practical Salinity Scale 1978 to low salinities. IEEE Journal of
Oceanic Engineering 11: 109–112.
Jassby, A. D., J. E. Cloern, and J. Stachelek. 2017. wql: Exploring
Water Quality Monitoring Data.
R Core Team. 2020. R: A Language and Environment for Statistical
Computing, R Foundation for Statistical Computing.
5. Data sources
CDFW. 2021a. Fall Midwater Trawl data.
https://filelib.wildlife.ca.gov/Public/TownetFallMidwaterTrawl/FMWT%20Data/.
CDFW. 2021b. Summer Townet data.
https://filelib.wildlife.ca.gov/Public/TownetFallMidwaterTrawl/TNS%20MS%20Access%20Data/TNS%20data/.
CDFW. 2021c. Bay Study data.
https://filelib.wildlife.ca.gov/Public/BayStudy/.
Cloern, J. E., and T. S. Schraga. 2016. USGS Measurements of Water
Quality in San Francisco Bay (CA), 1969-2015 (ver. 3.0 June 2017).
U.S. Geological Survey data release. doi:10.5066/F7TQ5ZPR
Interagency Ecological Program, L. Damon, and A. Chorazyczewski.
2021a. Interagency Ecological Program San Francisco Estuary 20mm
Survey 1995 - 2021. ver 4. Environmental Data Initiative.
doi:10.6073/pasta/32de8b7ffbe674bc6e79dbcd29ac1cc2
Interagency Ecological Program, L. Damon, and A. Chorazyczewski.
2021b. Interagency Ecological Program San Francisco Estuary Spring
Kodiak Trawl Survey 2002 - 2021. ver 4. Environmental Data
Initiative. doi:10.6073/pasta/f0e2916f4a026f3f812a0855cee74a8d
Interagency Ecological Program, L. Damon, T. Tempel, and A.
Chorazyczewski. 2021c. Interagency Ecological Program San Francisco
Estuary Smelt Larva Survey 2009 - 2021. ver 4. Environmental Data
Initiative. doi:10.6073/pasta/8e1ceb1c02fbc8b0ba7a6b58229109f2
Interagency Ecological Program, S. Lesmeister, and J. Rinde. 2020a.
Interagency Ecological Program: Discrete dissolved oxygen monitoring
in the Stockton Deep Water Ship Channel, collected by the
Environmental Monitoring Program, 1997-2018. ver 2. Environmental
Data Initiative. doi:10.6073/PASTA/3268530C683726CD430C81894FFAD768
Interagency Ecological Program, M. Martinez, and S. Perry. 2021d.
Interagency Ecological Program: Discrete water quality monitoring in
the Sacramento-San Joaquin Bay-Delta, collected by the Environmental
Monitoring Program, 1975-2020. ver 4. Environmental Data Initiative.
doi:10.6073/pasta/31f724011cae3d51b2c31c6d144b60b0
Interagency Ecological Program, R. McKenzie, J. Speegle, A.
Nanninga, and J. Hagen. 2021e. Interagency Ecological Program: Over
four decades of juvenile fish monitoring data from the San Francisco
Estuary, collected by the Delta Juvenile Fish Monitoring Program,
1976-2021. ver 8. Environmental Data Initiative.
doi:10.6073/pasta/8dfe5eac4ecf157b7b91ced772aa214a
Interagency Ecological Program, C. L. Pien, J. B. Adams, and N.
Kwan. 2020b. Interagency Ecological Program: Zooplankton catch and
water quality data from the Sacramento River floodplain and tidal
slough, collected by the Yolo Bypass Fish Monitoring Program,
1998-2018. ver 2. Environmental Data Initiative.
doi:10.6073/pasta/baad532af96cba1d58d43b89c08ca081
Interagency Ecological Program, B. Schreier, B. Davis, and N.
Ikemiyagi. 2019. Interagency Ecological Program: Fish catch and
water quality data from the Sacramento River floodplain and tidal
slough, collected by the Yolo Bypass Fish Monitoring Program,
1998-2018. ver 2. Environmental Data Initiative.
doi:10.6073/PASTA/B0B15AEF7F3B52D2C5ADC10004C05A6F
O’Rear, T., J. Durand, and P. Moyle. 2021. Suisun Marsh Fish Study.
https://watershed.ucdavis.edu/project/suisun-marsh-fish-study.
Schraga, T. S., E. S. Nejad, C. A. Martin, and J. E. Cloern. 2018.
USGS measurements of water quality in San Francisco Bay (CA),
beginning in 2016 (ver. 3.0, March 2020). U.S. Geological Survey
data release. doi:10.5066/F7D21WGF
United States Fish and Wildlife Service, T. Senegal, R. McKenzie,
and others. 2021. Interagency Ecological Program and US Fish and
Wildlife Service: San Francisco Estuary Enhanced Delta Smelt
Monitoring Program data, 2016-2021. ver 7. Environmental Data
Initiative. doi:10.6073/pasta/65f9297a7077320f4ba31c2acd685f93
USBR, R. Dahlgren, L. Loken, and E. Van Nieuwenhuyse. 2020. Monthly
vertical profiles of water quality in the Sacramento Deep Water Ship
Channel 2012-2019.
U.S. Geological Survey. 2022. USGS water data for the Nation: U.S.
Geological Survey National Water Information System database,
accessed February 7, 2022, at https://doi.org/10.5066/F7P55KJN