Data Package Metadata

National-scale, remotely sensed lake trophic state (LTS-US) 1984-2020

General Information
Data Package:
Local Identifier:edi.78.2
Title:National-scale, remotely sensed lake trophic state (LTS-US) 1984-2020
Alternate Identifier:DOI PLACE HOLDER
Abstract:

Lake trophic state is a key water quality property that integrates a lake’s physical, chemical, and biological processes. Despite the importance of trophic state as a gauge of lake water quality, standardized and machine-readable observations are uncommon. Remote sensing presents an opportunity to detect and analyze lake trophic state with reproducible, robust methods across time and space. We used Landsat surface reflectance and lake morphometric data to create the first compendium of lake trophic state for more than 56,000 lakes of at least 10 ha in size throughout the contiguous United States from 1984 through 2020. The dataset was constructed with FAIR data principles (Findable, Accessible, Interoperable, and Reusable) in mind, where data are publicly available, relational keys from parent datasets are retained, and all data wrangling and modeling routines are scripted for future reuse. Together, this resource offers critical data to address basic and applied research questions about lake water quality at a suite of spatial and temporal scales.

Publication Date:2023-03-14
For more information:
Visit: DOI PLACE HOLDER

Time Period
Begin:
1984
End:
2020

People and Organizations
Contact:Meyer, Michael F (U.S. Geological Survey, Research Geographer)
Creator:Meyer, Michael F (U.S. Geological Survey, Research Geographer)
Creator:Topp, Simon N (U.S. Geological Survey, Research Physical Scientist)
Creator:King, Tyler V (U.S. Geological Survey, Hydrologist)
Creator:Ladwig, Robert (Center for Limnology, Post Doctoral Researcher)
Creator:Pilla, Rachel M (Oak Ridge National Lab)
Creator:Dugan, Hilary A (Center for Limnology, Associate Professor)
Creator:Eggleston, Jack R (U.S. Geological Survey, Branch Chief, Hydrologic Remote Sensing Branch)
Creator:Hampton, Stephanie E (Carnegie Institution for Science, Deputy Director)
Creator:Leech, Dina M (Longwood University, Associate Professor)
Creator:Oleksy, Isabella A (University of Wyoming, Post Doctoral Researcher)
Creator:Ross, Jesse C (U.S. Geological Survey)
Creator:Ross, Matthew RV (Colorado State University)
Creator:Woolway, R Iestyn (Bangor University, Assistant Professor)
Creator:Yang, Xiao (Southern Methodist University, Assistant Professor)
Creator:Brousil, Matthew R (Colorado State University)
Creator:Fickas, Kate C (U.S. Geological Survey)
Creator:Padowski, Julie C (Washington State University)
Creator:Pollard, Amina I (U.S. Environmental Protection Agency)
Creator:Ren, Jianning (University of Nevada - Reno, Post Doctoral Researcher)
Creator:Zwart, Jacob A (U.S. Geological Survey, Research Data Scientist)

Data Entities
Data Table Name:
ensemble_predictions
Description:
Tabular dataset of mean probabilities of a given lake and year being of a given trophic state. A categorical variable (categorical_ts) defines which mean probability is highest.
Data Table Name:
individual_predictions
Description:
Tabular dataset of individual predictions for each lake trophic state and for each modeling technique.
Other Name:
README_targets
Description:
Instructions for setting up and running all supporting R code in the targets pipeline framework
Other Name:
README_container
Description:
README for using the Docker Container with the targets pipeline
Other Name:
scripts
Description:
Scripts associated with production of the data product
Other Name:
lts_container
Description:
Docker container for compute environment. This docker container allows future users to recreate the exact compute environment, where all datasets and quality control products were generated.
Other Name:
lake_trophic_status_docker_image
Description:
Pre-built docker image for running LTS-US pipeline in a containerized environment. Please see "README_container.pdf" for instructions on running the pipeline in a Docker container.
Detailed Metadata

Data Entities


Data Table

Data:https://pasta-s.lternet.edu/package/data/eml/edi/78/2/2408a538ea395572153e16b189ccacc6
Name:ensemble_predictions
Description:Tabular dataset of mean probabilities of a given lake and year being of a given trophic state. A categorical variable (categorical_ts) defines which mean probability is highest.
Number of Records:2059494
Number of Columns:9

Table Structure
Object Name:ensemble_predictions.csv
Size:254774297 byte
Authentication:580e66762b392bb0ed528513d5748b7e Calculated By MD5
Text Format:
Number of Header Lines:1
Record Delimiter:\n
Orientation:column
Simple Delimited:
Field Delimiter:,
Quote Character:"

Table Column Descriptions
Column Name:Hylak_id
Definition:HydroLAKES unique identifier of lake. Preserved from HydroLAKES input data to enable future merge with HydroLAKES attributes.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:
Max:1069952
Missing Value Code:NA (Missing value)

Column Name:year
Definition:Year, spans 1984 through 2020.
Storage Type:float
Measurement Type:ratio
Unit:year
Type:integer
Min:1984
Max:2020
Missing Value Code:NA (Missing value)

Column Name:mean_prob_dys
Definition:Probability that a lake-year combination is dystrophic. Probability is calculated by averaging probabilities from multinomial logistic regression, gradient boosted regression trees, and multilayer perceptron modeling methods.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:0.004105521
Max:0.8159664
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:var_prob_dys
Definition:Variance in probabilities among multinomial logistic regression, gradient boosted regression trees, and multilayer perceptron modeling methods that a given lake-year is dystrophic.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:6.126111e-10
Max:0.1707376
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:mean_prob_eumixo
Definition:Probability that a lake-year combination is eutrophic or mixotrophic. Probability is calculated by averaging probabilities from multinomial logistic regression, gradient boosted regression trees, and multilayer perceptron modeling methods.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:0.01323732
Max:0.987529
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:var_prob_eumixo
Definition:Variance in probabilities among multinomial logistic regression, gradient boosted regression trees, and multilayer perceptron modeling methods that a given lake-year is eutrophic or mixotrophic.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:7.564333e-10
Max:0.1328802
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:mean_prob_oligo
Definition:Probability that a lake-year combination is oligotrophic. Probability is calculated by averaging probabilities from multinomial logistic regression, gradient boosted regression trees, and multilayer perceptron modeling methods.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:0.005209238
Max:0.9684982
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:var_prob_oligo
Definition:Variance in probabilities among multinomial logistic regression, gradient boosted regression trees, and multilayer perceptron modeling methods that a given lake-year is oligotrophic.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:1.558369e-09
Max:0.1790787
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:categorical_ts
Definition:Categorical predicted lake trophic status (i.e., oligotrophic, eutrophic_mixotrophic, dystrophic). Categorical prediction is based on the highest probability among mean_prob_dys, mean_prob_eumixo, and mean_prob_oligo.
Storage Type:string
Measurement Type:nominal
Allowed Values and Definitions:
NA:No prediction because missing appropriate satellite data for a given lake-year combination
dys:Lake-year combination classified as dystrophic
eu/mixo:Lake-year combination classified as eutrophic and/or mixotrophic
oligo:Lake-year combination classified as oligotrophic
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)
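As a minimal sketch of how categorical_ts relates to the three mean probabilities, the Python snippet below picks the code with the highest mean probability for a single row. The code strings follow the enumerated values documented for categorical_ts; the function name, the example row, and the tie-breaking behavior (the first maximum wins) are assumptions made for illustration and are not part of the data package.

```python
from typing import Optional

# Column name -> enumerated code, as documented for categorical_ts.
LABELS = {
    "mean_prob_dys": "dys",
    "mean_prob_eumixo": "eu/mixo",
    "mean_prob_oligo": "oligo",
}


def categorical_from_means(row: dict) -> Optional[str]:
    """Return the code with the highest mean probability, or None if any value is NA."""
    values = {}
    for column in LABELS:
        raw = row.get(column, "NA")
        if raw in ("NA", "", None):
            return None  # no prediction for this lake-year
        values[column] = float(raw)
    # Ties resolve to the first column in LABELS (an assumption, not documented).
    best = max(values, key=values.get)
    return LABELS[best]


# Hypothetical row, shaped like one record read with csv.DictReader.
example_row = {
    "Hylak_id": "1",
    "year": "1984",
    "mean_prob_dys": "0.10",
    "mean_prob_eumixo": "0.65",
    "mean_prob_oligo": "0.25",
}
print(categorical_from_means(example_row))
```

A check like this can be run over ensemble_predictions.csv to confirm that the stored label agrees with the stored probabilities before any downstream analysis.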

Data Table

Data:https://pasta-s.lternet.edu/package/data/eml/edi/78/2/95b484f3bf81cbe035e27ee51df4df32
Name:individual_predictions
Description:Tabular dataset of individual predictions for each lake trophic state and for each modeling technique.
Number of Records:2059494
Number of Columns:11

Table Structure
Object Name:individual_predictions.csv
Size:337130812 byte
Authentication:0494d66398845c1ab4c8f753141de732 Calculated By MD5
Text Format:
Number of Header Lines:1
Record Delimiter:\n
Orientation:column
Simple Delimited:
Field Delimiter:,
Quote Character:"

Table Column Descriptions
Column Name:Hylak_id
Definition:HydroLAKES unique identifier of lake. Preserved from HydroLAKES input data to enable future merge with HydroLAKES attributes.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:
Max:1069952
Missing Value Code:NA (Missing value)

Column Name:year
Definition:Year, spans 1984 through 2020.
Storage Type:float
Measurement Type:ratio
Unit:year
Type:integer
Missing Value Code:NA (Missing value)

Column Name:prob_dys_mlr
Definition:Probability that a lake-year combination is dystrophic. Probability is calculated by multinomial, multiple logistic regression.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:2.815192e-10
Max:0.9890227
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:prob_eumixo_mlr
Definition:Probability that a lake-year combination is eutrophic or mixotrophic. Probability is calculated by multinomial, multiple logistic regression.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:1.577883e-10
Max:0.9999986
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:prob_oligo_mlr
Definition:Probability that a lake-year combination is oligotrophic. Probability is calculated by multinomial, multiple logistic regression.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:1.002697e-11
Max:0.999946
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:prob_dys_mlp
Definition:Probability that a lake-year combination is dystrophic. Probability is calculated by multilayer perceptron.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:3.519079e-08
Max:0.9939494
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:prob_eumixo_mlp
Definition:Probability that a lake-year combination is eutrophic or mixotrophic. Probability is calculated by multilayer perceptron.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:0.0001922453
Max:
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:prob_oligo_mlp
Definition:Probability that a lake-year combination is oligotrophic. Probability is calculated by multilayer perceptron.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:1.925952e-11
Max:0.9991604
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:prob_dys_xgb
Definition:Probability that a lake-year combination is dystrophic. Probability is calculated by a gradient-boosted regression tree.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:0.01076017
Max:0.817612
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:prob_eumixo_xgb
Definition:Probability that a lake-year combination is eutrophic or mixotrophic. Probability is calculated by a gradient-boosted regression tree.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:0.03078391
Max:0.9701928
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)

Column Name:prob_oligo_xgb
Definition:Probability that a lake-year combination is oligotrophic. Probability is calculated by a gradient-boosted regression tree.
Storage Type:float
Measurement Type:ratio
Unit:unitless
Type:real
Min:0.01226485
Max:0.945454
Missing Value Code:NA (Missing value due to lack of remote sensing data for a given lake-year combination)
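The ensemble_predictions columns are described as the mean and variance of the three per-model probabilities stored in individual_predictions. A minimal sketch of that relationship follows. The use of the sample variance (n - 1 denominator) is an assumption, since the metadata does not state which variance estimator was used, and the function name and example probabilities are illustrative only.

```python
from statistics import mean, variance


def ensemble_summary(prob_mlr: float, prob_mlp: float, prob_xgb: float) -> tuple:
    """Mean and variance of one trophic state's probability across the three models."""
    probs = [prob_mlr, prob_mlp, prob_xgb]
    # statistics.variance is the sample variance (n - 1); assumed, not documented.
    return mean(probs), variance(probs)


# Hypothetical probabilities for one lake-year and one trophic state,
# e.g. prob_dys_mlr, prob_dys_mlp, prob_dys_xgb.
mean_prob, var_prob = ensemble_summary(0.2, 0.3, 0.4)
print(mean_prob, var_prob)
```

Recomputing these summaries from individual_predictions.csv and comparing them with ensemble_predictions.csv is one way to confirm which variance estimator the package actually used.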

Non-Categorized Data Resource

Name:README_targets
Entity Type:pdf
Description:Instructions for setting up and running all supporting R code in the targets pipeline framework
Physical Structure Description:
Object Name:README_targets.pdf
Size:309691 byte
Authentication:39f88d2b7f453e1f9447ab91a0ff4668 Calculated By MD5
Externally Defined Format:
Format Name:pdf
Data:https://pasta-s.lternet.edu/package/data/eml/edi/78/2/7e82333274c569dd5237cb19d9589b7f

Non-Categorized Data Resource

Name:README_container
Entity Type:pdf
Description:README for using the Docker Container with the targets pipeline
Physical Structure Description:
Object Name:README_container.pdf
Size:58824 byte
Authentication:693c904d49d434d40fa0ca0dd3b39f43 Calculated By MD5
Externally Defined Format:
Format Name:pdf
Data:https://pasta-s.lternet.edu/package/data/eml/edi/78/2/60b2b77971451b04f4af928171bea459

Non-Categorized Data Resource

Name:scripts
Entity Type:zipped folder
Description:Scripts associated with production of the data product
Physical Structure Description:
Object Name:scripts.zip
Size:31603 byte
Authentication:f4e3fa1636a8150ad19c76414d112344 Calculated By MD5
Externally Defined Format:
Format Name:zip
Data:https://pasta-s.lternet.edu/package/data/eml/edi/78/2/d6c5855a62cf32a4dadbc2831f0f295f

Non-Categorized Data Resource

Name:lts_container
Entity Type:zipped folder
Description:Docker container for compute environment. This docker container allows future users to recreate the exact compute environment, where all datasets and quality control products were generated.
Physical Structure Description:
Object Name:lts_container.zip
Size:4047 byte
Authentication:f59648229cf6416dc5f16e235256499c Calculated By MD5
Externally Defined Format:
Format Name:zip
Data:https://pasta-s.lternet.edu/package/data/eml/edi/78/2/d798db512884c21859258e8dcf36b98d

Non-Categorized Data Resource

Name:lake_trophic_status_docker_image
Entity Type:compressed tape archive
Description:Pre-built docker image for running LTS-US pipeline in a containerized environment. Please see "README_container.pdf" for instructions on running the pipeline in a Docker container.
Physical Structure Description:
Object Name:lake_trophic_status_docker_image.tar.gz
Size:3738785336 byte
Authentication:c1994ad28775b2d868256a91295b63b7 Calculated By MD5
Externally Defined Format:
Format Name:tar.gz
Data:https://pasta-s.lternet.edu/package/data/eml/edi/78/2/ab7b0d7f5f161515d0725aaf034afd94
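Each entity above lists an MD5 value under Authentication. The sketch below shows one way to compare a downloaded file against that value using only the Python standard library. The function names and the assumption that files have been saved locally under their Object Name are illustrative and not part of the data package.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_metadata(path, expected_md5):
    """Return True when the file's MD5 equals the Authentication value in the metadata."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `matches_metadata("scripts.zip", "f4e3fa1636a8150ad19c76414d112344")` would check the scripts archive, assuming it was downloaded to the working directory. Chunked reading matters mainly for the multi-gigabyte Docker image.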

Data Package Usage Rights

This data package is released to the "public domain" under Creative Commons CC0 1.0 "No Rights Reserved" (see: https://creativecommons.org/publicdomain/zero/1.0/). It is considered professional etiquette to provide attribution of the original work if this data package is shared in whole or by individual components. A generic citation is provided for this data package on the website https://portal.edirepository.org (herein "website") in the summary metadata page. Communication (and collaboration) with the creators of this data package is recommended to prevent duplicate research or publication. This data package (and its components) is made available "as is" and with no warranty of accuracy or fitness for use. The creators of this data package and the website shall not be liable for any damages resulting from misinterpretation or misuse of the data package or its components. Periodic updates of this data package may be available from the website. Thank you.

Keywords

By Thesaurus:
LTER Controlled Vocabulary:lakes, water quality, remote sensing, limnology, ecology
(No thesaurus):synthesis

Methods and Protocols

These methods, instrumentation and/or protocols apply to all data in this dataset:

Methods and protocols used in the collection of this data package
Description:

The LTS-US dataset is constructed in a three-part pipeline: (1) aggregate training data, (2) create classification models, and (3) apply predictions to lakes outside of the training data. Individual steps within the pipeline are described below.

Step 1: Identify Parent Datasets

United States Environmental Protection Agency National Lakes Assessment

In situ measurements of total phosphorus and true color were compiled from the U.S. National Lakes Assessment (NLA), a synoptic sampling campaign of lakes, ponds, and reservoirs, hereafter collectively referred to as “lakes”, conducted in the contiguous U.S. every five years. Lakes used in this analysis were sampled in the summer (June-September) of 2007 (n = 1028), 2012 (n = 1038), or 2017 (n = 1005). Lakes were selected from the National Hydrography Dataset (NHD, https://www.usgs.gov/national-hydrography/national-hydrography-dataset) using a randomized design stratified on aggregated Omernik level III ecoregion and lake surface area. The minimum surface area for inclusion in the 2007 assessment was 4 ha, but, owing to increasing resolution in the NHD, was reduced to 1 ha for the 2012 and 2017 assessments. Natural lakes and reservoirs were treated equally for site selection purposes. A wide set of measurements was collected at each sampled lake, but we only provide details on the variables used in this analysis. Additional details, protocols, and data are available online (https://www.epa.gov/national-aquatic-resource-surveys/nla).

Total phosphorus and true color were collected and processed in the 2007, 2012, and 2017 field campaigns (USEPA 2007, 2011, 2017a). Water samples were collected from a deep, open water (up to 50 m deep) location in natural lakes and at a midpoint in reservoirs. Water was collected from the photic zone using a vertical, depth integrated sampling device. True color was estimated by visual comparison of filtered water samples to a calibrated glass color disk (USEPA 1987). Total phosphorus concentrations were measured with manual alkaline persulfate digestion, followed by automated colorimetric analysis (ammonium molybdate and antimony potassium tartrate under acidic conditions, with absorbance at 880 nm) using a flow injection analyzer following standard method 4500-P-E (APHA 1999). Detailed descriptions of all water quality analyses are available in the NLA Laboratory Operations Manuals (USEPA 2007, 2012, 2017b).

HydroLAKES

HydroLAKES (Messager et al. 2016) is a compendium of more than 1.4 million lake and reservoir shapefiles globally, each with a surface area of at least 10 ha. For an individual waterbody, HydroLAKES contains its spatial extent and location (using georeferenced polygons), a unique identifier (ranging from 1 to 1,427,688), and its morphological (e.g., area, mean depth, elevation, shoreline length), hydrological (e.g., residence time, discharge, watershed area), and geographical (e.g., name, country, continent) properties. HydroLAKES is a compilation of existing lake databases, with sources from government agencies (e.g., Natural Resources Canada, U.S. Geological Survey, European Environment Agency) and from remote sensing studies (e.g., Shuttle Radar Topographic Mission Water Body Data, Global Lakes and Wetlands Database, and Global Reservoir and Dam Database). Most of the lake polygons are sourced from the Shuttle Radar Topographic Mission Water Body Data for regions between 60°S and 60°N (Robinson et al. 2014), supplemented by other datasets for higher latitudes and for underrepresented regions. More detailed information on the creation and validation of the HydroLAKES dataset can be found in Messager et al. (2016).

LimnoSat

The LimnoSat-US (Topp et al. 2020) dataset comprises over 22 million remotely sensed observations of lake surface reflectance from 1984 to 2020. Observations cover over 50,000 lakes greater than 10 ha (Messager et al. 2016) aggregated from Landsat 5, 7, and 8 Collection 1 imagery. Each observation was calculated by taking the median surface reflectance within 120 meters of each lake’s Chebyshev center, defined as the point farthest from shore, which is usually located at the lake’s deepest point (Shen et al. 2015). Extracting reflectance values from the Chebyshev center minimizes signals due to bottom reflectance and adjacent land pixels. For each observation, pixels not classified as high-confidence water were masked using the Dynamic Surface Water Extent algorithm (Jones 2019). Observations were removed if the scene cloud cover was greater than 75%; if any snow, ice, cloud, cloud shadow (Foga et al. 2017), or hillshade was detected over the lake’s Chebyshev center; or if there were fewer than eight high-confidence water pixels within the 120 meter buffer of the lake’s Chebyshev center. For certain lakes, these filters lead to extended periods with limited observations. Data in LimnoSat-US are presented in a tabular format, where each row reflects a Landsat overpass for a given waterbody, and columns include median Collection 1 surface reflectance values by band extracted from pixels within 120 m of the Chebyshev center, scene-wide cloud cover, date of imagery acquisition, and number of water pixels within 120 m of the Chebyshev center.
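The three removal rules described above can be sketched as a single predicate. This is an illustration of the stated thresholds only, not the LimnoSat-US code: the class, field names, and function name are hypothetical, and the quality flags are collapsed into one boolean for brevity.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One Landsat overpass for one lake (hypothetical field names)."""
    scene_cloud_cover: float   # percent of the scene, 0-100
    center_obstructed: bool    # snow, ice, cloud, cloud shadow, or hillshade over the Chebyshev center
    water_pixel_count: int     # high-confidence water pixels within the 120 m buffer


def keep_observation(obs, max_cloud_cover=75.0, min_water_pixels=8):
    """Apply the three filters described in the text; return True if the observation is kept."""
    if obs.scene_cloud_cover > max_cloud_cover:
        return False  # scene cloud cover greater than 75%
    if obs.center_obstructed:
        return False  # any obstruction flagged over the lake center
    if obs.water_pixel_count < min_water_pixels:
        return False  # fewer than eight high-confidence water pixels
    return True


print(keep_observation(Observation(10.0, False, 12)))
```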

Step 2: Define Lake Trophic State

Many lakes across the United States are experiencing changes in their water clarity: some are getting greener due to eutrophication, others are getting browner from increasing terrestrially derived organic matter, and some are simultaneously ‘greening’ and ‘browning’ (Leech et al. 2018). Given the need to discriminate between lakes that may be browning and/or greening, the Nutrient Color Paradigm (NCP) is a useful tool to assign lake trophic state (LTS) based on a lake’s characteristic color.

The NCP was initially proposed in the early 20th century, emphasizing that both autochthonous and allochthonous processes are important to understanding LTS (Naumann 1917; Thienemann 1921; Järnefelt 1925). Specifically, water color often affects algal biomass and light transparency independent of nutrient availability. Rodhe (1969) first assembled the four quadrants of the NCP, placing autochthony on the horizontal axis and allochthony on the vertical axis. This second dimension distinguishes “oligotrophic” (low nutrient, low color) lakes, “eutrophic” (high nutrient, low color) lakes, “dystrophic” (low nutrient, high color) lakes, and “mixotrophic” (high nutrient, high color) lakes.

While metrics such as Trophic State Index (Carlson 1977) gained popularity for providing instantaneous assessments of a lake’s autotrophic production, Williamson et al. (1999) encouraged a focus on NCP for lake classification given the importance of both nutrients and dissolved organic matter to lake structure and function. The NCP’s implementation is empirically supported by studies like Webster et al. (2008), where an analysis of ~1,600 temperate lakes in North America demonstrated that within lakes grouped by total phosphorus concentration (i.e., oligotrophic, mesotrophic, or eutrophic), those with ‘browner’ color (indicative of dissolved organic matter) had higher volumetric chlorophyll-a concentrations and shallower Secchi disk depths. A similar pattern was observed by Nürnberg and Shaw (1998), which analyzed 600 lakes spanning a latitude of 39°S to 82°N.

Here, we used the thresholds published in Webster et al. (2008) to classify lakes in the NLA dataset. Lakes were described as oligotrophic or ‘blue’ if total phosphorus concentration was less than 30 μg/L and true color was less than 20 platinum cobalt units (PCU); eutrophic/mixotrophic or ‘green/murky’ if total phosphorus was greater than 30 μg/L; and dystrophic or ‘brown’ if total phosphorus was less than 30 μg/L and true color was greater than 20 PCU. Thresholds for total phosphorus are based on long established and widely accepted ranges affecting primary productivity (Wetzel 2001). True color thresholds are derived from Nürnberg and Shaw (1998).
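The threshold rules above translate directly into a small decision function. The sketch below is illustrative: the function and constant names are hypothetical, and because the text does not say how values exactly equal to a threshold (30 μg/L or 20 PCU) are handled, this version assigns them to the lower class, which is an assumption.

```python
TP_THRESHOLD_UG_L = 30.0     # total phosphorus threshold, micrograms per liter
COLOR_THRESHOLD_PCU = 20.0   # true color threshold, platinum cobalt units


def classify_ncp(total_phosphorus_ug_l: float, true_color_pcu: float) -> str:
    """Assign a Nutrient Color Paradigm trophic state from total phosphorus and true color."""
    if total_phosphorus_ug_l > TP_THRESHOLD_UG_L:
        # High nutrients: eutrophic and mixotrophic lakes are grouped together.
        return "eutrophic/mixotrophic"
    if true_color_pcu > COLOR_THRESHOLD_PCU:
        return "dystrophic"      # low nutrients, high color
    # Low nutrients, low color. Values exactly at a threshold land here (assumption).
    return "oligotrophic"


print(classify_ncp(12.0, 8.0))
print(classify_ncp(12.0, 45.0))
print(classify_ncp(80.0, 45.0))
```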

Step 3: Create a training dataset

First, to create a dataset of lakes with in situ LTS measurements, we aggregated all total phosphorus and true color measurements from the US EPA NLA 2007, 2012, and 2017 data. While the NLA includes lakes smaller than 10 ha, we only used lakes of at least 10 ha in area, for consistency with the HydroLAKES database. We then assessed the extent to which seasonally shifting total phosphorus concentrations may alter interpretation of trophic state for a given lake using the subset of lakes that were sampled intra-annually. We calculated the percentage of lakes that transitioned between trophic states within a single year and found that lakes broadly remained in the same NCP trophic state throughout a given summer (85.1% of lakes). Of the lakes that changed trophic state during a sampling season (14.9%), the majority transitioned from oligotrophic (61.5% of changing lakes; 8.7% of all lakes) or dystrophic (15.4% of changing lakes; 2.2% of all lakes) to eutrophic/mixotrophic. Few lakes transitioned from oligotrophic to dystrophic (15.4% of changing lakes; 2.2% of all lakes), and even fewer transitioned to oligotrophic from either dystrophic (3.9% of changing lakes; 0.5% of all lakes) or eutrophic/mixotrophic (3.9% of changing lakes; 0.5% of all lakes). No lakes transitioned from eutrophic/mixotrophic to dystrophic in any of the three NLA campaigns. Broadly, lakes transitioned between trophic states when they were located near a threshold for trophic state delineation (15-45 μg/L total phosphorus or 11-29 PCU). These results mirror those in Leech et al. (2018) and suggest that, although some lakes change trophic state, the majority do not, and those that do usually fall along an edge of an NCP-determined trophic state. Thus, for lakes sampled twice in one sampling campaign, we averaged total phosphorus and true color estimates.

Second, to match the in situ trophic states with remotely sensed imagery, we merged the complete 2007, 2012, and 2017 NLA dataset with the LimnoSat-US dataset (Topp et al. 2020), where each NLA lake-year had corresponding Landsat spectral data. Because the NLA is designed to describe lakes’ summertime conditions, we filtered LimnoSat-US observations to those occurring in June, July, and August, which we a priori defined as the summertime season for the contiguous U.S.; then, to create a characteristic reflectance for a given lake-year, we computed each lake-year’s median summertime reflectance for red, blue, green, and near-infrared bands. Because LimnoSat-US compiles reflectance values from Landsat 5, 7, and 8, there are differences in the number of images per lake and year. In particular, images from 1984 through 1998 were solely collected from Landsat 5, when lakes averaged 3.04 images per summer (min: 2.43 images; max: 3.64 images). From 1999 through 2012, summertime imagery was gathered from Landsat 5 and 7, when lakes averaged 5.64 images per summer (min: 3.37 images; max: 6.42 images). From 2013 through 2019, summertime imagery was collected from Landsat 7 and 8, when lakes averaged 5.42 images per summer (min: 4.87 images; max: 5.87 images).
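The summertime filtering and per-lake-year median described above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline's code (which is written in R): the record layout, key names, and example values are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import median

BANDS = ("red", "green", "blue", "nir")
SUMMER_MONTHS = {6, 7, 8}  # June, July, August, as defined in the text


def summer_medians(observations):
    """Return {(lake_id, year): {band: median reflectance}} using summer overpasses only."""
    grouped = defaultdict(list)
    for obs in observations:
        if obs["date"].month in SUMMER_MONTHS:
            grouped[(obs["lake_id"], obs["date"].year)].append(obs)
    return {
        key: {band: median(o[band] for o in group) for band in BANDS}
        for key, group in grouped.items()
    }


# Hypothetical overpasses for one lake; the October scene is excluded.
example_obs = [
    {"lake_id": 1, "date": date(2007, 6, 10), "red": 0.02, "green": 0.05, "blue": 0.04, "nir": 0.01},
    {"lake_id": 1, "date": date(2007, 7, 12), "red": 0.05, "green": 0.06, "blue": 0.03, "nir": 0.02},
    {"lake_id": 1, "date": date(2007, 8, 13), "red": 0.03, "green": 0.04, "blue": 0.05, "nir": 0.03},
    {"lake_id": 1, "date": date(2007, 10, 1), "red": 0.50, "green": 0.50, "blue": 0.50, "nir": 0.50},
]
print(summer_medians(example_obs))
```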

Third, to better characterize spectral bands’ relative reflectance, we normalized each lake’s summertime median red, green, blue, and near-infrared band by the sum of all four bands. This normalization allowed us to differentiate lakes by trophic state based on their most prominent reflectances. For example, we anticipated that oligotrophic lakes would be dominated by blue and green reflectances relative to the red and near-infrared bands. In contrast, dystrophic lakes would be dominated by the near-infrared and blue bands relative to green and red bands. These relative reflectances were ultimately intended to discriminate among lakes that were optically similar in the visible spectrum (i.e., oligotrophic and dystrophic lakes). Notably, the decision to use median summertime relative reflectances differed from previous work (e.g., Topp et al. 2020) that focused on the dominant wavelength, which was an aggregation of wavelengths detected in the visible spectrum and has been used to discriminate autotrophic production (i.e., blue vs green lakes), but not dystrophic states. Thus, our methods are better suited towards discriminating between oligotrophic and dystrophic lakes, whereas the dominant wavelength approach would consider both of these lake types to be “blue”.
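A minimal sketch of the normalization described above: each band's median reflectance is divided by the sum of the four bands, so the resulting values describe the shape of the spectrum rather than its brightness. The function name and the example reflectances are hypothetical.

```python
def relative_reflectance(red: float, green: float, blue: float, nir: float) -> dict:
    """Normalize each band by the sum of all four bands."""
    total = red + green + blue + nir
    if total == 0:
        raise ValueError("band sum is zero; relative reflectance is undefined")
    return {
        "red": red / total,
        "green": green / total,
        "blue": blue / total,
        "nir": nir / total,
    }


# Hypothetical lake-year medians: blue and green dominate.
example_relative = relative_reflectance(red=0.02, green=0.05, blue=0.06, nir=0.01)
print(example_relative)
```

Because the four relative values always sum to one, two lakes with the same spectral shape but different overall brightness end up with the same inputs to the classifiers.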

Step 4: Create Classification Models

To find an optimal performing classifier for lakes with unknown LTS, we employed three classification methods to predict trophic state: multinomial logistic regression (Venables and Ripley 2002), extreme gradient boosting regression (Friedman 2001), and a neural network using multilayer perceptrons (Rosenblatt 1958). Logistic regression is a parametric classification method, whereas gradient boosted regression and multilayer perceptrons are machine learning methods. The methods differ in how they make classifications. Using trophic state as a categorical response variable, logistic regression applies a linear regression of log-odds ratios to model the probability of a given trophic state for each lake. In contrast, gradient boosted regression applies decision trees to iteratively improve its predictions. Multilayer perceptrons apply a type of feedforward artificial neural network in which a backpropagation algorithm is used to subsequently update the individual weights of each neuron unit by comparing modeled predictions to the training data.

For each modeling method, we used z-scored, relative red, green, blue, and near-infrared reflectances for input data. Model performance and potential for overfitting were assessed using a 90:10 train:test data split with spatial-holdout cross-validation.
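The z-scoring step can be sketched as below. Two details are assumptions, since the text does not specify them: the sample standard deviation (n - 1) is used, and the center and scale are estimated once and then reused for new values, which is a common practice when applying a fitted model to unseen data. All names are illustrative.

```python
from statistics import mean, stdev


def fit_scaler(values):
    """Estimate the center and scale used for z-scoring (sample standard deviation, assumed)."""
    return mean(values), stdev(values)


def z_score(values, center, scale):
    """Subtract the center and divide by the scale for each value."""
    return [(v - center) / scale for v in values]


# Hypothetical relative reflectances of one band across training lake-years.
training_band = [1.0, 2.0, 3.0]
center, scale = fit_scaler(training_band)
print(z_score(training_band, center, scale))
```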

Initial hyperparameters for the gradient boosted regression and multilayer perceptron models were tuned by holding out 20% of each trophic class from the training observations to use for validation and conducting a coarse grid-search across the hyperparameter space. For each combination of hyperparameters, models were trained until validation performance did not increase for 20 consecutive epochs using categorical cross entropy as the objective function. During the multilayer perceptron hyperparameter tuning, we iterated through model fits using all combinations of 5, 10, and 20 hidden layers as well as a learning rate of 0.01, 0.001, and 0.0005. Multilayer perceptron hyperparameter tuning metrics were optimal for models with 20 hidden units and a learning rate of 0.001. During the gradient boosted regression hyperparameter tuning, we iterated through model fits using all combinations of 2, 3, and 4 maximum tree depths, subsample as well as column samples of 0.5 and 0.8, step sizes of 0.01 and 0.1, as well as a minimum child weight of 1 and 3. Gradient boosted regression hyperparameter tuning metrics were optimal for models with a max depth of 4, subsample of 0.5, column sample of 0.5, step size of 0.01, and minimum child weight of 1. For both multilayer perceptron and gradient boosted regression models, best performing hyperparameter tuning metrics were assessed by having lowest validation loss values.
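To make the size of the search concrete, the sketch below enumerates the two grids described above, selects a combination by lowest validation loss, and applies a patience-based stopping rule. It does not train any model. The dictionary keys are descriptive placeholders rather than the argument names of any particular library, and "hidden_size" is a neutral label because the text refers to the values 5, 10, and 20 as both hidden layers and hidden units.

```python
from itertools import product

MLP_GRID = {
    "hidden_size": [5, 10, 20],
    "learning_rate": [0.01, 0.001, 0.0005],
}
XGB_GRID = {
    "max_depth": [2, 3, 4],
    "subsample": [0.5, 0.8],
    "column_sample": [0.5, 0.8],
    "step_size": [0.01, 0.1],
    "min_child_weight": [1, 3],
}


def expand_grid(grid):
    """Return every combination of hyperparameter values as a list of dicts."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]


def best_by_validation_loss(results):
    """Pick the result with the lowest validation loss; each result has a 'val_loss' key."""
    return min(results, key=lambda r: r["val_loss"])


def early_stopping_epoch(val_losses, patience=20):
    """Return the 1-based epoch at which training stops, or None if the rule never triggers."""
    best = float("inf")
    since_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            since_improvement = 0
        else:
            since_improvement += 1
            if since_improvement >= patience:
                return epoch
    return None


print(len(expand_grid(MLP_GRID)), len(expand_grid(XGB_GRID)))
```

Under this reading of the text, the coarse search covers 9 multilayer perceptron configurations and 48 gradient boosted configurations.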

These hyperparameters were then used in a spatial cross-validation routine (sensu Willard et al. 2021), in which no lake appeared in both the training and test data. During the spatial cross-validation routine, training data were divided into five folds, such that lakes within each test partition were not present in the remaining training partitions (i.e., test metrics represent performance on unseen lakes). Training data within each fold were then partitioned into a 90:10 split, with 10% of each trophic class set aside for an inner-loop fold validation. Models within each fold were trained using an early stopping criterion of 20 epochs to avoid overfitting on the training data. This inner-fold validation was additionally used to tune the number of epochs for the final models. Finally, overall error metrics were calculated based on the mean prediction accuracy of the test partitions withheld from the inner-loop training of each fold. All reported metrics are based on the test partitions from the spatial cross-validation routine, while final models were trained on the full dataset using the hyperparameters identified from the grid-search and inner-loop validation routines.
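The key property of the spatial cross-validation described above is that folds are assigned per lake, not per lake-year, so every observation of a lake falls on the same side of each split. A minimal sketch follows. The random assignment of lakes to folds, the seed, and the function names are assumptions; the stratified 90:10 inner split is not shown.

```python
import random


def lake_folds(lake_ids, n_folds=5, seed=0):
    """Assign each unique lake to exactly one fold (random assignment is an assumption)."""
    unique = sorted(set(lake_ids))
    rng = random.Random(seed)
    rng.shuffle(unique)
    return {lake: i % n_folds for i, lake in enumerate(unique)}


def split_indices(lake_ids, fold_of, test_fold):
    """Return (train, test) row indices such that no lake appears in both."""
    train = [i for i, lake in enumerate(lake_ids) if fold_of[lake] != test_fold]
    test = [i for i, lake in enumerate(lake_ids) if fold_of[lake] == test_fold]
    return train, test


# Hypothetical lake identifiers, one per lake-year row; some lakes repeat.
rows = [101, 101, 102, 103, 103, 104, 105, 106, 106, 107]
fold_of = lake_folds(rows)
for fold in range(5):
    train_idx, test_idx = split_indices(rows, fold_of, fold)
    print(fold, len(train_idx), len(test_idx))
```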

Model diagnostics and performance were calculated using test data, but the final models used to create the final dataset were constructed using all of the data in LimnoSat-US. Once the final models were validated, we applied them to make predictions for all of the more than 56,000 lakes in the LimnoSat-US dataset.

Step 5: Assess and Compare Model Performance

To evaluate the final fitted models, we used test-data predictions from the spatial-holdout routine to calculate each model’s overall and balanced accuracy, its receiver operating characteristic (ROC) curve, and the area under the ROC curve (AUC). Overall accuracy was calculated as the sum of true positives and true negatives divided by the total number of LTS predictions. Balanced accuracy was calculated for a single lake trophic state as the mean of its true positive rate and true negative rate. Whereas overall accuracy can be biased towards more prevalent trophic states (i.e., eutrophic and oligotrophic lakes), balanced accuracy is useful for assessing a model’s capacity to predict rarer trophic states (i.e., dystrophic lakes). As an additional metric of model performance, we calculated the AUC of each model’s ROC curve. The ROC curve plots the rate of correct classification against the rate of false classification. An AUC of 0.5 indicates that the false prediction rate increases 1:1 with the rate of correct prediction, as expected from random guessing. AUCs greater than 0.5 imply a model that performs better than random, even when the false positive rate is artificially inflated. Thus, comparing overall and balanced accuracy as well as ROC curves and AUCs allowed us to assess how models performed broadly and how robustly they predicted trophic state.
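The metrics named above can be illustrated with a short Python sketch. It assumes the conventional definitions: per-class balanced accuracy as the mean of sensitivity and specificity, and a one-vs-rest AUC computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. The function names and the toy labels are hypothetical and are not drawn from the dataset.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of predictions that match the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred, cls):
    """Mean of sensitivity and specificity for one class (one-vs-rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    tn = sum(t != cls and p != cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def auc_one_vs_rest(y_true, scores, cls):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == cls]
    neg = [s for t, s in zip(y_true, scores) if t != cls]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with an intentionally rare dystrophic class.
y_true = ["eu", "eu", "eu", "eu", "oligo", "oligo", "oligo", "dys"]
y_pred = ["eu", "eu", "eu", "oligo", "oligo", "oligo", "eu", "eu"]
print(overall_accuracy(y_true, y_pred))          # 0.625
print(balanced_accuracy(y_true, y_pred, "dys"))  # 0.5
```

In the toy example the single dystrophic lake is never predicted correctly, so its balanced accuracy drops to 0.5 even though overall accuracy is 0.625, which is the kind of gap between prevalent and rare classes that the paragraph describes.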

Beyond model performance, we also evaluated whether model coefficients and variable importance for trophic state discrimination reflected NCP groupings. For increased interpretability across all three models, we employed SHAP (SHapley Additive exPlanation) analysis (Shapley 1953; Štrumbelj and Kononenko 2014; Lundberg and Lee 2017) to better understand individual feature importance and influence in model predictions. SHAP analysis yields insight into the marginal contribution of a given feature (e.g., near-infrared spectra) on model output - in this case trophic state - and helps decode ‘black box’ results. Understanding the relative contribution of individual features in trophic state prediction not only helps explain feature roles in model accuracy and misclassification but also quantitatively connects features, such as remotely sensed data, to the biophysical parameters in which LTS prediction is grounded. SHAP feature contribution was calculated for blue, green, red, and near-infrared Landsat spectra. SHAP feature contribution was scored for oligotrophic, dystrophic, and eutrophic classifications and across each of the three models. This scoring illuminates the relationship among feature values and SHAP contribution for a given trophic state classification and for a given model. Specifically, for classification problems, a positive SHAP value indicates that a given input contributed to a positive classification while a negative value indicates the input contributed to a low probability for a given classification.
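To make the idea of a marginal contribution concrete, the sketch below computes exact Shapley values for a toy scoring function over the four spectral bands by averaging each band's marginal contribution over every ordering of the features. It is an illustration only: toy_model, the reflectance values, and the all-zero baseline are hypothetical, and exact enumeration is used here for clarity rather than the approximate sampling that SHAP software typically relies on.

```python
from itertools import permutations

FEATURES = ["blue", "green", "red", "nir"]


def toy_model(x):
    """Hypothetical score for one trophic class; not the dataset's fitted model."""
    return 0.5 * x["green"] - 0.3 * x["blue"] + 0.2 * x["red"] * x["nir"]


def shapley_values(model, x, baseline):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    phi = {f: 0.0 for f in x}
    orderings = list(permutations(x))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for f in order:
            current[f] = x[f]  # switch this feature from baseline to observed value
            value = model(current)
            phi[f] += value - previous
            previous = value
    return {f: total / len(orderings) for f, total in phi.items()}


x = {"blue": 0.2, "green": 0.6, "red": 0.4, "nir": 0.5}
baseline = {f: 0.0 for f in FEATURES}
phi = shapley_values(toy_model, x, baseline)
```

In this toy case green receives a positive value and blue a negative one, while the red and near-infrared bands split their interaction term equally. The values sum to the difference between the model output for x and for the baseline, which is the additive property that lets SHAP contributions be compared across features.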

Code Availability

All data harmonization, modeling, and validation procedures for the LTS-US dataset were scripted in the R Statistical Environment (R Core Team 2022), using the tidyverse (Wickham et al. 2019), lubridate (Grolemund and Wickham 2011), data.table (Dowle and Srinivasan 2021), sf (Pebesma 2018), keras (Allaire and Chollet 2022), tensorflow (Allaire and Tang 2022), caret (Kuhn 2022), CAST (Meyer et al. 2022), yaml (Garbett et al. 2022), reticulate (Ushey et al. 2022), xgboost (Chen et al. 2022), nnet (Venables and Ripley 2002), viridis (Garnier et al. 2021), trend (Pohlert 2020), multiROC (Wei and Wang 2018), ggpubr (Kassambara 2020), fastshap (Greenwell 2021), maps (Becker et al. 2021), ggtext (Wilke and Wiernik 2022), and ggforce (Pedersen 2022) packages.

To enhance reproducibility, all scripts are designed to work within a single pipeline that uses the targets package (Landau 2021). The targets pipeline is divided into four main components: “1_aggregate”, “2_train”, “3_predict”, and “4_qc”. Each component corresponds to one of the steps presented above and can be customized by future users to fit their specific needs. The associated pipeline setup and user guide can be found on the dataset’s companion Git repository, where the main ReadMe file details directory architecture and how to execute the pipeline.

To ensure reproducibility across operating platforms, all scripts for the pipeline can be executed within a container. Running the pipeline within the container allows users to execute the entire pipeline without making small, yet important, edits to the code or configuring their own operating environment to conform to the pipeline’s requirements. For example, recent versions of the sf package default to using the s2 spherical geometry engine instead of GEOS, which assumes planar coordinates. End users on a system with one version of the sf library might need to adjust the code to use the correct geometry engine, whereas users with another version might be able to run the pipeline without any adjustments. The container fixes a known-working set of libraries, both at the system level (e.g., GEOS, GDAL, PROJ) and at the R level (e.g., sf), so that anybody can run the code without reconfiguring their own environment. This also provides future-proofing by ensuring that the inevitable changes to other libraries over time do not lead to errors. To help end users who are less familiar with running containerized code, a tutorial for installing and executing the pipeline within the container is located in the Environmental Data Initiative repository as a compressed entity (see README.pdf).

References

Allaire, J. J., and F. Chollet. 2022. keras: R Interface to “Keras.”

Allaire, J. J., and Y. Tang. 2022. tensorflow: R Interface to “TensorFlow.”

APHA. 1999. Standard Methods for the Examination of Water and Wastewater. American Public Health Association, Washington, DC.

Becker, R. A., A. R. Wilks, R. Brownrigg, T. P. Minka, and A. Deckmyn. 2021. maps: Draw Geographical Maps.

Carlson, R. E. 1977. A trophic state index for lakes. Limnol. Oceanogr. 22: 361–369. doi:10.4319/lo.1977.22.2.0361

Chen, T., T. He, M. Benesty, and others. 2022. xgboost: Extreme Gradient Boosting.

Dowle, M., and A. Srinivasan. 2021. data.table: Extension of `data.frame`.

Foga, S., P. L. Scaramuzza, S. Guo, and others. 2017. Cloud detection algorithm comparison and validation for operational Landsat data products. Remote Sens. Environ. 194: 379–390. doi:10.1016/j.rse.2017.03.026

Friedman, J. H. 2001. Greedy function approximation: A gradient boosting machine. Ann. Stat. 29: 1189–1232. doi:10.1214/aos/1013203451

Garbett, S. P., J. Stephens, K. Simonov, and others. 2022. yaml: Methods to Convert R Data to YAML and Back.

Garnier, S., N. Ross, and others. 2021. viridis: Colorblind-Friendly Color Maps for R.

Greenwell, B. 2021. fastshap: Fast Approximate Shapley Values.

Grolemund, G., and H. Wickham. 2011. Dates and Times Made Easy with lubridate. J. Stat. Softw. 40: 1–25.

Jones, J. W. 2019. Improved Automated Detection of Subpixel-Scale Inundation—Revised Dynamic Surface Water Extent (DSWE) Partial Surface Water Tests. Remote Sens. 11: 374. doi:10.3390/rs11040374

Järnefelt, H. 1925. Zur Limnologie einiger Gewässer Finnlands. Soc Zool Bot Fenn. Vanamo 2: 185–352.

Kassambara, A. 2020. ggpubr: “ggplot2” Based Publication Ready Plots.

Kuhn, M. 2022. caret: Classification and Regression Training.

Landau, W. M. 2021. The targets R package: a dynamic Make-like function-oriented pipeline toolkit for reproducibility and high-performance computing. J. Open Source Softw. 6: 2959.

Leech, D. M., A. I. Pollard, S. G. Labou, and S. E. Hampton. 2018. Fewer blue lakes and more murky lakes across the continental U.S.: Implications for planktonic food webs. Limnol. Oceanogr. 63: 2661–2680. doi:10.1002/lno.10967

Lundberg, S. M., and S.-I. Lee. 2017. A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems. Curran Associates, Inc.

Messager, M. L., B. Lehner, G. Grill, I. Nedeva, and O. Schmitt. 2016. Estimating the volume and age of water stored in global lakes using a geo-statistical approach. Nat. Commun. 7: 13603. doi:10.1038/ncomms13603

Meyer, H., C. Milà, and M. Ludwig. 2022. CAST: “caret” Applications for Spatial-Temporal Models.

Naumann, E. 1917. Undersökningar över fytoplankton och under den pelagiska regionen försiggående gyttje- och dybildningar inom vissa syd- och mellansvenska urbergsvatten. K Sv Vetensk Akad Handl 56: 1–165.

Nürnberg, G. K., and M. Shaw. 1998. Productivity of clear and humic lakes: nutrients, phytoplankton, bacteria. Hydrobiologia 382: 97–112. doi:10.1023/A:1003445406964

Pebesma, E. 2018. Simple Features for R: Standardized Support for Spatial Vector Data. R J. 10: 439–446. doi:10.32614/RJ-2018-009

Pedersen, T. L. 2022. ggforce: Accelerating “ggplot2.”

Pohlert, T. 2020. trend: Non-Parametric Trend Tests and Change-Point Detection.

R Core Team. 2022. R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing.

Robinson, N., J. Regetz, and R. P. Guralnick. 2014. EarthEnv-DEM90: A nearly-global, void-free, multi-scale smoothed, 90m digital elevation model from fused ASTER and SRTM data. ISPRS J. Photogramm. Remote Sens. 87: 57–67. doi:10.1016/j.isprsjprs.2013.11.002

Rohde, W. 1969. Crystallization of Eutrophication Concepts in Northern Europe, p. 20256. In Eutrophication: Causes, Consequences, Correctives. National Academies Press.

Rosenblatt, F. 1958. The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 65: 386–408. doi:10.1037/h0042519

Shapley, L. S. 1953. 17. A Value for n-Person Games, p. 307–318. In H.W. Kuhn and A.W. Tucker [eds.], Contributions to the Theory of Games (AM-28), Volume II. Princeton University Press.

Shen, Z., X. Yu, Y. Sheng, J. Li, and J. Luo. 2015. A Fast Algorithm to Estimate the Deepest Points of Lakes for Regional Lake Registration. PLOS ONE 10: e0144700. doi:10.1371/journal.pone.0144700

Štrumbelj, E., and I. Kononenko. 2014. Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst. 41: 647–665. doi:10.1007/s10115-013-0679-x

Thienemann, A. 1921. Seetypen. Naturwissenschaften 9.

Topp, S., T. Pavelsky, X. Yang, J. Gardner, and M. R. V. Ross. 2020. LimnoSat-US: A Remote Sensing Dataset for U.S. Lakes from 1984-2020. doi:10.5281/zenodo.4139695

USEPA. 1987. Handbook of Methods for Acid Deposition Studies: Laboratory Analyses for Surface Water Chemistry, U.S. Environmental Protection Agency, Office of Research and Development.

USEPA. 2007. Survey of the Nation’s Lakes. Field Operations Manual. EPA 841-B-07004. U.S. Environmental Protection Agency.

USEPA. 2011. 2012 National Lakes Assessment. Field Operations Manual. EPA 841-B-11-003. U.S. Environmental Protection Agency.

USEPA. 2012. National Lakes Assessment. Laboratory Operations Manual. EPA-841-B-11-004. U.S. Environmental Protection Agency.

USEPA. 2017a. National Lakes Assessment 2017. Field Operations Manual. EPA 841-B-16-002. U.S. Environmental Protection Agency.

USEPA. 2017b. National Lakes Assessment 2017. Laboratory Operations Manual. V.1.1. EPA 841-B-16-004. U.S. Environmental Protection Agency.

Ushey, K., J. J. Allaire, and Y. Tang. 2022. reticulate: Interface to “Python.”

Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with S, 4th edition. Springer.

Webster, K. E., P. A. Soranno, K. S. Cheruvelil, and others. 2008. An empirical evaluation of the nutrient-color paradigm for lakes. Limnol. Oceanogr. 53: 1137–1148. doi:10.4319/lo.2008.53.3.1137

Wei, R., and J. Wang. 2018. multiROC: Calculating and Visualizing ROC and PR Curves Across Multi-Class Classifications.

Wetzel, R. G. 2001. Limnology: Lake and River Ecosystems, 3rd edition. Academic Press.

Wickham, H., M. Averick, J. Bryan, and others. 2019. Welcome to the tidyverse. J. Open Source Softw. 4: 1686. doi:10.21105/joss.01686

Wilke, C. O., and B. M. Wiernik. 2022. ggtext: Improved Text Rendering Support for “ggplot2.”

Willard, J. D., J. S. Read, A. P. Appling, S. K. Oliver, X. Jia, and V. Kumar. 2021. Predicting Water Temperature Dynamics of Unmonitored Lakes With Meta-Transfer Learning. Water Resour. Res. 57: e2021WR029579. doi:10.1029/2021WR029579

Williamson, C. E., D. P. Morris, M. L. Pace, and O. G. Olson. 1999. Dissolved organic carbon and nutrients as regulators of lake ecosystems: Resurrection of a more integrated paradigm. Limnol. Oceanogr. 44: 795–803. doi:10.4319/lo.1999.44.3_part_2.0795

People and Organizations

Publishers:
Organization:Environmental Data Initiative
Email Address:
info@edirepository.org
Web Address:
https://edirepository.org
Id:https://ror.org/0330j0z60
Creators:
Individual:Dr Michael F Meyer
Organization:U.S. Geological Survey
Position:Research Geographer
Address:
1 Gifford Pinchot Dr,
Madison, Wisconsin 53726 United States
Phone:
3142582927 (voice)
Email Address:
mfmeyer@usgs.gov
Id:https://orcid.org/0000-0002-8034-9434
Individual:Dr Simon N Topp
Organization:U.S. Geological Survey
Position:Research Physical Scientist
Address:
102 Eugene St,
Carrboro, NC 27510 USA
Email Address:
stopp@usgs.gov
Id:https://orcid.org/0000-0001-7741-5982
Individual:Dr Tyler V King
Organization:U.S. Geological Survey
Position:Hydrologist
Address:
230 N. Collins Rd,
Boise, Idaho 83702 United States
Email Address:
tvking@usgs.gov
Id:https://orcid.org/0000-0002-5785-3077
Individual:Dr Robert Ladwig
Organization:Center for Limnology
Position:Post Doctoral Researcher
Address:
680 N Park St,
Madison, WI 53706 USA
Email Address:
rladwig2@wisc.edu
Id:https://orcid.org/0000-0001-8443-1999
Individual:Dr Rachel M Pilla
Organization:Oak Ridge National Lab
Address:
1 Bethel Valley Road,
Oak Ridge, TN 37831 USA
Email Address:
pillarm@ornl.gov
Id:https://orcid.org/0000-0001-9156-9486
Individual:Dr Hilary A Dugan
Organization:Center for Limnology
Position:Associate Professor
Address:
680 N Park St,
Madison, WI 53706 USA
Email Address:
hdugan@wisc.edu
Id:https://orcid.org/0000-0003-4674-1149
Individual:Dr Jack R Eggleston
Organization:U.S. Geological Survey
Position:Branch Chief, Hydrologic Remote Sensing Branch
Address:
11649 Leetown Road,
Kearneysville, WV 25430 United States
Email Address:
jegglest@usgs.gov
Id:https://orcid.org/0000-0001-6633-3041
Individual:Dr Stephanie E Hampton
Organization:Carnegie Institution for Science
Position:Deputy Director
Address:
1200 E. California Blvd.,
Pasadena, CA 91125 USA
Email Address:
shampton@carnegiescience.edu
Id:https://orcid.org/0000-0003-2389-4249
Individual:Dr Dina M Leech
Organization:Longwood University
Position:Associate Professor
Address:
201 High Street,
Farmville, VA 23909 USA
Email Address:
leechdm@longwood.edu
Id:https://orcid.org/0000-0002-0674-3433
Individual:Dr Isabella A Oleksy
Organization:University of Wyoming
Position:Post Doctoral Researcher
Address:
1000 E University Ave,
Laramie, WY 82071 USA
Email Address:
bellaoleksy@gmail.com
Id:https://orcid.org/0000-0003-2572-5457
Individual:Jesse C Ross
Organization:U.S. Geological Survey
Address:
Los Angeles, CA 90018 United States
Email Address:
jross@usgs.gov
Id:https://orcid.org/0000-0002-5422-8284
Individual:Dr Matthew RV Ross
Organization:Colorado State University
Address:
Fort Collins, CO 80523 United States
Email Address:
mrvr@colostate.edu
Id:https://orcid.org/0000-0001-9105-4255
Individual:Dr R Iestyn Woolway
Organization:Bangor University
Position:Assistant Professor
Address:
Askew St,
Menai Bridge, Anglesey LL59 5AB United Kingdom
Email Address:
iestyn.woolway@bangor.ac.uk
Id:https://orcid.org/0000-0003-0498-7968
Individual:Dr Xiao Yang
Organization:Southern Methodist University
Position:Assistant Professor
Address:
P.O. Box 750395,
Dallas, TX USA
Email Address:
xnayang@smu.edu
Id:https://orcid.org/0000-0002-0046-832X
Individual:Matthew R Brousil
Organization:Colorado State University
Address:
Fort Collins, CO 80523 United States
Email Address:
mbrousil@colostate.edu
Id:https://orcid.org/0000-0001-8229-9445
Individual:Dr Kate C Fickas
Organization:U.S. Geological Survey
Address:
47914 252nd St,
Sioux Falls, SD 57198 United States
Email Address:
kfickas@usgs.gov
Id:https://orcid.org/0000-0002-6617-2441
Individual:Dr Julie C Padowski
Organization:Washington State University
Address:
2001 Grimes Way,
Pullman, WA 99164 United States
Email Address:
julie.padowski@wsu.edu
Id:https://orcid.org/0000-0003-2337-4243
Individual:Dr Amina I Pollard
Organization:U.S. Environmental Protection Agency
Address:
1200 Pennsylvania Ave,
Washington, DC, 20460 United States
Email Address:
pollard.amina@epa.gov
Id:https://orcid.org/0000-0002-5010-0961
Individual:Dr Jianning Ren
Organization:University of Nevada - Reno
Position:Post Doctoral Researcher
Address:
1664 N. Virginia Street,
Reno, NV 89557 USA
Email Address:
nren@unr.edu
Id:https://orcid.org/0000-0002-5849-2189
Individual:Dr Jacob A Zwart
Organization:U.S. Geological Survey
Position:Research Data Scientist
Address:
2367 44th Ave,
San Francisco, CA 94116 United States
Email Address:
jzwart@usgs.gov
Id:https://orcid.org/0000-0002-3870-405X
Contacts:
Individual:Dr Michael F Meyer
Organization:U.S. Geological Survey
Position:Research Geographer
Address:
1 Gifford Pinchot Dr,
Madison, Wisconsin 53726 United States
Phone:
3142582927 (voice)
Email Address:
mfmeyer@usgs.gov
Id:https://orcid.org/0000-0002-8034-9434
Metadata Providers:
Individual:Dr Michael F Meyer
Organization:U.S. Geological Survey
Position:Research Geographer
Address:
1 Gifford Pinchot Dr,
Madison, Wisconsin 53726 United States
Phone:
3142582927 (voice)
Email Address:
mfmeyer@usgs.gov
Id:https://orcid.org/0000-0002-8034-9434

Temporal, Geographic and Taxonomic Coverage

Temporal, Geographic and/or Taxonomic information that applies to all data in this dataset:

Time Period
Begin:
1984
End:
2020
Geographic Region:
Description:Contiguous United States and southern Canada
Bounding Coordinates:
Northern:  49.22
Southern:  24.55
Western:  -125.54
Eastern:  -65.41

Project

Parent Project Information:

Title:Remote Sensing of Water Quality
Personnel:
Individual:Dr Michael F Meyer
Organization:U.S. Geological Survey
Position:Research Geographer
Address:
1 Gifford Pinchot Dr,
Madison, Wisconsin 53726 United States
Phone:
3142582927 (voice)
Email Address:
mfmeyer@usgs.gov
Id:https://orcid.org/0000-0002-8034-9434
Role:Project Lead

Maintenance

Maintenance:
Description:

Data collection is complete, and this dataset is static unless others, including co-authors, would like to update it.

Frequency:notPlanned
Other Metadata

Additional Metadata

additionalMetadata
        |___text '\n    '
        |___element 'metadata'
        |     |___text '\n      '
        |     |___element 'unitList'
        |     |     |___text '\n        '
        |     |     |___element 'unit'
        |     |     |     |  \___attribute 'id' = 'unitless'
        |     |     |     |  \___attribute 'name' = 'unitless'
        |     |     |     |___text '\n          '
        |     |     |     |___element 'description'
        |     |     |     |     |___text 'probability'
        |     |     |     |___text '\n        '
        |     |     |___text '\n        '
        |     |     |___element 'unit'
        |     |     |     |  \___attribute 'id' = 'year'
        |     |     |     |  \___attribute 'name' = 'year'
        |     |     |     |___text '\n          '
        |     |     |     |___element 'description'
        |     |     |     |___text '\n        '
        |     |     |___text '\n      '
        |     |___text '\n    '
        |___text '\n  '

Additional Metadata

additionalMetadata
        |___text '\n    '
        |___element 'metadata'
        |     |___text '\n      '
        |     |___element 'emlEditor'
        |     |        \___attribute 'app' = 'ezEML'
        |     |        \___attribute 'release' = '2023.02.19'
        |     |___text '\n    '
        |___text '\n  '

EDI is a collaboration between the University of New Mexico and the University of Wisconsin – Madison, Center for Limnology.