Data Package Metadata View Summary

National-scale, remotely sensed lake trophic state (LTS-US) 1984-2020

General Information

Data Package:

Local Identifier:

edi.78.2

Title:

National-scale, remotely sensed lake trophic state (LTS-US) 1984-2020

Alternate Identifier:

DOI PLACE HOLDER

Abstract:

Lake trophic state is a key water quality property that integrates a lake’s physical, chemical, and biological processes. Despite the importance of trophic state as a gauge of lake water quality, standardized and machine-readable observations are uncommon. Remote sensing presents an opportunity to detect and analyze lake trophic state with reproducible, robust methods across time and space. We used Landsat surface reflectance and lake morphometric data to create the first compendium of lake trophic state for more than 56,000 lakes of at least 10 ha in size throughout the contiguous United States from 1984 through 2020. The dataset was constructed with FAIR data principles (Findable, Accessible, Interoperable, and Reusable) in mind, where data are publicly available, relational keys from parent datasets are retained, and all data wrangling and modeling routines are scripted for future reuse. Together, this resource offers critical data to address basic and applied research questions about lake water quality at a suite of spatial and temporal scales.

Publication Date:

2023-03-14

For more information:
Visit:	DOI PLACE HOLDER

Time Period

Begin:

1984

End:

2020

People and Organizations
Contact:	Meyer, Michael F (U.S. Geological Survey, Research Geographer)
Creator:	Meyer, Michael F (U.S. Geological Survey, Research Geographer)
Creator:	Topp, Simon N (U.S. Geological Survey, Research Physical Scientist)
Creator:	King, Tyler V (U.S. Geological Survey, Hydrologist)
Creator:	Ladwig, Robert (Center for Limnology, Post Doctoral Researcher)
Creator:	Pilla, Rachel M (Oak Ridge National Lab)
Creator:	Dugan, Hilary A (Center for Limnology, Associate Professor)
Creator:	Eggleston, Jack R (U.S. Geological Survey, Branch Chief, Hydrologic Remote Sensing Branch)
Creator:	Hampton, Stephanie E (Carnegie Institution for Science, Deputy Director)
Creator:	Leech, Dina M (Longwood University, Associate Professor)
Creator:	Oleksy, Isabella A (University of Wyoming, Post Doctoral Researcher)
Creator:	Ross, Jesse C (U.S. Geological Survey)
Creator:	Ross, Matthew RV (Colorado State University)
Creator:	Woolway, R Iestyn (Bangor University, Assistant Professor)
Creator:	Yang, Xiao (Southern Methodist University, Assistant Professor)
Creator:	Brousil, Matthew R (Colorado State University)
Creator:	Fickas, Kate C (U.S. Geological Survey)
Creator:	Padowski, Julie C (Washington State University)
Creator:	Pollard, Amina I (U.S. Environmental Protection Agency)
Creator:	Ren, Jianning (University of Nevada - Reno, Post Doctoral Researcher)
Creator:	Zwart, Jacob A (U.S. Geological Survey, Research Data Scientist)

Data Entities
Data Table Name:	ensemble_predictions
Description:	Tabular dataset of mean probabilities of a given lake and year being of a given trophic state. A categorical variable (categorical_ts) defines which mean probability is highest.
Data Table Name:	individual_predictions
Description:	Tabular dataset of individual predictions for each lake trophic state and for each modeling technique.
Other Name:	README_targets
Description:	Instructions for setting up and running all supporting R code in the targets pipeline framework
Other Name:	README_container
Description:	README for using the Docker Container with the targets pipeline
Other Name:	scripts
Description:	Scripts associated with production of the data product
Other Name:	lts_container
Description:	Docker container for compute environment. This docker container allows future users to recreate the exact compute environment, where all datasets and quality control products were generated.
Other Name:	lake_trophic_status_docker_image
Description:	Pre-built docker image for running LTS-US pipeline in a containerized environment. Please see "README_container.pdf" for instructions on running the pipeline in a Docker container.

Detailed Metadata

Data Entities

Data Table


Data:	https://pasta-s.lternet.edu/package/data/eml/edi/78/2/2408a538ea395572153e16b189ccacc6
Name:	ensemble_predictions
Description:	Tabular dataset of mean probabilities of a given lake and year being of a given trophic state. A categorical variable (categorical_ts) defines which mean probability is highest.
Number of Records:	2059494
Number of Columns:	9

Table Structure

Object Name:

ensemble_predictions.csv

Size:

254774297 byte

Authentication:

580e66762b392bb0ed528513d5748b7e Calculated By MD5

Text Format:

Number of Header Lines:

Record Delimiter:

Orientation:

column

Simple Delimited:

Field Delimiter:	,
Quote Character:	"

Table Column Descriptions

Column Name	Definition
Hylak_id	HydroLAKES unique identifier of lake. Preserved from HydroLAKES input data to enable future merge with HydroLAKES attributes.
year	Year, spans 1984 through 2020.
mean_prob_dys	Probability that a lake-year combination is dystrophic. Probability is calculated by averaging probabilities from multinomial logistic regression, gradient boosted regression trees, and multilayer perceptron modeling methods.
var_prob_dys	Variance in probabilities among multinomial logistic regression, gradient boosted regression trees, and multilayer perceptron modeling methods that a given lake-year is dystrophic.
mean_prob_eumixo	Probability that a lake-year combination is eutrophic or mixotrophic. Probability is calculated by averaging probabilities from multinomial logistic regression, gradient boosted regression trees, and multilayer perceptron modeling methods.
var_prob_eumixo	Variance in probabilities among multinomial logistic regression, gradient boosted regression trees, and multilayer perceptron modeling methods that a given lake-year is eutrophic or mixotrophic.
mean_prob_oligo	Probability that a lake-year combination is oligotrophic. Probability is calculated by averaging probabilities from multinomial logistic regression, gradient boosted regression trees, and multilayer perceptron modeling methods.
var_prob_oligo	Variance in probabilities among multinomial logistic regression, gradient boosted regression trees, and multilayer perceptron modeling methods that a given lake-year is oligotrophic.
categorical_ts	Categorical predicted lake trophic status (i.e., oligotrophic, eutrophic_mixotrophic, dystrophic). Categorical prediction is based on the highest probability among mean_prob_dys, mean_prob_eumixo, and mean_prob_oligo.

Storage Type:

float

string

Measurement Type:

ratio

nominal

Measurement Values Domain:

Column Name	Unit	Type	Min	Max
Hylak_id	unitless	real	5	1069952
year	year	integer	1984	2020
mean_prob_dys	unitless	real	0.004105521	0.8159664
var_prob_dys	unitless	real	6.126111e-10	0.1707376
mean_prob_eumixo	unitless	real	0.01323732	0.987529
var_prob_eumixo	unitless	real	7.564333e-10	0.1328802
mean_prob_oligo	unitless	real	0.005209238	0.9684982
var_prob_oligo	unitless	real	1.558369e-09	0.1790787

Allowed Values and Definitions

Enumerated Domain

Code	Definition
NA	No prediction because missing appropriate satellite data for a given lake-year combination
dys	Lake-year combination classified as dystrophic
eu/mixo	Lake-year combination classified as eutrophic and/or mixotrophic
oligo	Lake-year combination classified as oligotrophic

Missing Value Code:

Column Name	Code	Explanation
Hylak_id	NA	Missing value
year	NA	Missing value
mean_prob_dys	NA	Missing value due to lack of remote sensing data for a given lake-year combination
var_prob_dys	NA	Missing value due to lack of remote sensing data for a given lake-year combination
mean_prob_eumixo	NA	Missing value due to lack of remote sensing data for a given lake-year combination
var_prob_eumixo	NA	Missing value due to lack of remote sensing data for a given lake-year combination
mean_prob_oligo	NA	Missing value due to lack of remote sensing data for a given lake-year combination
var_prob_oligo	NA	Missing value due to lack of remote sensing data for a given lake-year combination
categorical_ts	NA	Missing value due to lack of remote sensing data for a given lake-year combination
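
The relationship between the two data tables, as the column definitions describe it, can be sketched as follows. This is a minimal illustration and not the authors' pipeline code: the input values are made up, the use of the sample variance (n - 1 denominator) is an assumption, and the returned label reuses the column suffix (dys, eumixo, oligo) rather than the codes stored in categorical_ts.

```python
from statistics import mean, variance

# Hypothetical per-model probabilities for one lake-year, keyed like the
# columns of individual_predictions (mlr, mlp, xgb). Values are invented.
row = {
    "prob_dys_mlr": 0.10, "prob_dys_mlp": 0.20, "prob_dys_xgb": 0.15,
    "prob_eumixo_mlr": 0.60, "prob_eumixo_mlp": 0.50, "prob_eumixo_xgb": 0.55,
    "prob_oligo_mlr": 0.30, "prob_oligo_mlp": 0.30, "prob_oligo_xgb": 0.30,
}

MODELS = ("mlr", "mlp", "xgb")
STATES = ("dys", "eumixo", "oligo")


def summarize(row):
    """Return mean and variance per trophic state plus the most probable state."""
    out = {}
    for state in STATES:
        probs = [row[f"prob_{state}_{model}"] for model in MODELS]
        out[f"mean_prob_{state}"] = mean(probs)
        # Assumption: sample variance; the metadata does not say which
        # variance estimator was used.
        out[f"var_prob_{state}"] = variance(probs)
    # The categorical prediction is the state with the highest mean probability.
    out["categorical_ts"] = max(STATES, key=lambda s: out[f"mean_prob_{s}"])
    return out


summary = summarize(row)
print(summary["categorical_ts"])
```

Because both tables share Hylak_id and year, a check of this kind could be run row by row after joining them on those two keys.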

Accuracy Report:

Accuracy Assessment:

Coverage:

Methods:

Data Table


Data:	https://pasta-s.lternet.edu/package/data/eml/edi/78/2/95b484f3bf81cbe035e27ee51df4df32
Name:	individual_predictions
Description:	Tabular dataset of individual predictions for each lake trophic state and for each modeling technique.
Number of Records:	2059494
Number of Columns:	11

Table Structure

Object Name:

individual_predictions.csv

Size:

337130812 byte

Authentication:

0494d66398845c1ab4c8f753141de732 Calculated By MD5

Text Format:

Number of Header Lines:

Record Delimiter:

Orientation:

column

Simple Delimited:

Field Delimiter:	,
Quote Character:	"

Table Column Descriptions

Column Name	Definition
Hylak_id	HydroLAKES unique identifier of lake. Preserved from HydroLAKES input data to enable future merge with HydroLAKES attributes.
year	Year, spans 1984 through 2020.
prob_dys_mlr	Probability that a lake-year combination is dystrophic. Probability is calculated by multinomial, multiple logistic regression.
prob_eumixo_mlr	Probability that a lake-year combination is eutrophic or mixotrophic. Probability is calculated by multinomial, multiple logistic regression.
prob_oligo_mlr	Probability that a lake-year combination is oligotrophic. Probability is calculated by multinomial, multiple logistic regression.
prob_dys_mlp	Probability that a lake-year combination is dystrophic. Probability is calculated by multilayer perceptron.
prob_eumixo_mlp	Probability that a lake-year combination is eutrophic or mixotrophic. Probability is calculated by multilayer perceptron.
prob_oligo_mlp	Probability that a lake-year combination is oligotrophic. Probability is calculated by multilayer perceptron.
prob_dys_xgb	Probability that a lake-year combination is dystrophic. Probability is calculated by a gradient-boosted regression tree.
prob_eumixo_xgb	Probability that a lake-year combination is eutrophic or mixotrophic. Probability is calculated by a gradient-boosted regression tree.
prob_oligo_xgb	Probability that a lake-year combination is oligotrophic. Probability is calculated by a gradient-boosted regression tree.

Storage Type:

float

Measurement Type:

ratio

Measurement Values Domain:

Column Name	Unit	Type	Min	Max
Hylak_id	unitless	real	5	1069952
year	year	integer		
prob_dys_mlr	unitless	real	2.815192e-10	0.9890227
prob_eumixo_mlr	unitless	real	1.577883e-10	0.9999986
prob_oligo_mlr	unitless	real	1.002697e-11	0.999946
prob_dys_mlp	unitless	real	3.519079e-08	0.9939494
prob_eumixo_mlp	unitless	real	0.0001922453	1
prob_oligo_mlp	unitless	real	1.925952e-11	0.9991604
prob_dys_xgb	unitless	real	0.01076017	0.817612
prob_eumixo_xgb	unitless	real	0.03078391	0.9701928
prob_oligo_xgb	unitless	real	0.01226485	0.945454

Missing Value Code:

Column Name	Code	Explanation
Hylak_id	NA	Missing value
year	NA	Missing value
prob_dys_mlr	NA	Missing value due to lack of remote sensing data for a given lake-year combination
prob_eumixo_mlr	NA	Missing value due to lack of remote sensing data for a given lake-year combination
prob_oligo_mlr	NA	Missing value due to lack of remote sensing data for a given lake-year combination
prob_dys_mlp	NA	Missing value due to lack of remote sensing data for a given lake-year combination
prob_eumixo_mlp	NA	Missing value due to lack of remote sensing data for a given lake-year combination
prob_oligo_mlp	NA	Missing value due to lack of remote sensing data for a given lake-year combination
prob_dys_xgb	NA	Missing value due to lack of remote sensing data for a given lake-year combination
prob_eumixo_xgb	NA	Missing value due to lack of remote sensing data for a given lake-year combination
prob_oligo_xgb	NA	Missing value due to lack of remote sensing data for a given lake-year combination

Accuracy Report:

Accuracy Assessment:

Coverage:

Methods:

Non-Categorized Data Resource

Name:

README_targets

Entity Type:

pdf

Description:

Instructions for setting up and running all supporting R code in the targets pipeline framework

Physical Structure Description:

Object Name:

README_targets.pdf

Size:

309691 byte

Authentication:

39f88d2b7f453e1f9447ab91a0ff4668 Calculated By MD5

Externally Defined Format:

Format Name:

pdf

Data:

https://pasta-s.lternet.edu/package/data/eml/edi/78/2/7e82333274c569dd5237cb19d9589b7f

Non-Categorized Data Resource

Name:

README_container

Entity Type:

pdf

Description:

README for using the Docker Container with the targets pipeline

Physical Structure Description:

Object Name:

README_container.pdf

Size:

58824 byte

Authentication:

693c904d49d434d40fa0ca0dd3b39f43 Calculated By MD5

Externally Defined Format:

Format Name:

pdf

Data:

https://pasta-s.lternet.edu/package/data/eml/edi/78/2/60b2b77971451b04f4af928171bea459

Non-Categorized Data Resource

Name:

scripts

Entity Type:

zipped folder

Description:

Scripts associated with production of the data product

Physical Structure Description:

Object Name:

scripts.zip

Size:

31603 byte

Authentication:

f4e3fa1636a8150ad19c76414d112344 Calculated By MD5

Externally Defined Format:

Format Name:

zip

Data:

https://pasta-s.lternet.edu/package/data/eml/edi/78/2/d6c5855a62cf32a4dadbc2831f0f295f

Non-Categorized Data Resource

Name:

lts_container

Entity Type:

zipped folder

Description:

Docker container for compute environment. This docker container allows future users to recreate the exact compute environment, where all datasets and quality control products were generated.

Physical Structure Description:

Object Name:

lts_container.zip

Size:

4047 byte

Authentication:

f59648229cf6416dc5f16e235256499c Calculated By MD5

Externally Defined Format:

Format Name:

zip

Data:

https://pasta-s.lternet.edu/package/data/eml/edi/78/2/d798db512884c21859258e8dcf36b98d

Non-Categorized Data Resource

Name:

lake_trophic_status_docker_image

Entity Type:

compressed tape archive

Description:

Pre-built docker image for running LTS-US pipeline in a containerized environment. Please see "README_container.pdf" for instructions on running the pipeline in a Docker container.

Physical Structure Description:

Object Name:

lake_trophic_status_docker_image.tar.gz

Size:

3738785336 byte

Authentication:

c1994ad28775b2d868256a91295b63b7 Calculated By MD5

Externally Defined Format:

Format Name:

tar.gz

Data:

https://pasta-s.lternet.edu/package/data/eml/edi/78/2/ab7b0d7f5f161515d0725aaf034afd94

Data Package Usage Rights

This data package is released to the "public domain" under Creative Commons CC0 1.0 "No Rights Reserved" (see: https://creativecommons.org/publicdomain/zero/1.0/). It is considered professional etiquette to provide attribution of the original work if this data package is shared in whole or by individual components. A generic citation is provided for this data package on the website https://portal.edirepository.org (herein "website") in the summary metadata page. Communication (and collaboration) with the creators of this data package is recommended to prevent duplicate research or publication. This data package (and its components) is made available "as is" and with no warranty of accuracy or fitness for use. The creators of this data package and the website shall not be liable for any damages resulting from misinterpretation or misuse of the data package or its components. Periodic updates of this data package may be available from the website. Thank you.

Keywords

By Thesaurus:
LTER Controlled Vocabulary	lakes, water quality, remote sensing, limnology, ecology
(No thesaurus)	synthesis

Methods and Protocols

These methods, instrumentation and/or protocols apply to all data in this dataset:

Methods and protocols used in the collection of this data package

Description:

The LTS-US dataset is constructed in a three-part pipeline: (1) aggregate training data, (2) create classification models, and (3) apply predictions to lakes outside of the training data. Individual steps within the pipeline are described below.

Step 1: Identify Parent Datasets

United States Environmental Protection Agency National Lakes Assessment

In situ measurements of total phosphorus and true color were compiled from the U.S. National Lakes Assessment (NLA), a synoptic sampling campaign of lakes, ponds, and reservoirs, hereafter collectively referred to as “lakes”, conducted in the contiguous U.S. every five years. Lakes used in this analysis were sampled in the summer (June-September) of 2007 (n = 1028), 2012 (n = 1038), or 2017 (n = 1005). Lakes were selected from the National Hydrography Dataset (NHD, https://www.usgs.gov/national-hydrography/national-hydrography-dataset) using a randomized design stratified on aggregated Omernik level III ecoregion and lake surface area. The minimum surface area for inclusion in the 2007 assessment was 4 ha, but, owing to increasing resolution in the NHD, was reduced to 1 ha for the 2012 and 2017 assessments. Natural lakes and reservoirs were treated equally for site selection purposes. A wide set of measurements was collected at each sampled lake, but we only provide details on the variables used in this analysis. Additional details, protocols, and data are available online (https://www.epa.gov/national-aquatic-resource-surveys/nla).

Total phosphorus and true color were collected and processed in the 2007, 2012, and 2017 field campaigns (USEPA 2007, 2011, 2017a). Water samples were collected from a deep, open water (up to 50 m deep) location in natural lakes and at a midpoint in reservoirs. Water was collected from the photic zone using a vertical, depth integrated sampling device. True color was estimated by visual comparison of filtered water samples to a calibrated glass color disk (USEPA 1987). Total phosphorus concentrations were measured with manual alkaline persulfate digestion, followed by automated colorimetric analysis (ammonium molybdate and antimony potassium tartrate under acidic conditions, with absorbance at 880 nm) using a flow injection analyzer following standard method 4500-P-E (APHA 1999). Detailed descriptions of all water quality analyses are available in the NLA Laboratory Operations Manuals (USEPA 2007, 2012, 2017b).

HydroLAKES

HydroLAKES (Messager et al. 2016) is a global compendium of more than 1.4 million lake and reservoir polygons, each with a surface area of at least 10 ha. For an individual waterbody, HydroLAKES contains its spatial extent and location (using georeferenced polygons), a unique identifier (ranging from 1 to 1,427,688), and its morphological (e.g., area, mean depth, elevation, shoreline length), hydrological (e.g., residence time, discharge, watershed area), and geographical (e.g., name, country, continent) properties. HydroLAKES is a compilation of existing lake databases, with sources from government agencies (e.g., Natural Resources Canada, U.S. Geological Survey, European Environment Agency) and from remote sensing studies (for example, Shuttle Radar Topographic Mission Water Body Data, Global Lakes and Wetlands Database, and Global Reservoir and Dam Database). Most of the lake polygons are sourced from the Shuttle Radar Topographic Mission Water Body Data for regions between 60°S and 60°N (Robinson et al. 2014), supplemented by other datasets for higher latitudes and for underrepresented regions. More detailed information on the creation and validation of the HydroLAKES dataset can be found in Messager et al. (2016).

LimnoSat

The LimnoSat-US (Topp et al. 2020) dataset comprises over 22 million remotely sensed observations of lake surface reflectance from 1984 to 2020. Observations cover over 50,000 lakes greater than 10 ha (Messager et al. 2016) aggregated from Landsat 5, 7, and 8 Collection 1 imagery. Each observation was calculated by taking the median surface reflectance within 120 meters of each lake’s Chebyshev center, defined as the point farthest from shore, which is usually located at the lake’s deepest point (Shen et al. 2015). Extracting reflectance values from the Chebyshev center minimizes signals due to bottom reflectance and adjacent land pixels. For each observation, pixels not classified as high-confidence water were masked using the Dynamic Surface Water Extent algorithm (Jones 2019). Observations were removed if scene cloud cover was greater than 75%; if any snow, ice, cloud, cloud shadow (Foga et al. 2017), or hillshade was detected over the lake’s Chebyshev center; or if there were fewer than eight high-confidence water pixels within the 120 meter buffer of the lake’s Chebyshev center. For certain lakes, these filters lead to extended periods with limited observations. Data in LimnoSat-US are presented in a tabular format, where each row reflects a Landsat overpass for a given waterbody, and columns include median Collection 1 surface reflectance values by band extracted from pixels within 120 m of the Chebyshev center, scene-wide cloud cover, date of imagery acquisition, and number of water pixels within 120 m of the Chebyshev center.

Step 2: Define Lake Trophic State

Many lakes across the United States are experiencing changes in their water clarity: some lakes are getting greener due to eutrophication, others are getting browner from increasing terrestrially derived organic matter, and some are simultaneously ‘greening’ and ‘browning’ (Leech et al. 2018). Given the need to discriminate between lakes that may be browning and/or greening, the Nutrient Color Paradigm (NCP) is a useful tool to assign lake trophic state (LTS) based on a lake’s characteristic color.

The NCP was initially proposed in the early 20th century, emphasizing that both autochthonous and allochthonous processes are important to understanding LTS (Naumann 1917; Thienemann 1921; Järnefelt 1925). Specifically, water color often affects algal biomass and light transparency independent of nutrient availability. Rodhe (1969) first assembled the four quadrants of the NCP, placing autochthony on the horizontal axis and allochthony on the vertical axis. This second dimension distinguishes “oligotrophic” (low nutrient, low color) lakes, “eutrophic” (high nutrient, low color) lakes, “dystrophic” (low nutrient, high color) lakes, and “mixotrophic” (high nutrient, high color) lakes.

While metrics such as Trophic State Index (Carlson 1977) gained popularity for providing instantaneous assessments of a lake’s autotrophic production, Williamson et al. (1999) encouraged a focus on NCP for lake classification given the importance of both nutrients and dissolved organic matter to lake structure and function. The NCP’s implementation is empirically supported by studies such as Webster et al. (2008), where an analysis of ~1,600 temperate lakes in North America demonstrated that within lakes grouped by total phosphorus concentration (i.e., oligotrophic, mesotrophic, or eutrophic), those with ‘browner’ color (indicative of dissolved organic matter) had higher volumetric chlorophyll-a concentrations and shallower Secchi disk depths. A similar pattern was observed by Nürnberg and Shaw (1998), who analyzed 600 lakes spanning latitudes from 39°S to 82°N.

Here, we used the thresholds published in Webster et al. (2008) to classify lakes in the NLA dataset. Lakes were described as oligotrophic or ‘blue’ if total phosphorus concentration was less than 30 μg/L and true color was less than 20 platinum cobalt units (PCU); eutrophic/mixotrophic or ‘green/murky’ if total phosphorus was greater than 30 μg/L, regardless of true color; and dystrophic or ‘brown’ if total phosphorus was less than 30 μg/L and true color was greater than 20 PCU. Thresholds for total phosphorus are based on long established and widely accepted ranges affecting primary productivity (Wetzel 2001). True color thresholds are derived from Nürnberg and Shaw (1998).
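
The threshold rules above can be written as a small decision function. This is a sketch under stated assumptions rather than the authors' code: the function and constant names are hypothetical, and because the text does not say how values exactly equal to a threshold are handled, this version assigns them to the lower class.

```python
TP_THRESHOLD_UG_L = 30.0     # total phosphorus threshold, micrograms per litre
COLOR_THRESHOLD_PCU = 20.0   # true color threshold, platinum cobalt units


def classify_ncp(total_p_ug_l, true_color_pcu):
    """Assign a Nutrient Color Paradigm class from phosphorus and color."""
    # High phosphorus: eutrophic/mixotrophic, regardless of color.
    if total_p_ug_l > TP_THRESHOLD_UG_L:
        return "eutrophic/mixotrophic"
    # Low phosphorus, high color: dystrophic ("brown").
    if true_color_pcu > COLOR_THRESHOLD_PCU:
        return "dystrophic"
    # Low phosphorus, low color: oligotrophic ("blue").
    # Assumption: values exactly at a threshold fall through to this class.
    return "oligotrophic"


for tp, color in [(10, 5), (50, 5), (50, 40), (10, 40)]:
    print(tp, color, classify_ncp(tp, color))
```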

Step 3: Create a training dataset

First, to create a dataset of lakes with in situ LTS measurements, we aggregated all total phosphorus and true color measurements from the US EPA NLA 2007, 2012, and 2017 data. While the NLA includes lakes smaller than 10 ha, we only used lakes of at least 10 ha in area, for consistency with the HydroLAKES database. We then assessed the extent to which seasonally shifting total phosphorus concentrations may alter interpretation of trophic state for a given lake using the subset of lakes that were sampled intra-annually. We calculated the percentage of lakes that transitioned between trophic states within a single year and found that lakes broadly remained in the same NCP trophic state throughout a given summer (85.1% of lakes). Of the lakes that changed trophic state during a sampling season (14.9%), the majority transitioned from oligotrophic (61.5% of changing lakes; 8.7% of all lakes) or dystrophic (15.4% of changing lakes; 2.2% of all lakes) to eutrophic/mixotrophic. Few lakes transitioned from oligotrophic to dystrophic (15.4% of changing lakes; 2.2% of all lakes), and even fewer transitioned to oligotrophic from either dystrophic (3.9% of changing lakes; 0.5% of all lakes) or eutrophic/mixotrophic (3.9% of changing lakes; 0.5% of all lakes). No lakes transitioned from eutrophic/mixotrophic to dystrophic across all three NLA campaigns. Broadly, lakes transitioned between trophic states when they were located near a threshold for trophic state delineation (15-45 μg/L total phosphorus or 11-29 PCU true color). These results mirror those in Leech et al. (2018) and suggest that despite some lakes changing trophic states, the majority of lakes do not transition and those that do transition usually fall along an edge of an NCP-determined trophic state. Thus, for lakes sampled twice in one sampling campaign, we averaged total phosphorus and true color estimates.

Second, to match the in situ trophic states with remotely sensed imagery, we merged the complete 2007, 2012, and 2017 NLA dataset with the LimnoSat-US dataset (Topp et al. 2020), so that each NLA lake-year had corresponding Landsat spectral data. Because the NLA is designed to describe lakes’ summertime conditions, we filtered LimnoSat-US observations for those only occurring in June, July, and August, which we a priori defined as the summertime season for the contiguous U.S.; then, to create a characteristic reflectance for a given lake-year, we computed each lake-year’s median summertime reflectance for red, blue, green, and near-infrared bands. Because LimnoSat-US compiles reflectance values from Landsat 5, 7, and 8, there are differences in the number of images per lake and year. In particular, images from 1984 through 1998 were solely collected from Landsat 5, when lakes averaged 3.04 images per summer (min: 2.43 images; max: 3.64 images). From 1999 through 2012, summertime imagery was gathered from Landsat 5 and 7, when lakes averaged 5.64 images per summer (min: 3.37 images; max: 6.42 images). From 2013 through 2019, summertime imagery was collected from Landsat 7 and 8, when lakes averaged 5.42 images per summer (min: 4.87 images; max: 5.87 images).
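
The summertime filter and per-lake-year median described above might look like the following sketch. The record layout (lake identifier, acquisition date, band reflectances) and all values are hypothetical stand-ins; the actual LimnoSat-US column names are not given here.

```python
from collections import defaultdict
from datetime import date
from statistics import median

BANDS = ("blue", "green", "red", "nir")
SUMMER_MONTHS = {6, 7, 8}  # June, July, August

# Hypothetical overpass records: (Hylak_id, acquisition date, reflectances).
observations = [
    (5, date(2007, 6, 15), {"blue": 0.04, "green": 0.05, "red": 0.03, "nir": 0.02}),
    (5, date(2007, 7, 17), {"blue": 0.06, "green": 0.07, "red": 0.05, "nir": 0.02}),
    (5, date(2007, 8, 2), {"blue": 0.05, "green": 0.09, "red": 0.04, "nir": 0.03}),
    (5, date(2007, 10, 5), {"blue": 0.20, "green": 0.20, "red": 0.20, "nir": 0.20}),
]


def summer_medians(observations):
    """Median reflectance per band for each (lake, year), summer months only."""
    grouped = defaultdict(list)
    for lake_id, acquired, reflectance in observations:
        if acquired.month in SUMMER_MONTHS:
            grouped[(lake_id, acquired.year)].append(reflectance)
    return {
        key: {band: median(r[band] for r in rows) for band in BANDS}
        for key, rows in grouped.items()
    }


medians = summer_medians(observations)
print(medians[(5, 2007)])
```

The October record is dropped by the month filter, so only the three summer overpasses contribute to the 2007 medians.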

Third, to better characterize spectral bands’ relative reflectance, we normalized each lake’s summertime median red, green, blue, and near-infrared band by the sum of all four bands. This normalization allowed us to differentiate lakes by trophic state based on their most prominent reflectances. For example, we anticipated that oligotrophic lakes would be dominated by blue and green reflectances relative to the red and near-infrared bands. In contrast, dystrophic lakes would be dominated by the near-infrared and blue bands relative to green and red bands. These relative reflectances were ultimately intended to discriminate among lakes that were optically similar in the visible spectrum (i.e., oligotrophic and dystrophic lakes). Notably, the decision to use median summertime relative reflectances differed from previous work (e.g., Topp et al. 2020) that focused on the dominant wavelength, which is an aggregation of wavelengths detected in the visible spectrum and has been used to discriminate autotrophic production (i.e., blue vs green lakes), but not dystrophic states. Thus, our methods are better suited to discriminating between oligotrophic and dystrophic lakes, whereas the dominant wavelength approach would consider both of these lake types to be “blue”.
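
As a minimal sketch of this normalization, each band's median is divided by the sum of the four band medians, so the resulting relative reflectances sum to one. The function name and the input values are illustrative only.

```python
def relative_reflectance(band_medians):
    """Divide each band's median reflectance by the sum over all bands."""
    total = sum(band_medians.values())
    if total == 0:
        raise ValueError("band reflectances sum to zero; cannot normalize")
    return {band: value / total for band, value in band_medians.items()}


# Hypothetical summertime medians for one lake-year.
lake_year = {"blue": 0.05, "green": 0.07, "red": 0.04, "nir": 0.02}
rel = relative_reflectance(lake_year)
print(rel)
```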

Step 4: Create Classification Models

To find the best-performing classifier for lakes with unknown LTS, we employed three classification methods to predict trophic state: multinomial logistic regression (Venables and Ripley 2002), extreme gradient boosting regression (Friedman 2001), and a neural network using multilayer perceptrons (Rosenblatt 1958). Logistic regression is a parametric classification method, whereas gradient boosted regression and multilayer perceptrons are machine learning methods. The methods differ in how they make classifications. Using trophic state as a categorical response variable, logistic regression applies a linear regression of log-odds ratios to model the probability of a given trophic state for each lake. In contrast, gradient boosted regression applies decision trees to iteratively improve its predictions. Multilayer perceptrons are a type of feedforward artificial neural network in which a backpropagation algorithm iteratively updates the weights of each neuron by comparing modeled predictions to the training data.

For each modeling method, we used z-scored, relative red, green, blue, and near-infrared reflectances for input data. Model performance and potential for overfitting were assessed using a 90:10 train:test data split with spatial-holdout cross-validation.
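
Z-scoring a relative reflectance can be sketched as below. Two details are assumptions, since the text does not specify them: the mean and standard deviation are estimated from the training values, and the sample standard deviation (n - 1 denominator) is used.

```python
from statistics import mean, stdev


def fit_zscore(values):
    """Estimate the centering and scaling parameters from training values."""
    return mean(values), stdev(values)


def apply_zscore(values, mu, sigma):
    """Center and scale values with previously estimated parameters."""
    return [(v - mu) / sigma for v in values]


# Hypothetical relative blue reflectances for five training lake-years.
train_rel_blue = [0.20, 0.25, 0.30, 0.35, 0.40]
mu, sigma = fit_zscore(train_rel_blue)
z = apply_zscore(train_rel_blue, mu, sigma)
print(z)
```

Reusing mu and sigma from the training data when scaling new lake-years keeps predictions on the same scale the models were fitted on.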

Initial hyperparameters for the gradient boosted regression and multilayer perceptron models were tuned by holding out 20% of each trophic class from the training observations to use for validation and conducting a coarse grid-search across the hyperparameter space. For each combination of hyperparameters, models were trained until validation performance did not increase for 20 consecutive epochs using categorical cross entropy as the objective function. During the multilayer perceptron hyperparameter tuning, we iterated through model fits using all combinations of 5, 10, and 20 hidden units and learning rates of 0.01, 0.001, and 0.0005. Multilayer perceptron hyperparameter tuning metrics were optimal for models with 20 hidden units and a learning rate of 0.001. During the gradient boosted regression hyperparameter tuning, we iterated through model fits using all combinations of maximum tree depths of 2, 3, and 4; subsample and column-sample fractions of 0.5 and 0.8; step sizes of 0.01 and 0.1; and minimum child weights of 1 and 3. Gradient boosted regression hyperparameter tuning metrics were optimal for models with a max depth of 4, subsample of 0.5, column sample of 0.5, step size of 0.01, and minimum child weight of 1. For both multilayer perceptron and gradient boosted regression models, the best-performing hyperparameters were those with the lowest validation loss.
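
The coarse grid search can be illustrated by enumerating every combination of the candidate values listed above and keeping the one with the lowest validation loss. The dictionary keys are hypothetical names, the grids are assumed to be fully crossed (with subsample and column-sample fractions varied independently), and toy_loss is an invented stand-in for training a model and measuring its validation loss.

```python
from itertools import product

# Candidate values taken from the text; key names are illustrative.
mlp_grid = {
    "hidden_units": [5, 10, 20],
    "learning_rate": [0.01, 0.001, 0.0005],
}
xgb_grid = {
    "max_depth": [2, 3, 4],
    "subsample": [0.5, 0.8],
    "column_sample": [0.5, 0.8],
    "step_size": [0.01, 0.1],
    "min_child_weight": [1, 3],
}


def expand_grid(grid):
    """Return one dict per combination of the candidate values."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]


def best_by_validation_loss(candidates, validation_loss):
    """Pick the candidate whose validation loss is lowest."""
    return min(candidates, key=validation_loss)


def toy_loss(params):
    # Invented loss surface for demonstration only; a real search would
    # train a model with early stopping and return its validation loss.
    return abs(params["hidden_units"] - 20) + abs(params["learning_rate"] - 0.001)


mlp_candidates = expand_grid(mlp_grid)
xgb_candidates = expand_grid(xgb_grid)
print(len(mlp_candidates), len(xgb_candidates))
print(best_by_validation_loss(mlp_candidates, toy_loss))
```

Under these assumptions the multilayer perceptron grid has 9 combinations and the gradient boosted grid has 48.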

These hyperparameters were then used in a spatial cross-validation routine (sensu Willard et al. 2021), where a given lake was held out of test data if it was included in the training data. During the spatial cross-validation routine, training data were divided into five folds, such that lakes within each test partition were not present in remaining training partitions (i.e., test metrics represent performance on unseen lakes). Training data within each fold were then partitioned into a 90:10 split with 10% of each trophic class set aside for an inner-loop fold validation. Models within each fold were trained using an early stopping criterion of 20 epochs to avoid overfitting on the training data. This inner-fold validation was additionally used to tune the best number of epochs for the final models. Finally, overall error metrics were calculated based on the mean prediction accuracy of the test partitions withheld from the inner-loop training of each fold. All reported metrics are based on the test partitions from the spatial cross-validation routine while final models were trained on the full dataset using the hyperparameters identified from the grid-search and inner-loop validation routines.
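
The key property of the spatial holdout, that no lake contributes records to both the training and test partitions of a fold, can be sketched as follows. This is an illustration under assumptions: folds are assigned by shuffling lake identifiers, the records are synthetic, and the stratified 90:10 inner split by trophic class is omitted for brevity.

```python
import random


def lake_folds(lake_ids, n_folds=5, seed=0):
    """Assign each unique lake to exactly one fold."""
    unique = sorted(set(lake_ids))
    random.Random(seed).shuffle(unique)
    return {lake: i % n_folds for i, lake in enumerate(unique)}


def split(records, fold_of, test_fold):
    """Split records so that the test fold's lakes never appear in training."""
    train = [r for r in records if fold_of[r[0]] != test_fold]
    test = [r for r in records if fold_of[r[0]] == test_fold]
    return train, test


# Synthetic records: 20 lakes, each observed in three survey years.
records = [(lake, year) for lake in range(1, 21) for year in (2007, 2012, 2017)]
fold_of = lake_folds([r[0] for r in records])
train, test = split(records, fold_of, test_fold=0)
print(len(train), len(test))
```

Grouping by lake rather than by record means all years of a held-out lake land in the test partition together.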

Model diagnostics and performance were calculated using test data, but the final models used to create the final dataset were constructed using all of the data in LimnoSat-US. Once final models were validated for performance, we applied them to make predictions for all of the more than 56,000 lakes in the LimnoSat-US dataset.

Step 5: Assess and Compare Model Performance

To evaluate the final fitted models, we used test data predictions from the spatial-holdout routine to calculate each model’s overall and balanced accuracy, receiver operating characteristic (ROC) curves, and the area under the ROC curve (AUC). Overall accuracy was calculated as the sum of true positives and true negatives divided by the total number of LTS predictions. Balanced accuracy was calculated for a single lake trophic state from that state’s true positive and true negative rates (conventionally, the mean of the two). Whereas overall accuracy can be biased towards more prevalent trophic states (i.e., eutrophic and oligotrophic lakes), balanced accuracy is useful for assessing a model’s capacity to predict rarer trophic states (i.e., dystrophic lakes). As an additional metric of model performance, we calculated the AUC of each model’s ROC curve. The ROC curve plots the rate of correct classification (true positive rate) against the rate of false classification (false positive rate). An AUC of 0.5 indicates that the false positive rate increases 1:1 with the true positive rate, which is the performance expected of a random classifier. AUCs greater than 0.5 imply a model performing better than random, even when the false positive rate is artificially inflated. Thus, comparing overall and balanced accuracy as well as ROC curves and AUCs allowed us to assess how models performed broadly as well as how robustly models predicted trophic state correctly.
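The following Python sketch shows one conventional way to compute these metrics. It assumes that balanced accuracy is the mean of sensitivity and specificity for a single class treated one-vs-rest, and that AUC is computed as the probability that a randomly chosen positive case outscores a randomly chosen negative case. The function names and example labels are hypothetical and are not drawn from the original R pipeline.

```python
def confusion_counts(y_true, y_pred, positive):
    """One-vs-rest counts (TP, FN, FP, TN) for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = len(pairs) - tp - fn - fp
    return tp, fn, fp, tn


def overall_accuracy(y_true, y_pred):
    """Fraction of all predictions that match the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred, positive):
    """Mean of sensitivity and specificity for one class (assumed definition)."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def auc(scores, is_positive):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced example: a model that always predicts the common class.
y_true = ["eutrophic"] * 8 + ["dystrophic"] * 2
y_pred = ["eutrophic"] * 10
print(overall_accuracy(y_true, y_pred))                 # 0.8
print(balanced_accuracy(y_true, y_pred, "dystrophic"))  # 0.5
```

In this toy case the overall accuracy of 0.8 looks respectable even though no dystrophic lake is identified, whereas the balanced accuracy of 0.5 for the dystrophic class exposes the failure on the rare class.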

Beyond model performance, we also evaluated whether model coefficients and variable importance for trophic state discrimination reflected NCP groupings. For increased interpretability across all three models, we employed SHAP (SHapley Additive exPlanation) analysis (Shapley 1953; Štrumbelj and Kononenko 2014; Lundberg and Lee 2017) to better understand individual feature importance and influence in model predictions. SHAP analysis yields insight into the marginal contribution of a given feature (e.g., near-infrared spectra) to model output - in this case trophic state - and helps decode ‘black box’ results. Understanding the relative contribution of individual features in trophic state prediction not only helps explain feature roles in model accuracy and misclassification but also quantitatively connects features, such as remotely sensed data, to the biophysical parameters in which LTS prediction is grounded. SHAP feature contribution was calculated for blue, green, red, and near-infrared Landsat spectra. SHAP feature contribution was scored for oligotrophic, dystrophic, and eutrophic classifications and across each of the three models. This scoring illuminates the relationship between feature values and SHAP contribution for a given trophic state classification and for a given model. Specifically, for classification problems, a positive SHAP value indicates that a given input increased the predicted probability of a given classification, while a negative value indicates that the input decreased it.
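To make the idea of a marginal contribution concrete, the sketch below computes exact Shapley values for a single prediction by enumerating every coalition of the four spectral bands. It is illustrative only: the scoring function is a hypothetical toy, not one of the fitted LTS-US models, and absent features are represented by a baseline value, which is one common convention assumed here. The fastshap package cited in the text uses an approximate method rather than full enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition are replaced by their baseline value
    (an assumed way of representing a feature as 'absent').
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        inputs = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return predict(inputs)

    phi = {}
    for feature in names:
        others = [k for k in names if k != feature]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {feature}) - value(members))
        phi[feature] = total
    return phi


# Hypothetical toy score for one trophic class; not the fitted LTS-US model.
def toy_score(bands):
    return 0.5 + 2.0 * bands["green"] - 1.0 * bands["blue"] + bands["red"] * bands["nir"]


x = {"blue": 0.04, "green": 0.08, "red": 0.03, "nir": 0.02}
baseline = {"blue": 0.05, "green": 0.05, "red": 0.05, "nir": 0.05}
phi = shapley_values(toy_score, x, baseline)
```

The contributions sum to the difference between the score for this observation and the score at the baseline, so a positive value for a band means that band pushed the score for the class upward relative to the baseline, and a negative value means it pushed the score downward.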

Code Availability

All data harmonization, modeling, and validation procedures for the LTS-US dataset were scripted in the R Statistical Environment (R Core Team 2022), using the tidyverse (Wickham et al. 2019), lubridate (Grolemund and Wickham 2011), data.table (Dowle and Srinivasan 2021), sf (Pebesma 2018), keras (Allaire and Chollet 2022), tensorflow (Allaire and Tang 2022), caret (Kuhn 2022), CAST (Meyer et al. 2022), yaml (Garbett et al. 2022), reticulate (Ushey et al. 2022), xgboost (Chen et al. 2022), nnet (Venables and Ripley 2002), viridis (Garnier et al. 2021), trend (Pohlert 2020), multiROC (Wei and Wang 2018), ggpubr (Kassambara 2020), fastshap (Greenwell 2021), maps (Becker et al. 2021), ggtext (Wilke and Wiernik 2022), and ggforce (Pedersen 2022) packages.

To enhance reproducibility, all scripts are designed to work within a single pipeline that uses the targets package (Landau 2021). The targets pipeline is divided into four main components: “1_aggregate”, “2_train”, “3_predict”, and “4_qc”. Each component corresponds to one of the steps presented above and can be customized by future users to fit their specific needs. The associated pipeline setup and user guide can be found on the dataset’s companion Git repository, where the main ReadMe file details directory architecture and how to execute the pipeline.

To ensure reproducibility across operating platforms, all scripts for the pipeline can be executed within a container. Running the pipeline within the container allows users to execute the entire pipeline without the need to make small, yet important, edits to the code, or to configure their own operating environment to conform to the pipeline’s requirements. For example, recent versions of the sf package default to using the s2 spherical geometry engine for geographic coordinates instead of GEOS, which assumes planar coordinates. End users on a system with one version of the sf library might need to adjust the code to use the correct geometry engine, whereas users with another version might be able to run the pipeline without any adjustments. The container crystallizes a known-working set of libraries, both at the system level (e.g., GEOS, GDAL, PROJ) and at the R level (e.g., sf), so that anybody can run the code without reconfiguring their own environment. This also provides future-proofing by ensuring that the inevitable changes to other libraries over time do not lead to errors. To help end users who are less familiar with running containerized code, a tutorial for installing and executing the pipeline within the container is located in the Environmental Data Initiative repository as a compressed entity (see README.pdf).

References

Allaire, J. J., and F. Chollet. 2022. keras: R Interface to “Keras”.

Allaire, J. J., and Y. Tang. 2022. tensorflow: R Interface to “TensorFlow”.

APHA. 1999. Standard Methods for the Examination of Water and Wastewater. American Public Health Association, Washington DC.

Becker, R. A., A. R. Wilks, R. Brownrigg, T. P. Minka, and A. Deckmyn. 2021. maps: Draw Geographical Maps.

Carlson, R. E. 1977. A trophic state index for lakes. Limnol. Oceanogr. 22: 361–369. doi:10.4319/lo.1977.22.2.0361

Chen, T., T. He, M. Benesty, and others. 2022. xgboost: Extreme Gradient Boosting.

Dowle, M., and A. Srinivasan. 2021. data.table: Extension of `data.frame`.

Foga, S., P. L. Scaramuzza, S. Guo, and others. 2017. Cloud detection algorithm comparison and validation for operational Landsat data products. Remote Sens. Environ. 194: 379–390. doi:10.1016/j.rse.2017.03.026

Friedman, J. H. 2001. Greedy function approximation: A gradient boosting machine. Ann. Stat. 29: 1189–1232. doi:10.1214/aos/1013203451

Garbett, S. P., J. Stephens, K. Simonov, and others. 2022. yaml: Methods to Convert R Data to YAML and Back.

Garnier, S., N. Ross, and others. 2021. viridis: Colorblind-Friendly Color Maps for R.

Greenwell, B. 2021. fastshap: Fast Approximate Shapley Values.

Grolemund, G., and H. Wickham. 2011. Dates and Times Made Easy with lubridate. J. Stat. Softw. 40: 1–25.

Jones, J. W. 2019. Improved Automated Detection of Subpixel-Scale Inundation—Revised Dynamic Surface Water Extent (DSWE) Partial Surface Water Tests. Remote Sens. 11: 374. doi:10.3390/rs11040374

Järnefelt, H. 1925. Zur Limnologie einiger Gewässer Finnlands. Soc Zool Bot Fenn. Vanamo 2: 185–352.

Kassambara, A. 2020. ggpubr: “ggplot2” Based Publication Ready Plots.

Kuhn, M. 2022. caret: Classification and Regression Training.

Landau, W. M. 2021. The targets R package: a dynamic Make-like function-oriented pipeline toolkit for reproducibility and high-performance computing. J. Open Source Softw. 6: 2959.

Leech, D. M., A. I. Pollard, S. G. Labou, and S. E. Hampton. 2018. Fewer blue lakes and more murky lakes across the continental U.S.: Implications for planktonic food webs. Limnol. Oceanogr. 63: 2661–2680. doi:10.1002/lno.10967

Lundberg, S. M., and S.-I. Lee. 2017. A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems. Curran Associates, Inc.

Messager, M. L., B. Lehner, G. Grill, I. Nedeva, and O. Schmitt. 2016. Estimating the volume and age of water stored in global lakes using a geo-statistical approach. Nat. Commun. 7: 13603. doi:10.1038/ncomms13603

Meyer, H., C. Milà, and M. Ludwig. 2022. CAST: “caret” Applications for Spatial-Temporal Models.

Naumann, E. 1917. Undersökningar över fytoplankton och under den pelagiska regionen försiggående gyttje- och dybildningar inom vissa syd- och mellansvenska urbergsvatten. K Sv Vetensk Akad Handl 56: 1–165.

Nürnberg, G. K., and M. Shaw. 1998. Productivity of clear and humic lakes: nutrients, phytoplankton, bacteria. Hydrobiologia 382: 97–112. doi:10.1023/A:1003445406964

Pebesma, E. 2018. Simple Features for R: Standardized Support for Spatial Vector Data. R J. 10: 439–446. doi:10.32614/RJ-2018-009

Pedersen, T. L. 2022. ggforce: Accelerating “ggplot2”.

Pohlert, T. 2020. trend: Non-Parametric Trend Tests and Change-Point Detection.

R Core Team. 2022. R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing.

Robinson, N., J. Regetz, and R. P. Guralnick. 2014. EarthEnv-DEM90: A nearly-global, void-free, multi-scale smoothed, 90m digital elevation model from fused ASTER and SRTM data. ISPRS J. Photogramm. Remote Sens. 87: 57–67. doi:10.1016/j.isprsjprs.2013.11.002

Rohde, W. 1969. Crystallization of Eutrophication Concepts in Northern Europe, p. 20256. In Eutrophication: Causes, Consequences, Correctives. National Academies Press.

Rosenblatt, F. 1958. The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 65: 386–408. doi:10.1037/h0042519

Shapley, L. S. 1953. 17. A Value for n-Person Games, p. 307–318. In H.W. Kuhn and A.W. Tucker [eds.], Contributions to the Theory of Games (AM-28), Volume II. Princeton University Press.

Shen, Z., X. Yu, Y. Sheng, J. Li, and J. Luo. 2015. A Fast Algorithm to Estimate the Deepest Points of Lakes for Regional Lake Registration. PLOS ONE 10: e0144700. doi:10.1371/journal.pone.0144700

Štrumbelj, E., and I. Kononenko. 2014. Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst. 41: 647–665. doi:10.1007/s10115-013-0679-x

Thienemann, A. 1921. Seetypen. Naturwissenschaften 9.

Topp, S., T. Pavelsky, X. Yang, J. Gardner, and M. R. V. Ross. 2020. LimnoSat-US: A Remote Sensing Dataset for U.S. Lakes from 1984-2020. doi:10.5281/zenodo.4139695

USEPA. 1987. Handbook of Methods for Acid Deposition Studies: Laboratory Analyses for Surface Water Chemistry, U.S. Environmental Protection Agency, Office of Research and Development.

USEPA. 2007. Survey of the Nation’s Lakes. Field Operations Manual. EPA 841-B-07004. U.S. Environmental Protection Agency.

USEPA. 2011. 2012 National Lakes Assessment. Field Operations Manual. EPA 841-B-11-003. U.S. Environmental Protection Agency.

USEPA. 2012. National Lakes Assessment. Laboratory Operations Manual. EPA-841-B-11-004. U.S. Environmental Protection Agency.

USEPA. 2017a. National Lakes Assessment 2017. Field Operations Manual. EPA 841-B-16-002. U.S. Environmental Protection Agency.

USEPA. 2017b. National Lakes Assessment 2017. Laboratory Operations Manual. V.1.1. EPA 841-B-16-004. U.S. Environmental Protection Agency.

Ushey, K., J. J. Allaire, and Y. Tang. 2022. reticulate: Interface to “Python”.

Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with S, Fourth edition. Springer.

Webster, K. E., P. A. Soranno, K. S. Cheruvelil, and others. 2008. An empirical evaluation of the nutrient-color paradigm for lakes. Limnol. Oceanogr. 53: 1137–1148. doi:10.4319/lo.2008.53.3.1137

Wei, R., and J. Wang. 2018. multiROC: Calculating and Visualizing ROC and PR Curves Across Multi-Class Classifications.

Wetzel, R. G. 2001. Limnology: Lake and River Ecosystems, 3rd edition. Academic Press.

Wickham, H., M. Averick, J. Bryan, and others. 2019. Welcome to the tidyverse. J. Open Source Softw. 4: 1686. doi:10.21105/joss.01686

Wilke, C. O., and B. M. Wiernik. 2022. ggtext: Improved Text Rendering Support for “ggplot2”.

Willard, J. D., J. S. Read, A. P. Appling, S. K. Oliver, X. Jia, and V. Kumar. 2021. Predicting Water Temperature Dynamics of Unmonitored Lakes With Meta-Transfer Learning. Water Resour. Res. 57: e2021WR029579. doi:10.1029/2021WR029579

Williamson, C. E., D. P. Morris, M. L. Pace, and O. G. Olson. 1999. Dissolved organic carbon and nutrients as regulators of lake ecosystems: Resurrection of a more integrated paradigm. Limnol. Oceanogr. 44: 795–803. doi:10.4319/lo.1999.44.3_part_2.0795

People and Organizations

Publishers:

Organization:

Environmental Data Initiative

Email Address:

info@edirepository.org

Web Address:

https://edirepository.org

Id:

https://ror.org/0330j0z60

Creators:

Individual:

Dr Michael F Meyer

Organization:

U.S. Geological Survey

Position:

Research Geographer

Address:

1 Gifford Pinchot Dr,

Madison, Wisconsin 53562 United States

Phone:

3142582927 (voice)

Email Address:

mfmeyer@usgs.gov

Id:

https://orcid.org/0000-0002-8034-9434

Individual:

Dr Simon N Topp

Organization:

U.S. Geological Survey

Position:

Research Physical Scientist

Address:

102 Eugene St,

Carrboro, NC 27510 USA

Email Address:

stopp@usgs.gov

Id:

https://orcid.org/0000-0001-7741-5982

Individual:

Dr Tyler V King

Organization:

U.S. Geological Survey

Position:

Hydrologist

Address:

230 N. Collins Rd,

Boise, Idaho 83702 United States

Email Address:

tvking@usgs.gov

Id:

https://orcid.org/0000-0002-5785-3077

Individual:

Dr Robert Ladwig

Organization:

Center for Limnology

Position:

Post Doctoral Researcher

Address:

680 N Park St,

Madison, WI 53706 USA

Email Address:

rladwig2@wisc.edu

Id:

https://orcid.org/0000-0001-8443-1999

Individual:

Dr Rachel M Pilla

Organization:

Oak Ridge National Lab

Address:

1 Bethel Valley Road,

Oak Ridge, TN 37831 USA

Email Address:

pillarm@ornl.gov

Id:

https://orcid.org/0000-0001-9156-9486

Individual:

Dr Hilary A Dugan

Organization:

Center for Limnology

Position:

Associate Professor

Address:

680 N Park St,

Madison, WI 53706 USA

Email Address:

hdugan@wisc.edu

Id:

https://orcid.org/0000-0003-4674-1149

Individual:

Dr Jack R Eggleston

Organization:

U.S. Geological Survey

Position:

Branch Chief, Hydrologic Remote Sensing Branch

Address:

11649 Leetown Road,

Kearneysville, WV 25430 United States

Email Address:

jegglest@usgs.gov

Id:

https://orcid.org/0000-0001-6633-3041

Individual:

Dr Stephanie E Hampton

Organization:

Carnegie Institution for Science

Position:

Deputy Director

Address:

1200 E. California Blvd.,

Pasadena, CA 91125 USA

Email Address:

shampton@carnegiescience.edu

Id:

https://orcid.org/0000-0003-2389-4249

Individual:

Dr Dina M Leech

Organization:

Longwood University

Position:

Associate Professor

Address:

201 High Street,

Farmville, VA 23909 USA

Email Address:

leechdm@longwood.edu

Id:

https://orcid.org/0000-0002-0674-3433

Individual:

Dr Isabella A Oleksy

Organization:

University of Wyoming

Position:

Post Doctoral Researcher

Address:

1000 E University Ave,

Laramie, WY 82071 USA

Email Address:

bellaoleksy@gmail.com

Id:

https://orcid.org/0000-0003-2572-5457

Individual:

Jesse C Ross

Organization:

U.S. Geological Survey

Address:

Los Angeles, CA 90018 United States

Email Address:

jross@usgs.gov

Id:

https://orcid.org/0000-0002-5422-8284

Individual:

Dr Matthew RV Ross

Organization:

Colorado State University

Address:

Fort Collins, CO 80523 United States

Email Address:

mrvr@colostate.edu

Id:

https://orcid.org/0000-0001-9105-4255

Individual:

Dr R Iestyn Woolway

Organization:

Bangor University

Position:

Assistant Professor

Address:

Askew St,

Menai Bridge, Anglesey LL59 5AB United Kingdom

Email Address:

iestyn.woolway@bangor.ac.uk

Id:

https://orcid.org/0000-0003-0498-7968

Individual:

Dr Xiao Yang

Organization:

Southern Methodist University

Position:

Assistant Professor

Address:

P.O. Box 750395,

Dallas, TX USA

Email Address:

xnayang@smu.edu

Id:

https://orcid.org/0000-0002-0046-832X

Individual:

Matthew R Brousil

Organization:

Colorado State University

Address:

Fort Collins, CO 80523 United States

Email Address:

mbrousil@colostate.edu

Id:

https://orcid.org/0000-0001-8229-9445

Individual:

Dr Kate C Fickas

Organization:

U.S. Geological Survey

Address:

47914 252nd St,

Sioux Falls, SD 57198 United States

Email Address:

kfickas@usgs.gov

Id:

https://orcid.org/0000-0002-6617-2441

Individual:

Dr Julie C Padowski

Organization:

Washington State University

Address:

2001 Grimes Way,

Pullman, WA 93164 United States

Email Address:

julie.padowski@wsu.edu

Id:

https://orcid.org/0000-0003-2337-4243

Individual:

Dr Amina I Pollard

Organization:

U.S. Environmental Protection Agency

Address:

1200 Pennsylvania Ave,

Washington, DC, 20460 United States

Email Address:

pollard.amina@epa.gov

Id:

https://orcid.org/0000-0002-5010-0961

Individual:

Dr Jianning Ren

Organization:

University of Nevada - Reno

Position:

Post Doctoral Researcher

Address:

1664 N. Virginia Street,

Reno, NV 89557 USA

Email Address:

nren@unr.edu

Id:

https://orcid.org/0000-0002-5849-2189

Individual:

Dr Jacob A Zwart

Organization:

U.S. Geological Survey

Position:

Research Data Scientist

Address:

2367 44th Ave,

San Francisco, CA 94116 United States

Email Address:

jzwart@usgs.gov

Id:

https://orcid.org/0000-0002-3870-405X

Contacts:

Individual:

Dr Michael F Meyer

Organization:

U.S. Geological Survey

Position:

Research Geographer

Address:

1 Gifford Pinchot Dr,

Madison, Wisconsin 53726 United States

Phone:

3142582927 (voice)

Email Address:

mfmeyer@usgs.gov

Id:

https://orcid.org/0000-0002-8034-9434

Metadata Providers:

Individual:

Dr Michael F Meyer

Organization:

U.S. Geological Survey

Position:

Research Geographer

Address:

1 Gifford Pinchot Dr,

Madison, Wisconsin 53726 United States

Phone:

3142582927 (voice)

Email Address:

mfmeyer@usgs.gov

Id:

https://orcid.org/0000-0002-8034-9434

Temporal, Geographic and Taxonomic Coverage

Temporal, Geographic and/or Taxonomic information that applies to all data in this dataset:

Time Period

Begin:

1984

End:

2020

Geographic Region:

Description:

Contiguous United States and southern Canada

Bounding Coordinates:

Northern:	49.22	Southern:	24.55
Western:	-125.54	Eastern:	-65.41

Project

Parent Project Information:

Title:

Remote Sensing of Water Quality

Personnel:

Individual:

Dr Michael F Meyer

Organization:

U.S. Geological Survey

Position:

Research Geographer

Address:

1 Gifford Pinchot Dr,

Madison, Wisconsin 53718 United States

Phone:

3142582927 (voice)

Email Address:

mfmeyer@usgs.gov

Id:

https://orcid.org/0000-0002-8034-9434

Role:

Project Lead

Maintenance

Maintenance:

Description:	The dataset is complete and will remain static unless other persons, including co-authors, would like to update it.
Frequency:	notPlanned

Other Metadata

Additional Metadata

additionalMetadata
        |___text '\n    '
        |___element 'metadata'
        |     |___text '\n      '
        |     |___element 'unitList'
        |     |     |___text '\n        '
        |     |     |___element 'unit'
        |     |     |     |  \___attribute 'id' = 'unitless'
        |     |     |     |  \___attribute 'name' = 'unitless'
        |     |     |     |___text '\n          '
        |     |     |     |___element 'description'
        |     |     |     |     |___text 'probability'
        |     |     |     |___text '\n        '
        |     |     |___text '\n        '
        |     |     |___element 'unit'
        |     |     |     |  \___attribute 'id' = 'year'
        |     |     |     |  \___attribute 'name' = 'year'
        |     |     |     |___text '\n          '
        |     |     |     |___element 'description'
        |     |     |     |___text '\n        '
        |     |     |___text '\n      '
        |     |___text '\n    '
        |___text '\n  '

Additional Metadata

additionalMetadata
        |___text '\n    '
        |___element 'metadata'
        |     |___text '\n      '
        |     |___element 'emlEditor'
        |     |        \___attribute 'app' = 'ezEML'
        |     |        \___attribute 'release' = '2023.02.19'
        |     |___text '\n    '
        |___text '\n  '

Copyright 2024 Environmental Data Initiative. This material is based upon work supported by the National Science Foundation under grants #2223103 and #2223104. Any opinions, findings, conclusions, or recommendations expressed in the material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Please contact us with questions, comments, or for technical assistance regarding this web site or the Environmental Data Initiative. Please read our privacy policy to know what information we collect about you and to understand your privacy rights.

EDI is a collaboration between the University of New Mexico and the University of Wisconsin – Madison, Center for Limnology.
