Data Package Metadata

LAGOS-US NETWORKS v1.0: Data module of surface water networks characterizing connections among lakes, streams, and rivers in the conterminous U.S.

General Information
Data Package:
Local Identifier:edi.213.3
Title:LAGOS-US NETWORKS v1.0: Data module of surface water networks characterizing connections among lakes, streams, and rivers in the conterminous U.S.
Alternate Identifier:DOI PLACE HOLDER
Abstract:

Knowing the degree of surface water connectivity among aquatic ecosystems can help scientists better understand and predict the movement of materials and biota across ecosystems. Methods to quantify surface water networks that include lake and stream connections at broad spatial scales are rare because it is difficult to balance accurate estimates of surface water connectivity and computational challenges. The LAGOS-US NETWORKS (NETS) module contains surface connectivity metrics for lake networks across the conterminous United States. We applied a graph theory approach to identify lake networks (i.e. a set of lakes connected by streams either upstream, downstream, or both) created from the medium resolution NHD lakes, streams, and rivers and subsequently derive surface water connectivity metrics for lakes and networks. Using this approach, we created a total of 898 networks that include 86,511 lakes. The NETS module includes a table with metrics for connections between lakes (both upstream and downstream), dams, network position, and whole networks. NETS also includes a flow table and bidirectional and unidirectional distance tables that provide the distances between every pair of connected lakes.

Publication Date:2021-05-12

People and Organizations
Contact:King, Katelyn B.S. (Michigan State University)
Creator:King, Katelyn B.S. (Michigan State University)
Creator:Wang, Qi (Michigan State University)
Creator:Rodriguez, Lauren K (Michigan State University)
Creator:Haite, Maggie (Michigan State University)
Creator:Danila, Laura (Michigan State University)
Creator:Tan, Pang-Ning (Michigan State University)
Creator:Zhou, Jiayu (Michigan State University)
Creator:Cheruvelil, Kendra S (Michigan State University)
Associate:Infante, Dana (Michigan State University, Dam Dataset Provider)
Associate:Cooper, Arthur (Michigan State University, Dam Dataset Provider)
Associate:Hawkins, Arika (Michigan State University, Hourly Assistant For Manual Dam Classification)
Associate:Webster, Katherine E (Michigan State University, QA/QC Support)
Associate:Smith, Nicole (Michigan State University, GIS Support)

Data Entities
Data Table Name:
nets_networkmetrics_medres.csv
Description:
Table of variables describing lake connectivity to other lakes or dams via streams/rivers and network-scale metrics
Data Table Name:
nets_uninetworkdistance_medres.csv
Description:
Table of stream course distance (in kilometers) between every pair of lakes, where stream traversal is in one direction (i.e., distance downstream).
Data Table Name:
nets_binetworkdistance_medres.csv
Description:
Table of stream course distance (in kilometers) between every pair of lakes, regardless of direction (i.e., this distance includes the combination of upstream and downstream courses).
Data Table Name:
nets_flow_medres.csv
Description:
Table of stream and lake identifiers characterizing the downstream flow between surface water bodies.
Other Name:
Edi_metadata_nets.pdf
Description:
Edi_metadata_nets for LAGOS-US_v1.0
Other Name:
NETWORKS_GUIDE_LAGOS_US.pdf
Description:
NETWORKS_GUIDE for LAGOS-US_v1.0
Other Name:
data_dictionary_nets.xlsx
Description:
Provides a definition for each variable name or ‘column’ of every table in the module, and includes other useful information such as units
Other Name:
source_table_nets.xlsx
Description:
Includes a description of the sources used to create NETWORKS
Detailed Metadata

Data Entities


Data Table

Data:https://pasta-s.lternet.edu/package/data/eml/edi/213/3/eb1c4a78df8140b89cbf053b0cef3976
Name:nets_networkmetrics_medres.csv
Description:Table of variables describing lake connectivity to other lakes or dams via streams/rivers and network-scale metrics
Number of Records:86511
Number of Columns:22

Table Structure
Object Name:nets_networkmetrics_medres.csv
Size:10861677 bytes
Authentication:e2be29a4b69b6a3a8553c20aabfc75cd Calculated By MD5
Text Format:
Number of Header Lines:1
Record Delimiter:\n
Orientation:column
Simple Delimited:
Field Delimiter:,
Quote Character:"

Table Column Descriptions
 
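Column Name: lagoslakeid
Definition: Unique lake identifier developed by LAGOS-US
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit dimensionless; Type natural; Min not given; Max 483389
Missing Value Code: NA (missing)

Column Name: lake_nets_upstreamlake_km
Definition: Distance to nearest upstream lake using a unidirectional graph.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit kilometer; Type real; Min not given; Max 220.334
Missing Value Code: NA (missing)

Column Name: lake_nets_downstreamlake_km
Definition: Distance to nearest downstream lake using a unidirectional graph.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit kilometer; Type real; Min not given; Max 2412.315
Missing Value Code: NA (missing)

Column Name: lake_nets_bidirectionallake_km
Definition: Distance to the nearest lake upstream or downstream using bi-directional graph.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit kilometer; Type real; Min not given; Max 284.717
Missing Value Code: NA (missing)

Column Name: lake_nets_upstreamlake_n
Definition: The number of upstream lakes directly connected through streams to a lake.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit number; Type whole; Min not given; Max 7310
Missing Value Code: NA (missing)

Column Name: lake_nets_downstreamlake_n
Definition: The number of lakes directly connected through streams downstream of a lake.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit number; Type whole; Min not given; Max 13
Missing Value Code: NA (missing)

Column Name: lake_nets_lakeorder
Definition: Lake order follows the Strahler stream order of the stream that flows from it (outflowing), where the higher order stream is chosen if more than one outlet occurs (Riera et al. 2000, Martin and Soranno 2006). The exceptions are that headwater lakes are 0 and terminal lakes receive the order of the highest inflowing stream.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit dimensionless; Type whole; Min not given; Max not given
Missing Value Code: NA (missing)

Column Name: lake_nets_lnn
Definition: Lake network number (LNN) is the position of a lake within the network in reference to other lakes. The lake at the top of a network (i.e. no upstream lakes) will be 1, the next lake downstream will be 2, etc. If a lake has more than one lake upstream it will take the higher LNN.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit dimensionless; Type natural; Min not given; Max 50
Missing Value Code: NA (missing)

Column Name: net_id
Definition: The unique identifier assigned by LAGOS-NETS for each network
Storage Type: string
Measurement Type: nominal
Measurement Values Domain: Definition (same as above)
Missing Value Code: (none listed)

Column Name: net_lakes_n
Definition: The total number of lakes in the lake network.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit number; Type natural; Min not given; Max 32811
Missing Value Code: NA (missing)

Column Name: net_averagelakedistance_km
Definition: Average distance between lakes in a network.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit kilometer; Type real; Min 0.015; Max 1652.416
Missing Value Code: NA (missing)

Column Name: net_averagelakearea_ha
Definition: Average lake area in a network.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit hectare; Type real; Min 1.19; Max 47157.153
Missing Value Code: NA (missing)

Column Name: lake_nets_nearestdamdown_km
Definition: Distance to nearest downstream dam.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit kilometer; Type real; Min not given; Max 1703.461
Missing Value Code: NA (missing)

Column Name: lake_nets_nearestdamdown_id
Definition: The NABD dam ID for the nearest downstream dam.
Storage Type: string
Measurement Type: nominal
Measurement Values Domain: Definition (same as above)
Missing Value Code: (none listed)

Column Name: lake_nets_totaldamdown_n
Definition: The total number of downstream dams.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit number; Type whole; Min not given; Max 32
Missing Value Code: NA (missing)

Column Name: lake_nets_nearestdamup_km
Definition: Distance to nearest upstream dam.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit kilometer; Type real; Min not given; Max 269.745
Missing Value Code: NA (missing)

Column Name: lake_nets_nearestdamup_id
Definition: The NABD dam ID for the nearest upstream dam.
Storage Type: string
Measurement Type: nominal
Measurement Values Domain: Definition (same as above)
Missing Value Code: (none listed)

Column Name: lake_nets_totaldamup_n
Definition: The total number of upstream dams.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit number; Type whole; Min not given; Max 6027
Missing Value Code: NA (missing)

Column Name: lake_nets_damonlake_flag
Definition: A value of ‘Y’ indicates that there is at least one dam on this lake. This means that the dam point falls onto one of the artificial flowlines that flows through a lake and is therefore associated with the lake and not a stream reach. An “N” indicates no flag.
Storage Type: string
Measurement Type: nominal
Measurement Values Domain: Enumerated Domain; Code N; Definition no
Missing Value Code: NA (missing)

Column Name: lake_nets_multidam_flag
Definition: A value of ‘Y’ indicates that there are multiple dams on a lake. An “N” indicates no flag.
Storage Type: string
Measurement Type: nominal
Measurement Values Domain: Enumerated Domain; Code N; Definition no
Missing Value Code: NA (missing)

Column Name: net_dams_n
Definition: The number of total dams in a network.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit number; Type whole; Min not given; Max 24986
Missing Value Code: NA (missing)

Column Name: nhdplusv2_comid
Definition: Unique lake identifier from the nhd for the medium resolution NHDplusV2.
Storage Type: string
Measurement Type: nominal
Measurement Values Domain: Definition (same as above)
Missing Value Code: NA (missing)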
Accuracy Report:                                            
Accuracy Assessment:                                            
Coverage:                                            
Methods:                                            
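
The table structure above (one header line, comma field delimiter, double-quote quote character, NA as the missing value code) can be read with Python's standard csv module. The following is a minimal sketch under those stated settings; the function name, the `text_columns` parameter, and the UTF-8 encoding are assumptions, and any value that is neither NA nor numeric in a non-text column would raise an error.

```python
import csv


def read_nets_table(path, text_columns=()):
    """Yield one dict per record of a NETS table.

    'NA' becomes None; every other value is converted to float unless its
    column is listed in text_columns (an assumed, caller-supplied list).
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=",", quotechar='"')
        for row in reader:
            record = {}
            for name, value in row.items():
                if value == "NA":
                    record[name] = None      # documented missing value code
                elif name in text_columns:
                    record[name] = value     # keep identifiers and flags as text
                else:
                    record[name] = float(value)
            yield record
```

Because the function is a generator, it reads one record at a time, which may also suit the much larger distance tables. For nets_networkmetrics_medres.csv, the columns with Storage Type string (net_id, the two dam ID columns, the two flag columns, and nhdplusv2_comid) would be passed as `text_columns`.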

Data Table

Data:https://pasta-s.lternet.edu/package/data/eml/edi/213/3/7ec02af96e1ef066455cf41caebf59d2
Name:nets_uninetworkdistance_medres.csv
Description:Table of stream course distance (in kilometers) between every pair of lakes, where stream traversal is in one direction (i.e., distance downstream).
Number of Records:124251
Number of Columns:3

Table Structure
Object Name:nets_uninetworkdistance_medres.csv
Size:2612858 bytes
Authentication:6b82e46d6728176346b1bbabaf278654 Calculated By MD5
Text Format:
Number of Header Lines:1
Record Delimiter:\n
Orientation:column
Simple Delimited:
Field Delimiter:,
Quote Character:"

Table Column Descriptions
 
Column Name: lagoslakeid
Definition: Identifier of lake 1 (lagoslakeid) that is connected to the lake 2 using a unidirectional graph
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit dimensionless; Type natural; Min not given; Max 483389
Missing Value Code: NA (missing)

Column Name: to_lagoslakeid
Definition: Identifier of lake 2 (lagoslakeid) connected to lake 1 using a unidirectional graph
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit dimensionless; Type natural; Min not given; Max 483334
Missing Value Code: NA (missing)

Column Name: streamlength_down_km
Definition: Distance downstream from lake 1 to lake 2 (as indicated by lagoslakeid) using a unidirectional graph
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit kilometer; Type real; Min not given; Max 3738.513
Missing Value Code: NA (missing)
Accuracy Report:      
Accuracy Assessment:      
Coverage:      
Methods:      

Data Table

Data:https://pasta-s.lternet.edu/package/data/eml/edi/213/3/20d37c3f72ffb78945f1d85b4806f975
Name:nets_binetworkdistance_medres.csv
Description:Table of stream course distance (in kilometers) between every pair of lakes, regardless of direction (i.e., this distance includes the combination of upstream and downstream courses).
Number of Records:39498506
Number of Columns:6

Table Structure
Object Name:nets_binetworkdistance_medres.csv
Size:1924505226 bytes
Authentication:43e4107df7637b2936481f459844bf51 Calculated By MD5
Text Format:
Number of Header Lines:1
Record Delimiter:\n
Orientation:column
Simple Delimited:
Field Delimiter:,
Quote Character:"

Table Column Descriptions
 
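Column Name: V1
Definition: Ordered numbers
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit dimensionless; Type natural; Min not given; Max 39498506
Missing Value Code: NA (missing)

Column Name: lagoslakeid
Definition: Identifier of lake 1 (lagoslakeid) that is connected to lake 2 using a bidirectional graph
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit dimensionless; Type natural; Min not given; Max 483389
Missing Value Code: NA (missing)

Column Name: to_lagoslakeid
Definition: Identifier of lake 2 (lagoslakeid) connected to lake 1 using a bidirectional graph
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit dimensionless; Type natural; Min not given; Max 483334
Missing Value Code: NA (missing)

Column Name: streamlength_total_km
Definition: Total stream distance from lake 1 to lake 2 (as indicated by lagoslakeid) using a bidirectional graph.
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit kilometer; Type real; Min not given; Max 4681.541
Missing Value Code: NA (missing)

Column Name: streamlength_up_km
Definition: Distance upstream from lake 1 to lake 2 (as indicated by lagoslakeid) using a bidirectional graph
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit kilometer; Type real; Min not given; Max 3825.565
Missing Value Code: NA (missing)

Column Name: streamlength_down_km
Definition: Distance downstream from lake 1 to lake 2 (as indicated by lagoslakeid) using a bidirectional graph
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit kilometer; Type real; Min not given; Max 3914.139
Missing Value Code: NA (missing)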
Accuracy Report:            
Accuracy Assessment:            
Coverage:            
Methods:            

Data Table

Data:https://pasta-s.lternet.edu/package/data/eml/edi/213/3/d9cf897fd7461d565a9c506575262fc4
Name:nets_flow_medres.csv
Description:Table of stream and lake identifiers characterizing the downstream flow between surface water bodies.
Number of Records:2722347
Number of Columns:4

Table Structure
Object Name:nets_flow_medres.csv
Size:75786759 bytes
Authentication:a9f9a7cfcc4ec87b908b89bdc69a3499 Calculated By MD5
Text Format:
Number of Header Lines:1
Record Delimiter:\n
Orientation:column
Simple Delimited:
Field Delimiter:,
Quote Character:"

Table Column Descriptions
 
Column Name: from_comid
Definition: Common identifier of the upstream NHDFlowline feature
Storage Type: string
Measurement Type: nominal
Measurement Values Domain: Definition (same as above)
Missing Value Code: (none listed)

Column Name: to_comid
Definition: Common identifier of the downstream NHDFlowline feature
Storage Type: string
Measurement Type: nominal
Measurement Values Domain: Definition (same as above)
Missing Value Code: (none listed)

Column Name: from_lagoslakeid
Definition: Identifier of the upstream lake as indicated by lagoslakeid
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit dimensionless; Type natural; Min not given; Max 483405
Missing Value Code: NA (missing)

Column Name: to_lagoslakeid
Definition: Identifier of the downstream lake as indicated by lagoslakeid
Storage Type: float
Measurement Type: ratio
Measurement Values Domain: Unit dimensionless; Type natural; Min not given; Max 483389
Missing Value Code: NA (missing)
Accuracy Report:        
Accuracy Assessment:        
Coverage:        
Methods:        

Non-Categorized Data Resource

Name:Edi_metadata_nets.pdf
Entity Type:unknown
Description:Edi_metadata_nets for LAGOS-US_v1.0
Physical Structure Description:
Object Name:Edi_metadata_nets.pdf
Size:240402 bytes
Authentication:2c61e0541c60e536b1eb24242dd5cc56 Calculated By MD5
Externally Defined Format:
Format Name:application/pdf
Data:https://pasta-s.lternet.edu/package/data/eml/edi/213/3/2ab244bad0e5c8516d7244ed8b3030f8

Non-Categorized Data Resource

Name:NETWORKS_GUIDE_LAGOS_US.pdf
Entity Type:unknown
Description:NETWORKS_GUIDE for LAGOS-US_v1.0
Physical Structure Description:
Object Name:NETWORKS_GUIDE_LAGOS_US.pdf
Size:1686172 bytes
Authentication:cca2c421eb0d8345230f60675495da7d Calculated By MD5
Externally Defined Format:
Format Name:application/pdf
Data:https://pasta-s.lternet.edu/package/data/eml/edi/213/3/cccece44a557c841c20a12764db746f7

Non-Categorized Data Resource

Name:data_dictionary_nets.xlsx
Entity Type:unknown
Description:Provides a definition for each variable name or ‘column’ of every table in the module, and includes other useful information such as units
Physical Structure Description:
Object Name:data_dictionary_nets.xlsx
Size:79598 bytes
Authentication:bae972d8f5b0ad3e45b485520ecd19ee Calculated By MD5
Externally Defined Format:
Format Name:application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Data:https://pasta-s.lternet.edu/package/data/eml/edi/213/3/c123ad8b5642b9d93589cee314a0d301

Non-Categorized Data Resource

Name:source_table_nets.xlsx
Entity Type:unknown
Description:Includes a description of the sources used to create NETWORKS
Physical Structure Description:
Object Name:source_table_nets.xlsx
Size:10939 bytes
Authentication:ce4606c8f283def8a4b70dac92418c04 Calculated By MD5
Externally Defined Format:
Format Name:application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Data:https://pasta-s.lternet.edu/package/data/eml/edi/213/3/3a5f96d5c0c9f24ecef837438731f68c

Data Package Usage Rights

This information is released under the Creative Commons license - Attribution - CC BY (https://creativecommons.org/licenses/by/4.0/). The consumer of these data ("Data User" herein) is required to cite it appropriately in any publication that results from its use. The Data User should realize that these data may be actively used by others for ongoing research and that coordination may be necessary to prevent duplicate publication. The Data User is urged to contact the authors of these data if any questions about methodology or results occur. Where appropriate, the Data User is encouraged to consider collaboration or co-authorship with the authors. The Data User should realize that misinterpretation of data may occur if used out of context of the original study. While substantial efforts are made to ensure the accuracy of data and associated documentation, complete accuracy of data sets cannot be guaranteed. All data are made available "as is." The Data User should be aware, however, that data are updated periodically and it is the responsibility of the Data User to check for new versions of the data. The data authors and the repository where these data were obtained shall not be liable for damages resulting from any use or misinterpretation of the data. Thank you.

Keywords

By Thesaurus:
LTER Controlled Vocabulary: lakes, streams, freshwater, limnology
(No thesaurus): dams, connectivity, networks, conterminous US, LAGOS

Methods and Protocols

These methods, instrumentation and/or protocols apply to all data in this dataset:

Methods and protocols used in the collection of this data package
Description:

3. Methods for LAGOS-US NETWORKS

This section outlines the methods used for creating the NETWORKS module. We explain how we derived lake networks and their associated connectivity metrics and how NABD dam data were linked to our networks. For further technical detail on this process, or to reproduce this effort, users can consult the published scripts (Wang & King 2020).

3.1. Software used for NETWORKS creation

We used a combination of Python 2.7.8 (Van Rossum & Drake 1998), ArcGIS 10.3 Desktop (ESRI 2014), and R version 3.6 to create LAGOS-US NETWORKS. Most of the methods are implemented in Python scripts, which can be found at http://doi.org/10.5281/zenodo.4383172. Those scripts can be used to understand, reproduce, or adapt our methods. ArcGIS was used for some of the dam classifications, mapping, and verification of some metrics. In R, we used the “nhdR” package (Stachelek 2019) to download NHDPlusV2 data and the “hydrolinks” package (Winslow et al. 2018) to verify metrics.

3.2. Methods for creating the lake networks

Lake networks across the conterminous U.S. were created using the flow table from the NHDPlusV2 database (USGS 2019). This flow table lists every flowline (streams and artificial flowlines that go through lakes; Figure 5) in either the FROM column or the TO column, denoting a direction of flow from one line to the other, as well as the distance for each connection between two flowlines. Prior to creating a graph, we removed coastline connections (Fcode 56600; McKay et al. 2012), as well as IDs associated with the Great Lakes water bodies, so that the connectivity networks would not connect through the ocean, estuaries, or the Great Lakes. Artificial flowlines (Figure 5) were linked to water bodies (nhdplusv2_comid), and these water bodies were linked to lagoslakeids using the lake_link table from the LAGOS-US LOCUS module (Smith et al. 2020). Our modified version of the NHDPlusV2 flow table, in which artificial flowlines are matched to lakes from the LAGOS-US database, is provided as the nets_flow_medres data table.

We applied a graph theory framework to create lake networks from this flow table. Graphs are mathematical structures used to model pairwise relations between objects, or nodes. In our case, we are interested in modeling the pairs of lakes that are connected by streams. Two types of graphs can be used to model connections: unidirectional graphs consider either downstream or upstream connections and bidirectional graphs consider both downstream and upstream connections. We created lake networks using bidirectional graphs with both lakes and streams as nodes (Figure 6). We used Dijkstra's algorithm (Cormen et al. 2001) to traverse the graph both up and downstream starting at a given lake. During the traversal, if a node was a stream, we continued traversing the graph until the node was a lake. We saved the distance from the given lake to this lake and stopped traversing. If there were multiple paths to connect the same two lakes, the algorithm chose and saved the path with the shortest length. This process outputs all the connections of the given lake to its neighbor lakes. This process was repeated for every lake until the connections and stream course distances between all lakes were known.
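
The traversal described above can be sketched as follows. This is a minimal illustration, not the published scripts: it assumes the graph is given as an edge list of (node, node, distance_km) tuples, that lake nodes are supplied as a set, and the names `build_bidirectional` and `neighbor_lakes` as well as the node labels are hypothetical.

```python
import heapq


def build_bidirectional(edges):
    """Build an adjacency dict in which every connection can be walked both ways."""
    graph = {}
    for a, b, km in edges:
        graph.setdefault(a, []).append((b, km))
        graph.setdefault(b, []).append((a, km))
    return graph


def neighbor_lakes(graph, lakes, start):
    """Shortest distance from `start` to each lake reached before any other lake.

    Dijkstra's algorithm expands stream nodes, but stops at every lake it
    reaches, so only neighbouring lakes are returned.
    """
    best = {start: 0.0}
    found = {}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        if node in lakes and node != start:
            found[node] = dist
            continue  # do not traverse past a neighbouring lake
        for neighbor, length in graph.get(node, []):
            candidate = dist + length
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return found


# Hypothetical example: lakes A, B, C joined by stream nodes s1, s2, s3.
edges = [("A", "s1", 1.0), ("s1", "B", 2.0), ("A", "s3", 5.0),
         ("s3", "B", 5.0), ("B", "s2", 1.5), ("s2", "C", 0.5)]
graph = build_bidirectional(edges)
lakes = {"A", "B", "C"}
connections = {lake: neighbor_lakes(graph, lakes, lake) for lake in sorted(lakes)}
```

In this example two paths join A and B, and only the shorter one is kept; lake C is not reported as a neighbour of A because the traversal stops at B. Repeating the call for every lake, as in the last line, yields the full set of lake-to-lake connections.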

All lakes that are connected to another lake, upstream or downstream, are considered part of one network. We assigned each of these networks a unique identification number (net_id). All of the stream course distances between pairs of lakes can be found in the nets_binetworkdistance_medres table. The artificial flowline distances through lakes were not included in these distances. This table includes the upstream, downstream, and total distance between two lakes. The total distance may be smaller than the sum of the upstream and downstream columns because the graph does not have information on where the stream reaches intersect each other. An intersecting stream reach is therefore counted only once in the total distance, but may be included in both the downstream and upstream distance columns (Figure 7).
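
Grouping connected lakes into networks amounts to finding connected components. The sketch below shows one way this could be done from a list of connected lake pairs; the function name and the sequential integer identifiers are assumptions for illustration and do not reproduce the net_id values in the published tables.

```python
from collections import defaultdict, deque


def assign_net_ids(lake_pairs):
    """Map each lake to a network number, given (lake, connected_lake) pairs."""
    adjacency = defaultdict(set)
    for a, b in lake_pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)

    net_of = {}
    next_id = 1
    for lake in sorted(adjacency):
        if lake in net_of:
            continue
        # Breadth-first search labels every lake reachable from this one.
        net_of[lake] = next_id
        queue = deque([lake])
        while queue:
            current = queue.popleft()
            for other in adjacency[current]:
                if other not in net_of:
                    net_of[other] = next_id
                    queue.append(other)
        next_id += 1
    return net_of
```

Direction is ignored here because a lake belongs to a network whether its connection is upstream or downstream.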

3.3. Methods for linking LAGOS-US NETWORKS with NABD dams

The NABD is a dataset of large, anthropogenic barriers that are spatially linked to the NHDPlusV1 data product to facilitate analyses based on the NHD and National Inventory of Dams (NID) (Ostroff et al. 2013). Cooper et al. (2017) augmented this database with 170 additional dams from the USFWS Fish Passage Decision Support Tool and excluded ~250 dams that were identified as having been removed since the NABD was published (Rivers 2019). The dams were linked to the NHDPlusV2 flowlines and were incorporated into networks. Dams were assigned to a lagoslakeid if they were less than 50 m from a lake (Polus et al. in review). Dams that fall directly on a lake could not be considered as upstream or downstream because they were on the node and therefore did not have a direction in reference to that node. These dams were assigned as upstream or downstream from a lake using two methods:

1) Using ArcGIS, lake inlets and outlets were identified using the start and end vertices associated with the artificial flowlines and extracted as points representing inlets and outlets. When multiple artificial paths were present, the uppermost artificial flowline was identified for inlet locations and the downstream-most artificial flowline was identified for outlet locations. For each dam point location, the nearest three inlets or outlets (combined) were identified using Euclidean distance in the ArcGIS GenerateNear tool. If the nearest inlet was less than 250 m away, and no outlets or other lakes were also nearby, the dam was automatically designated as upstream of the associated lake. An equivalent, symmetrical rule was applied for nearby outlets. If both inlets and outlets for the same lake were very near each other, or an inlet or outlet for another lake was very near, the dam position was assigned for manual review. "Very near" was defined as follows: the second closest junction is within 50 m of the closest junction, or the second closest junction is within 100 m of the closest junction and the closest junction is within 25 m of the dam. Methods are available as Python code within the LAGOS GIS Toolbox (http://github.com/cont-limno/LAGOS_GIS_Toolbox; national_outlets_inlets.py, dams_link_lake_junctions.py). There were 11,551 dams that were assigned upstream or downstream of a lake using this method.

2) The remaining dams (n=1,079) that could not be classified by the automated process described in (1) were manually classified by visually comparing each dam location with the NHD polygons and flowlines and assigning the dam to either the upstream or downstream side of a lake.
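
The automated rule in (1) can be illustrated with a short decision function. This is a hedged sketch, not the code in the LAGOS GIS Toolbox: it assumes each candidate junction is a (kind, lake_id, distance_m) tuple, interprets "within 50 m of the closest junction" as a difference in distance to the dam, treats a nearest junction at 250 m or more as a manual-review case, and uses the hypothetical name `classify_dam`.

```python
def classify_dam(junctions, max_auto_m=250.0):
    """Return 'upstream', 'downstream', or 'manual' for one dam.

    junctions: up to three (kind, lake_id, distance_m) tuples, where kind is
    'inlet' or 'outlet' and distance_m is the distance from the dam.
    """
    ranked = sorted(junctions, key=lambda item: item[2])
    if not ranked or ranked[0][2] >= max_auto_m:
        return "manual"  # assumption: nothing close enough to decide automatically

    kind, lake_id, nearest_m = ranked[0]
    if len(ranked) > 1:
        other_kind, other_lake, other_m = ranked[1]
        # A second junction only causes ambiguity if it is of the other kind
        # or belongs to another lake.
        conflicting = other_kind != kind or other_lake != lake_id
        gap = other_m - nearest_m  # assumed meaning of "within X m of the closest"
        very_near = gap <= 50.0 or (gap <= 100.0 and nearest_m <= 25.0)
        if conflicting and very_near:
            return "manual"

    # A dam beside an inlet sits upstream of the lake; beside an outlet, downstream.
    return "upstream" if kind == "inlet" else "downstream"
```

Dams returned as "manual" correspond to the cases handled by visual inspection in (2).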

Two data flags were created during the process of linking dams to lakes and streams/rivers. These flags are for cases when dams fall onto an artificial flowline contained within a lake or when multiple dams fall on the same lake (Figure 8; Table 5).

3.4. Methods for connectivity metrics

After creating the connectivity networks, several metrics were created at the lake scale using a unidirectional graph. Unidirectional graphs consider only downstream connections. For example, in Figure 9 there is a downstream distance between lake A and lake B that is the same distance upstream from lake B to lake A. The connection between lake B and lake C is not included because the unidirectional graph does not traverse both downstream and upstream. We used Dijkstra's algorithm (Cormen et al. 2001) to traverse the graph downstream only, starting at a given lake. During the traversal, if a node was a stream, we continued traversing the graph until the node was a lake. We saved the distance from the given lake to this lake and stopped traversing. If there were multiple paths to connect the same two lakes, the algorithm chose and saved the path with the shortest length. This process outputs all the connections of the given lake to its neighbor lakes. It was repeated for every lake until the connections and stream course distances between all lakes were known. These stream course distances between two lakes using a unidirectional graph can be found in the nets_uninetworkdistance_medres table.

The nearest lake distance was determined by comparing the distance between each lake and all of its neighboring lakes and choosing the nearest distance upstream (Figure 10a) and the nearest distance downstream (Figure 10b) from the unidirectional graph. Note that not all lakes have both an upstream and a downstream lake. The number of directly connected lakes upstream was computed as the indegree of a lake, i.e. the number of upstream lakes connected only through streams flowing into the lake. Similarly, the number of directly connected downstream lakes was calculated using the outdegree of a lake, i.e. the number of lakes directly connected through streams flowing out of the lake. There are instances when a lake does not have any directly connected upstream or downstream lakes because the lake is only connected to the lake network through the bidirectional graph (e.g. Figure 9, lake C; n=7,617). Therefore, we also included the nearest bidirectional distance (Figure 10c). This distance is often the same as the nearest downstream or nearest upstream value; however, it can be different if the nearest lake is connected through a bidirectional graph (Figure 10c).
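
As a rough illustration of these lake-scale metrics, the sketch below derives the nearest upstream and downstream distances and the indegree and outdegree from rows shaped like the nets_uninetworkdistance_medres table. It assumes each row is a direct downstream connection given as (lagoslakeid, to_lagoslakeid, streamlength_down_km); the function name and output keys are hypothetical.

```python
from collections import defaultdict


def lake_metrics(rows):
    """Summarise direct connections per lake from downstream distance rows."""
    metrics = defaultdict(lambda: {"upstream_n": 0, "downstream_n": 0,
                                   "upstream_km": None, "downstream_km": None})
    for lake, to_lake, km in rows:
        source, target = metrics[lake], metrics[to_lake]
        source["downstream_n"] += 1   # outdegree of the upstream lake
        target["upstream_n"] += 1     # indegree of the downstream lake
        if source["downstream_km"] is None or km < source["downstream_km"]:
            source["downstream_km"] = km
        if target["upstream_km"] is None or km < target["upstream_km"]:
            target["upstream_km"] = km
    return dict(metrics)
```

A lake with no upstream connection keeps `None` for its upstream distance, which mirrors the NA values in the published table.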

Two metrics that describe the position of a lake within the network and landscape were derived from the unidirectional graph: Lake Network Number (LNN; Figure 10d) and Lake Order (LO; Figure 10e) (Riera et al. 2000; Martin and Soranno 2006). LNN was computed by starting at the first lake in a network (i.e. a lake with no upstream lakes) and assigning that lake a “1”, then moving downstream in the network to another lake and assigning that lake a “2”, and so on. Therefore, multiple lakes in a network could be assigned a “1” if they did not have any upstream lakes. Lakes with multiple upstream lakes were assigned the larger sequential number (Martin and Soranno 2006). LO was assigned using the Strahler stream order from the NHDPlusV2 attributes. LO follows the Strahler stream order of the outflowing stream, where the higher order stream is chosen if more than one outlet is present (Riera et al. 2000; Martin and Soranno 2006). There were two exceptions to this: headwater lakes were assigned a “0” and terminal lakes received the order of the inflowing stream (Riera et al. 2000; Martin and Soranno 2006). To differentiate between headwater lakes and lakes that had inflowing streams but no upstream lakes, we considered inflowing streams for the LO calculation. There were instances when a loop between two lakes occurred (0.02% of all connections), for example when lake A flows to lake B and lake B flows back to lake A. In these instances, we randomly removed one connection.
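
The LNN rule described above (a lake with no upstream lakes is 1; otherwise one more than the largest LNN among its upstream lakes) can be sketched with a topological pass over the downstream connections. This is an illustrative sketch that assumes loops have already been removed, as described above; the function name and input format are hypothetical.

```python
from collections import defaultdict, deque


def lake_network_numbers(downstream_pairs):
    """Compute LNN from (upstream_lake, downstream_lake) pairs without loops."""
    downstream = defaultdict(list)
    indegree = defaultdict(int)
    lakes = set()
    for up, down in downstream_pairs:
        downstream[up].append(down)
        indegree[down] += 1
        lakes.update((up, down))

    # Lakes without any upstream lake start the numbering at 1.
    lnn = {lake: 1 for lake in lakes if indegree[lake] == 0}
    ready = deque(lnn)
    while ready:
        lake = ready.popleft()
        for below in downstream[lake]:
            # Keep the larger number when several upstream lakes feed one lake.
            lnn[below] = max(lnn.get(below, 1), lnn[lake] + 1)
            indegree[below] -= 1
            if indegree[below] == 0:
                ready.append(below)
    return lnn
```

A lake is only passed on once all of its upstream lakes have been numbered, so the value it hands downstream is already final.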

Several dam metrics were derived to characterize connectivity barriers. The Depth First Search (DFS; Cormen et al. 2001) algorithm was used to traverse the lake-stream network to find all the upstream dams and downstream dams. The DFS algorithm is a common computer science technique that is used for traversing graphs by starting at one node and exploring every branch of the graph. Dijkstra’s algorithm was used to compute the distance to the nearest upstream and downstream dams (Cormen et al. 2001). Because we used a graph to create the network, the algorithm did not have the exact location of the dam on the stream reach, just the flowline it is located on. Therefore, when deriving the metrics for the nearest dam, the entire stream reach that the dam is located on was included in the distance calculation. There were instances when two or more dams fell on the same stream flowline (8.7% occurrence). In these instances, all dams were considered the nearest upstream or downstream dams, they were assigned the same distance from the lake, and all of the dam identifiers were included and separated by a semicolon. Similarly, if multiple dams were on a lake, all the dams were considered the nearest dam, all dam identifiers were included, and dams located on a lake were assigned a distance of 0 km.
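
A depth-first traversal that collects every dam downstream of a lake might look like the following. It is a minimal sketch under assumed inputs: `flow` maps each node to its downstream nodes, `dams_on` maps a node to the dam identifiers located on it, and all names and identifiers are hypothetical.

```python
def downstream_dams(flow, dams_on, start):
    """Return all dam identifiers on nodes reachable downstream of `start`.

    flow: dict mapping node -> list of downstream nodes.
    dams_on: dict mapping node -> list of dam identifiers on that node.
    """
    seen = {start}
    stack = list(flow.get(start, []))
    found = []
    while stack:
        node = stack.pop()          # last in, first out gives depth-first order
        if node in seen:
            continue                # a node can be reached by more than one branch
        seen.add(node)
        found.extend(dams_on.get(node, []))
        stack.extend(flow.get(node, []))
    return sorted(found)


# Hypothetical network: two branches leave s1 and rejoin at s4.
flow = {"lake": ["s1"], "s1": ["s2", "s3"], "s2": ["s4"], "s3": ["s4"], "s4": []}
dams_on = {"s2": ["d1", "d2"], "s4": ["d3"]}
all_dams = downstream_dams(flow, dams_on, "lake")
same_flowline_ids = ";".join(dams_on["s2"])  # several dams on one flowline share one entry
```

The length of `all_dams` corresponds to a total downstream dam count, and running the same function on a reversed `flow` mapping would collect upstream dams. The last line shows the semicolon-separated form used when several dams fall on the same flowline.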

The completed lake networks were traversed using the DFS algorithm to count the total on-network lakes, the average distance between lakes in a network, and the total number of dams in each lake network. The average area of the on-network lakes was calculated using the area from LAGOS-US LOCUS v1 polygons (Smith et al. 2020), grouping lakes by networks, and then using the Calculate Geometry tool in ArcGIS.
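
Once each lake carries a net_id, the network-scale lake count and average lake area reduce to a group-and-summarise step. The sketch below assumes lake areas are already available as numbers in hectares (the published workflow obtained them in ArcGIS); the function name and input tuples are hypothetical, while the output keys reuse the column names net_lakes_n and net_averagelakearea_ha.

```python
from collections import defaultdict


def network_summaries(lakes):
    """Summarise networks from (lagoslakeid, net_id, area_ha) tuples."""
    areas = defaultdict(list)
    for _lake_id, net_id, area_ha in lakes:
        areas[net_id].append(area_ha)
    return {
        net_id: {
            "net_lakes_n": len(values),
            "net_averagelakearea_ha": sum(values) / len(values),
        }
        for net_id, values in areas.items()
    }
```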

Lake networks were created for NETWORKS based on the medium resolution NHDPlusV2 flow data; therefore, connectivity may differ from connectivity metrics in LAGOS-US LOCUS that were created based on the NHD high resolution (Smith et al. 2020). Metrics were only included for lakes connected to other lakes, and therefore do not include isolated lakes or lakes that are only connected to streams.

People and Organizations

Publishers:
Organization:Environmental Data Initiative
Email Address:
info@environmentaldatainitiative.org
Web Address:
https://environmentaldatainitiative.org
Creators:
Individual: Katelyn B.S. King
Organization:Michigan State University
Email Address:
kingka21@msu.edu
Id:https://orcid.org/0000-0001-5471-842X
Individual: Qi Wang
Organization:Michigan State University
Email Address:
wangqi19@msu.edu
Id:https://orcid.org/0000-0002-0713-2677
Individual: Lauren K Rodriguez
Organization:Michigan State University
Email Address:
rodri683@msu.edu
Id:https://orcid.org/0000-0002-9337-6087
Individual: Maggie Haite
Organization:Michigan State University
Email Address:
haitemar@msu.edu
Id:https://orcid.org/0000-0002-7490-8850
Individual: Laura Danila
Organization:Michigan State University
Email Address:
danilala@msu.edu
Id:https://orcid.org/0000-0002-7052-4251
Individual: Pang-Ning Tan
Organization:Michigan State University
Email Address:
ptan@msu.edu
Id:https://orcid.org/0000-0003-3205-0339
Individual: Jiayu Zhou
Organization:Michigan State University
Email Address:
jiayuz@msu.edu
Id:https://orcid.org/0000-0003-4336-6777
Individual: Kendra S Cheruvelil
Organization:Michigan State University
Email Address:
ksc@msu.edu
Id:https://orcid.org/0000-0003-1880-2880
Contacts:
Individual: Katelyn B.S. King
Organization:Michigan State University
Email Address:
kingka21@msu.edu
Id:https://orcid.org/0000-0001-5471-842X
Associated Parties:
Individual: Dana Infante
Organization:Michigan State University
Email Address:
infanted@msu.edu
Role:Dam Dataset Provider
Individual: Arthur Cooper
Organization:Michigan State University
Email Address:
coopera6@msu.edu
Role:Dam Dataset Provider
Individual: Arika Hawkins
Organization:Michigan State University
Email Address:
hawki267@msu.edu
Role:Hourly Assistant For Manual Dam Classification
Individual: Katherine E Webster
Organization:Michigan State University
Email Address:
katherine.e.webster@gmail.com
Role:QA/QC Support
Individual: Nicole Smith
Organization:Michigan State University
Email Address:
nicole.j.smith@gmail.com
Role:GIS Support

Temporal, Geographic and Taxonomic Coverage

Temporal, Geographic and/or Taxonomic information that applies to all data in this dataset:
Geographic Region:
Description:Conterminous U.S. (lower 48 states and the District of Columbia)
Bounding Coordinates:
Northern: 49, Southern: 25
Western: -125, Eastern: -67

Project

Parent Project Information:

Title:A macrosystems ecology framework for continental-scale prediction and understanding of lakes
Personnel:
Individual: Kendra S Cheruvelil
Organization:Michigan State University
Email Address:
ksc@msu.edu
Id:https://orcid.org/0000-0003-1880-2880
Role:Principal Investigator
Funding: US National Science Foundation (NSF) Macrosystems Biology Program grants DEB‐1638679, DEB‐1638550, DEB‐1638539, DEB‐1638554

Maintenance

Maintenance:
Description:completed
Frequency:
Other Metadata

EDI is a collaboration between the University of New Mexico and the University of Wisconsin – Madison, Center for Limnology.