Data Package Metadata View Summary

Baltimore Ecosystem Study: Stream biofilm bacterial community composition

General Information

Data Package:
Local Identifier:	knb-lter-bes.2080.190
Title:	Baltimore Ecosystem Study: Stream biofilm bacterial community composition
Alternate Identifier:	DOI PLACE HOLDER
Abstract:	The Baltimore Ecosystem Study stream biofilm bacterial community composition was obtained from 8 long-term sampling network sites in and near the Gwynns Falls watershed to examine how bacterial communities differ along an urban-rural gradient. Sampling was conducted at the same time as stream chemistry sampling on 18 June 2014 and 21 Oct 2014. Note: biofilm samples were taken about 50 meters east from the Carroll Park monitoring station, just under the I95 highway overpass, due to high water depth, high water flow, and lack of rock substrates for sampling. This dataset presents the number of sequences matching the taxonomic classifications in a reference database of 16S rRNA genes. See the full metadata record for detailed methods.
Publication Date:	2021-04-27

Time Period

Begin:

2014-06-18

End:

2014-10-21

People and Organizations
Contact:	Baltimore Ecosystem Study Information Manager (Cary Institute Of Ecosystem Studies)
Creator:	Lee, Sylvia (Cary Institute Of Ecosystem Studies)
Creator:	Rosi, Emma (Cary Institute Of Ecosystem Studies)
Creator:	Kelly, John

Data Entities
Data Table Name:	BES_stream_bacteria
Description:	BES stream bacteria data table

Detailed Metadata

Data Entities

Data Table


Data:	https://pasta-s.lternet.edu/package/data/eml/knb-lter-bes/2080/190/2d5092ab9be2789f4b80319f1251f996
Name:	BES_stream_bacteria
Description:	BES stream bacteria data table
Number of Records:	1370
Number of Columns:	21

Table Structure

Object Name:

BES_stream_bacteria.csv

Size:

106959 bytes

Authentication:

87f0bfe4fd5e11fc0eecaa3486913d25 Calculated By MD5

Text Format:

Number of Header Lines:

Record Delimiter:

\r\n

Orientation:

column

Simple Delimited:

Field Delimiter:	,
Quote Character:	"
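
As a hedged illustration of the file properties listed above, the following Python sketch checks a downloaded copy of the table against the published size and MD5 checksum, then reads it with the stated field delimiter and quote character. The helper names (md5_of_file, matches_record, read_table) are hypothetical and not part of the data package. The sketch also assumes that the first row of the file holds the column names, which this record does not confirm (the "Number of Header Lines" field is blank).

```python
import csv
import hashlib
import os

# Values copied from the metadata record above.
EXPECTED_MD5 = "87f0bfe4fd5e11fc0eecaa3486913d25"
EXPECTED_SIZE = 106959  # bytes


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path):
    """True if the file has the size and MD5 checksum given in the record."""
    return (os.path.getsize(path) == EXPECTED_SIZE
            and md5_of_file(path) == EXPECTED_MD5)


def read_table(path):
    """Read the CSV into a list of dicts keyed by column name.

    Assumes the first row is a header row (an assumption, see above).
    """
    # newline="" lets the csv module handle the CRLF record delimiter itself.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=",", quotechar='"')
        return list(reader)
```

A typical use would be to call matches_record on the local copy of BES_stream_bacteria.csv before calling read_table, so that a truncated or outdated download is noticed early.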

Table Column Descriptions

Column Name	Definition	Storage Type	Measurement Type	Measurement Values Domain
taxlevel	taxonomic classification level; 1=kingdom, 2=phylum, 3=class, 4=order, 5=family, 6=genus.	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 6
rankID	pedigree of the taxon lineage. For example, 0.1.1.1 and 0.1.1.2 are two taxa in the same phylum (0.1.1) but different classes	string	nominal	Definition: same text as the column definition
taxon	name of the taxon; unclassified=sequence with a 97% match in the reference database could not be found.	string	nominal	Definition: same text as the column definition
daughterlevels	number of children lineages within a taxon. For example, a phylum with 24 daughterlevels has 24 distinct classes identified in the samples	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 37
total	total number of sequences.	float	ratio	Unit: dimensionless; Type: natural; Min: 1; Max: 1113696
POBR1	Pond Branch forested reference site - Forested reference - 18-Jun-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
BARN1	Baisman Run at Ivy Hill Road - Suburban unsewered - 18-Jun-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
GFGL1	Gwynns Falls at Glyndon - Suburban headwaters - 18-Jun-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
GFGB1	Gwynns Falls at Gwynnbrook Avenue (Delight) - Suburban - 18-Jun-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
GFVN1	Gwynns Falls at Villa Nova - Suburban/urban boundary - 18-Jun-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
DRKR1	Dead Run at Krome Avenue - Urban - 18-Jun-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
GFCP1	Gwynns Falls at Carroll Park - Urban - 18-Jun-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
GFGR1	Gwynns Run at Gwynns Falls - Urban - 18-Jun-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
POBR2	Pond Branch forested reference site - Forested reference - 21-Oct-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
BARN2	Baisman Run at Ivy Hill Road - Suburban unsewered - 21-Oct-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
GFGL2	Gwynns Falls at Glyndon - Suburban headwaters - 21-Oct-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
GFGB2	Gwynns Falls at Gwynnbrook Avenue (Delight) - Suburban - 21-Oct-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
GFVN2	Gwynns Falls at Villa Nova - Suburban/urban boundary - 21-Oct-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
DRKR2	Dead Run at Krome Avenue - Urban - 21-Oct-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
GFCP2	Gwynns Falls at Carroll Park - Urban - 21-Oct-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606
GFGR2	Gwynns Run at Gwynns Falls - Urban - 21-Oct-2014	float	ratio	Unit: dimensionless; Type: whole; Min: 0; Max: 69606

Missing Value Code:

Accuracy Report:

Accuracy Assessment:

Coverage:

Methods:
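
The rankID definition above describes a dotted pedigree in which a taxon's identifier extends its parent's identifier by one component (0.1.1.1 and 0.1.1.2 both sit under 0.1.1). A minimal Python sketch of how such identifiers could be navigated follows. The function names are hypothetical, and the reading that the number of components minus one corresponds to the taxlevel value is an assumption inferred from the example in the definition, not something the record states.

```python
def rank_depth(rank_id):
    """Number of levels below the root, e.g. '0.1.1' -> 2."""
    return len(rank_id.split(".")) - 1


def parent_rank_id(rank_id):
    """Return the rankID of the parent lineage, or None for the root."""
    head, sep, _ = rank_id.rpartition(".")
    return head if sep else None


def are_siblings(a, b):
    """True if two distinct rankIDs share the same immediate parent."""
    parent = parent_rank_id(a)
    return a != b and parent is not None and parent == parent_rank_id(b)


def count_children(rank_ids, parent):
    """Count the rankIDs in `rank_ids` whose immediate parent is `parent`.

    Under the assumption above, this should agree with the
    daughterlevels value recorded for the parent taxon.
    """
    return sum(1 for rank_id in rank_ids if parent_rank_id(rank_id) == parent)
```

For instance, are_siblings("0.1.1.1", "0.1.1.2") is true because both identifiers reduce to the parent 0.1.1.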

Data Package Usage Rights

This information is released under the Creative Commons license - Attribution - CC BY (https://creativecommons.org/licenses/by/4.0/). The consumer of these data ("Data User" herein) is required to cite it appropriately in any publication that results from its use. The Data User should realize that these data may be actively used by others for ongoing research and that coordination may be necessary to prevent duplicate publication. The Data User is urged to contact the authors of these data if any questions about methodology or results occur. Where appropriate, the Data User is encouraged to consider collaboration or co-authorship with the authors. The Data User should realize that misinterpretation of data may occur if used out of context of the original study. While substantial efforts are made to ensure the accuracy of data and associated documentation, complete accuracy of data sets cannot be guaranteed. All data are made available "as is." The Data User should be aware, however, that data are updated periodically and it is the responsibility of the Data User to check for new versions of the data. The data authors and the repository where these data were obtained shall not be liable for damages resulting from any use or misinterpretation of the data. Thank you.

Keywords

By Thesaurus:
BES Vocabulary	Baltimore, MD, Maryland, Baltimore Ecosystem Study, BES, LTER, Illumina, sequence, DNA
LTER Controlled Vocabulary	watersheds, urban, streams, bacteria, biodiversity
National Research & Development Taxonomy	Ecology, Ecosystems, & Environment, Environment and People, Urban natural resources management
ISO 19115 Topic Category	biota, environment

Methods and Protocols

These methods, instrumentation and/or protocols apply to all data in this dataset:

Methods and protocols used in the collection of this data package

Description:

Analysis of Stream Biofilm Diversity Part 1 of 3 - Sampling Network

Summary of Section 1: Sampling Network

1.1. Long-term stream monitoring sites along the Gwynns Falls and Oregon Ridge watersheds, excluding McDonogh.

Section 1: Sampling Network

Stream biofilm bacteria were collected from the BES long-term sampling network including 4 longitudinal sites along the Gwynns Falls watershed (excluding the Gwynns Falls tributary at McDonogh School) and 4 sites in smaller watersheds in or near the Gwynns Falls. The two watersheds provide a gradient of land use areas (forest, rural/suburban, urban). For more details on sampling sites, see below and refer to the metadata record for "Baltimore Ecosystem Study stream chemistry and stream flow overview, methods, and procedures."

Longitudinal sites along the Gwynns Falls

1. Boundary Station 1. Gwynns Falls at Glyndon. This site samples drainage from approximately 96 ha of suburban land at the headwaters of our main study stream.

2. Boundary Station 2. Gwynns Falls at Gwynnbrook/Delight. This site samples drainage from approximately 1,000 ha of old and new suburban and suburbanizing land use.

3. Boundary Station 3. Gwynns Falls at Villa Nova. This site samples drainage from approximately 7,400 ha of old and new suburban and suburbanizing land use. Streamflow at this station has been monitored continuously by the USGS since 1957 (with a hiatus from 1988 - 1995). This station is the boundary between the urban and suburban portions of the Gwynns Falls.

4. Boundary Station 4. Gwynns Falls at Route 1/Carroll Park. This site samples drainage from approximately 16,000 ha of mixed suburban and urban watershed. The site has been monitored by USGS since 1994 and represents the boundary condition for the entire Gwynns Falls above the head of tidal influence. This station allows for evaluation of the urban portion of the Gwynns Falls watershed by comparison with the Gwynns Falls at Villa Nova station (station #3 above). Note: biofilm samples were taken about 50 meters east of the Carroll Park monitoring station, just under the I95 highway overpass, due to high water depth, high water flow, and lack of rock substrates for sampling.

Small watersheds

1. Small Watershed 1. Pond Branch. This is a completely forested "reference" 41 ha watershed located in a county park.

2. Small Watershed 3. Dead Run. The site samples high density urban residential land use.

3. Small Watershed 5. Baisman Run at Ivy Hill Road. This is a 381 ha, 80% forested watershed with unsewered residential land use (i.e., septic tank use) in the headwaters.

4. Small Watershed 8. Gwynns Run. This is an urban watershed that is a target of efforts to improve sanitary sewer infrastructure in the City of Baltimore. Samples are taken just above the confluence of this tributary with the main stem of the Gwynns Falls, approximately 200 m above the Carroll Park main stem station. It is a small, but highly contaminated tributary on the main stem.
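
In the data table, each of the eight sites listed above appears as a four-letter code followed by a sampling-event digit (for example GFCP1). The Python sketch below decodes such a column name into site, land-use category, and sampling date. The site names, land-use labels, and dates are transcribed from the column definitions in this record. The function name decode_sample_column and the reading of the trailing digit as an event index (1 = 18 June 2014, 2 = 21 October 2014) are assumptions made for illustration.

```python
from datetime import date

# Site code -> (site name, land-use category), from the column definitions.
SITES = {
    "POBR": ("Pond Branch forested reference site", "Forested reference"),
    "BARN": ("Baisman Run at Ivy Hill Road", "Suburban unsewered"),
    "GFGL": ("Gwynns Falls at Glyndon", "Suburban headwaters"),
    "GFGB": ("Gwynns Falls at Gwynnbrook Avenue (Delight)", "Suburban"),
    "GFVN": ("Gwynns Falls at Villa Nova", "Suburban/urban boundary"),
    "DRKR": ("Dead Run at Krome Avenue", "Urban"),
    "GFCP": ("Gwynns Falls at Carroll Park", "Urban"),
    "GFGR": ("Gwynns Run at Gwynns Falls", "Urban"),
}

# Trailing digit of the column name -> sampling date.
SAMPLING_DATES = {"1": date(2014, 6, 18), "2": date(2014, 10, 21)}


def decode_sample_column(column):
    """Split a sample column name such as 'GFCP2' into its parts."""
    code, event = column[:-1], column[-1:]
    if code not in SITES or event not in SAMPLING_DATES:
        raise ValueError(f"not a sample column: {column!r}")
    name, land_use = SITES[code]
    return {"site": name, "land_use": land_use, "date": SAMPLING_DATES[event]}
```

Non-sample columns such as taxlevel or total raise ValueError, which makes it easy to separate the 16 sample columns from the five descriptive ones.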

Analysis of Stream Biofilm Diversity Part 2 of 3 - Sample Collection and Processing

Summary of Section 2: Sample Collection and Processing

2.1. Collection of Stream Biofilms

2.2. DNA Extraction

Section 2: Sample Collection and Processing

Stream biofilms were sampled seasonally using the following protocol:

1. Rinse all equipment (graduated cylinder, collection bin, scrubbing brushes, squeeze bottle, 60 ml sample bottles) in stream water and collect stream water in squeeze bottle upstream of rinsing area.

2. Collect 5 rocks from sampling site, making sure to collect all biofilm types available (different algal growth forms and colors), and from different locations within the site. In other words, do not collect rocks from only one spot and do not select only rocks with lots of biofilm growth. Select rocks that are at least 2 inches long but not too large (>6 or 7 inches).

3. Place rocks in collection bin and rinse with stream water using the squeeze bottle. Use scrubbing brushes to brush as much biofilm off the rocks as possible, breaking up any large algal filaments. Rinse rocks with additional stream water as needed.

4. Place scrubbed and rinsed rocks into plastic bag. When all rocks are scrubbed, rinse scrubbing brushes and gloves into collection bin.

5. Measure the volume of the slurry in collection bin using a large graduated cylinder. Note the volume in field notes.

6. Pour slurry back into collection bin and subsample using a turkey baster into sampling bottles, making sure to homogenize the slurry each time.

7. Use a pipette to subsample the slurry for bacteria into sterile microcentrifuge tubes.

8. Rinse graduated cylinder, collection bin, and scrubbing brushes in stream water.

9. Put samples on ice until further processing in the laboratory (e.g., chlorophyll a and ash-free dry mass analysis). Freeze bacterial samples at -80 degrees C until future processing for DNA extraction.

10. After subsamples are taken for chlorophyll a and ash-free dry mass analysis, preserve remaining sample in Lugol's Iodine or RNALater for diatom/algal composition analysis.

2.2. DNA Extraction and Sequencing

DNA extraction was performed following instructions in the MoBio Powersoil DNA extraction kit (MoBio Laboratories, Carlsbad, CA).

PCR amplification of partial bacterial 16S rRNA genes was performed using primers 515F and 806R, which amplify the V4 hypervariable region of bacterial and archaeal 16S rRNA genes. Amplicons were sequenced in a 2x250 base pair paired-end format using the Illumina MiSeq platform.

Sequencing of amplicons was performed by DNA Services Facility at the University of Illinois at Chicago (Chicago, IL). Paired reads were assembled and demultiplexed, and any sequences with ambiguities or homopolymers longer than 8 bases were removed from the dataset.
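
The removal rule described above (no ambiguous bases, no homopolymer longer than 8 bases) can be illustrated with a short Python sketch. This is not the code used by the sequencing facility or by MOTHUR; it only shows the logic of the filter. The function names and the choice to treat any character other than A, C, G, or T as ambiguous are assumptions.

```python
import re

# Any character other than an unambiguous DNA base counts as ambiguous here.
AMBIGUOUS = re.compile(r"[^ACGT]")
# A run of one character repeated zero or more additional times.
RUN = re.compile(r"(.)\1*")


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base in `seq`."""
    longest = 0
    for match in RUN.finditer(seq):
        longest = max(longest, len(match.group(0)))
    return longest


def passes_screen(seq, max_ambig=0, max_homop=8):
    """True if `seq` has at most `max_ambig` ambiguous bases and no
    homopolymer longer than `max_homop` bases."""
    seq = seq.upper()
    n_ambig = len(AMBIGUOUS.findall(seq))
    return n_ambig <= max_ambig and longest_homopolymer(seq) <= max_homop
```

With the default thresholds, a read containing a single N or a run of nine identical bases would be rejected, while a run of exactly eight would be kept.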

Analysis of Stream Biofilm Diversity Part 3 of 3 - Sequence Analysis

Summary of Section 3: Sequence Analysis

3.1. mothur

3.2. Taxonomic Classification

3.3. MiSeq Pipeline

Section 3: Sequence Analysis

Sequences were aligned using the SILVA-compatible alignment database available within MOTHUR. Sequences were trimmed to a uniform length of 275 base pairs, and chimeric sequences were removed using UCHIME.

Sequences were processed to obtain the relative abundances of operational taxonomic units (OTUs; based on 97% sequence identity) by using MOTHUR v.1.35.1.

For more details on MOTHUR and analysis of sequences generated using Illumina MiSeq platform, refer to:

http://www.mothur.org/wiki/Main_Page

http://www.mothur.org/wiki/MiSeq_SOP

Sequences were classified using the MOTHUR-formatted version of the 16S rRNA reference RDP training set (v.9), and any unknown (i.e., not identified as bacterial), chloroplast, mitochondrial, archaeal, and eukaryotic sequences were removed. Taxonomic classifications were made to the lowest level possible (down to genus) using the Bayesian classifier method of Wang et al. (2007).

Wang, Q., Garrity, G. M., Tiedje, J. M., and Cole, J. R. (2007). Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and Environmental Microbiology, 73(16): 5261-5267. http://doi.org/10.1128/AEM.00062-07

3.3 MiSeq Pipeline

The following is a list of commands entered into the MOTHUR program to obtain the bacterial taxonomic classifications.

make.contigs(file=BESB.files, processors=8)

summary.seqs(fasta=BESB.trim.contigs.fasta)

trim.seqs(fasta=current, oligos=ASBFIN.oligos, pdiffs=2)

list.seqs(fasta=current)

get.seqs(group=current)

summary.seqs(fasta=BESB.trim.contigs.trim.fasta)

screen.seqs(fasta=current, group=current, maxambig=0, maxlength=275)

summary.seqs(fasta=BESB.trim.contigs.trim.good.fasta)

unique.seqs(fasta=current)

count.seqs(name=current, group=current)

summary.seqs(fasta=BESB.trim.contigs.trim.good.unique.fasta, count=BESB.trim.contigs.trim.good.count_table)

align.seqs(fasta=current, reference=silva.bacteria.fasta)

summary.seqs(fasta=BESB.trim.contigs.trim.good.unique.align, count=current)

screen.seqs(fasta=current, count=current, summary=current, start=13862, end=23444, maxhomop=8)

summary.seqs(count=current)

filter.seqs(fasta=current, vertical=T, trump=.)

summary.seqs(fasta=BESB.trim.contigs.trim.good.unique.good.filter.fasta, count=current)

unique.seqs(fasta=current, count=current)

summary.seqs(fasta=BESB.trim.contigs.trim.good.unique.good.filter.unique.fasta, count=current)

pre.cluster(fasta=current, count=current, diffs=2)

summary.seqs(fasta=BESB.trim.contigs.trim.good.unique.good.filter.unique.precluster.fasta, count=current)

chimera.uchime(fasta=current, count=current, dereplicate=t)

remove.seqs(fasta=current, accnos=current)

summary.seqs(fasta=BESB.trim.contigs.trim.good.unique.good.filter.unique.precluster.pick.fasta, count=current)

classify.seqs(fasta=current, count=current, reference=trainset9_032012.pds.fasta, taxonomy=trainset9_032012.pds.tax, cutoff=80)

remove.lineage(fasta=current, count=current, taxonomy=current, taxon=Chloroplast-Mitochondria-unknown-Archaea-Eukaryota)

summary.seqs(fasta=BESB.trim.contigs.trim.good.unique.good.filter.unique.precluster.pick.pick.fasta, count=current)

system(copy BESB.trim.contigs.trim.good.unique.good.filter.unique.precluster.pick.pick.fasta BESB.final.fasta)

system(copy BESB.contigs.pick.good.groups BESB.final.groups)

system(copy BESB.trim.contigs.trim.good.names BESB.final.names)

system(copy BESB.trim.contigs.trim.good.unique.good.filter.unique.precluster.pick.pds.wang.pick.taxonomy BESB.final.taxonomy)

system(copy BESB.trim.contigs.trim.good.unique.good.filter.unique.precluster.uchime.pick.pick.count_table BESB.final.count_table)

Normalization step: rarefy data down by subsampling to minimum number of sequences from count.groups (in this case, 69606)

count.groups(count=BESB.final.count_table)

sub.sample(fasta=BESB.final.fasta, count=BESB.final.count_table, taxonomy=BESB.final.taxonomy, persample=true, size=69606)

count.groups(count=BESB.final.subsample.count_table)

cluster.split(fasta=BESB.final.subsample.fasta, count=BESB.final.subsample.count_table, taxonomy=BESB.final.subsample.taxonomy, splitmethod=classify, taxlevel=4, cutoff=0.15)

make.shared(list=BESB.final.subsample.an.unique_list.list, count=BESB.final.subsample.count_table, label=0.03)

classify.otu(list=current, count=current, taxonomy=BESB.final.subsample.taxonomy, label=0.03)

phylotype(taxonomy=BESB.final.subsample.taxonomy)

make.shared(list=BESB.final.subsample.tx.list, count=BESB.final.subsample.count_table, label=1)

classify.otu(list=BESB.final.subsample.tx.list, count=BESB.final.subsample.count_table, taxonomy=BESB.final.subsample.taxonomy, label=1)

summary.tax(taxonomy=current, count=current)
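
The normalization step in the command list above rarefies every sample to the size of the smallest one (69606 sequences in this dataset). The Python sketch below illustrates the idea of subsampling without replacement and then converting counts to relative abundances. It is not MOTHUR's sub.sample implementation. The function names and the dictionary layout (sample name mapped to per-taxon counts) are assumptions for illustration.

```python
import random


def rarefy(counts, size, seed=None):
    """Draw `size` sequences without replacement from a dict of taxon counts."""
    total = sum(counts.values())
    if size > total:
        raise ValueError("sample has fewer sequences than the requested size")
    rng = random.Random(seed)
    # Expand the counts into one label per sequence, then draw from the pool.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    result = {taxon: 0 for taxon in counts}
    for taxon in rng.sample(pool, size):
        result[taxon] += 1
    return result


def rarefy_all(samples, seed=None):
    """Rarefy every sample to the total of the smallest sample.

    Returns the common size and a dict of rarefied count dicts.
    """
    size = min(sum(counts.values()) for counts in samples.values())
    rarefied = {name: rarefy(counts, size, seed)
                for name, counts in samples.items()}
    return size, rarefied


def relative_abundance(counts):
    """Convert taxon counts to fractions of the sample total."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}
```

Because every sample ends up with the same total, the per-site counts in the data table can be compared directly, and dividing a count by 69606 gives that taxon's relative abundance in the sample.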

People and Organizations

Publishers:

Organization:

Environmental Data Initiative

Email Address:

info@environmentaldatainitiative.org

Web Address:

https://environmentaldatainitiative.org

Creators:

Individual:

Sylvia Lee

Organization:

Cary Institute Of Ecosystem Studies

Email Address:

lees@caryinstitute.org

Individual:

Emma Rosi

Organization:

Cary Institute Of Ecosystem Studies

Email Address:

rosie@caryinstitute.org

Id:

https://orcid.org/0000-0002-3476-6368

Individual:

John Kelly

Email Address:

jkelly7@luc.edu

Contacts:

Individual:

Baltimore Ecosystem Study Information Manager

Organization:

Cary Institute Of Ecosystem Studies

Email Address:

besim@caryinstitute.org

Id:

https://orcid.org/0000-0002-3476-6368

Temporal, Geographic and Taxonomic Coverage

Temporal, Geographic and/or Taxonomic information that applies to all data in this dataset:

Time Period

Begin:

2014-06-18

End:

2014-10-21

Geographic Region:

Description:

Baltimore County

Bounding Coordinates:

Northern:	39.722	Southern:	39.19
Western:	-76.93	Eastern:	-76.33

Geographic Region:

Description:

Baltimore City

Bounding Coordinates:

Northern:	39.373	Southern:	39.196
Western:	-76.712	Eastern:	-76.528

Project

Parent Project Information:

Title:

Urban LTER: Human Settlements as Ecosystems: Metropolitan Baltimore from 1797-2100 (1997-2005)

Personnel:

Individual:

Steward Pickett

Organization:

Cary Institute of Ecosystem Studies

Email Address:

picketts@caryinstitute.org

Id:

https://orcid.org/0000-0002-1899-976x

Role:

Principal Investigator

Funding:

NSF 9714835

Related Project:

Title:

LTER: Human Settlements as Ecosystems: Metropolitan Baltimore from 1797-2100: Phase II (2005-2011)

Personnel:

Individual:

Steward Pickett

Organization:

Cary Institute of Ecosystem Studies

Email Address:

picketts@caryinstitute.org

Id:

https://orcid.org/0000-0002-1899-976x

Role:

Principal Investigator

Funding:

NSF 423476

Related Project:

Title:

LTER: Baltimore Ecosystem Study Phase III: Adaptive Processes in the Baltimore Socio-Ecological System from the Sanitary to the Sustainable City (2011-2018)

Personnel:

Individual:

Steward Pickett

Organization:

Cary Institute of Ecosystem Studies

Email Address:

picketts@caryinstitute.org

Id:

https://orcid.org/0000-0002-1899-976x

Role:

Principal Investigator

Funding:

NSF 1027188

Related Project:

Title:

LTER: Dynamic heterogeneity: Investigating causes and consequences of ecological change in the Baltimore urban ecosystem (2017-2021)

Personnel:

Individual:

Steward Pickett

Organization:

Cary Institute of Ecosystem Studies

Email Address:

picketts@caryinstitute.org

Id:

https://orcid.org/0000-0002-1899-976x

Role:

Principal Investigator

Funding:

NSF 1637661

Maintenance

Maintenance:

Description:	complete
Frequency:

Other Metadata

