Data Package Metadata

Systematic review and meta-analysis of near-term ecological forecasting literature, including forecast performance data. Search conducted May 2020

General Information
Data Package:
Local Identifier:edi.187.1
Title:Systematic review and meta-analysis of near-term ecological forecasting literature, including forecast performance data. Search conducted May 2020
Alternate Identifier:DOI PLACE HOLDER
Abstract:

This data publication includes results and code from a systematic review and meta-analysis of near-term ecological forecasting literature. The study had two primary goals: (1) analyze the state of near-term ecological forecasting literature, and (2) compare forecast skill across ecosystems and variables. We began by conducting a Web of Science search for “forecast*” in the title, abstract, and keywords of all papers published in ecological journals, then screened all papers from this search to identify near-term ecological forecasts. To more broadly survey the literature, we then searched all papers that cite or are cited by the near-term ecological forecasts we identified. We performed an in-depth review of all near-term ecological forecasting papers identified through this search process, and recorded forecast skill data for all papers that reported R or R2. Our results indicate that the rate of publication of near-term ecological forecasts is increasing over time and the field is becoming increasingly open and automated. Across published forecasts, we find that forecast skill decreases in predictable patterns and these patterns differ between forecast variables. This data publication includes three products from this analysis: (1) a database of all papers identified in the two searches, including our assessment of whether they included an ecological focal variable, included a forecast, and whether the forecast was near-term (≤10 years), (2) a matrix of all data collected on the near-term ecological forecasts we identified, and (3) a database of R2 values for papers that reported R or R2.

Publication Date:2021-03-15

Time Period
Begin:
1932-11-01
End:
2020-05-18

People and Organizations
Contact:Lewis, Abigail S. L. (Virginia Tech)
Creator:Lewis, Abigail S. L. (Virginia Tech)
Creator:Woelmer, Whitney M. (Virginia Tech)
Creator:Wander, Heather L. (Virginia Tech)
Creator:Howard, Dexter W. (Virginia Tech)
Creator:Smith, John (Virginia Tech)
Creator:McClure, Ryan P. (Virginia Tech)
Creator:Lofton, Mary E. (Virginia Tech)
Creator:Hammond, Nicholas W. (Virginia Tech)
Creator:Corrigan, Rachel (Virginia Tech)
Creator:Thomas, R. Quinn (Virginia Tech)
Creator:Carey, Cayelan C. (Virginia Tech)

Data Entities
Data Table Name:
Dataset 1: Abstract review
Description:
Dataset 1: Abstract review
Data Table Name:
Dataset 3: R2 values
Description:
Dataset 3: R2 values
Other Name:
General data analysis
Description:
General data analysis
Other Name:
R2 analysis
Description:
R2 analysis

Detailed Metadata

Data Entities


Data Table

Data:https://pasta-s.lternet.edu/package/data/eml/edi/187/1/842ecb2a5c35c5b1a75227a7ec96be45
Name:Dataset 1: Abstract review
Description:Dataset 1: Abstract review
Number of Records:3183
Number of Columns:8

Table Structure
Object Name:initial_review_all_papers_EDI.csv
Size:663983 bytes
Authentication:4e6a396061a8049c01359536fcd30265 Calculated By MD5
Text Format:
Number of Header Lines:1
Record Delimiter:\r\n
Orientation:column
Simple Delimited:
Field Delimiter:,
Quote Character:"
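
The Size and Authentication fields above can be used to check a downloaded copy of the file. The following is a minimal sketch, assuming the file has already been saved locally; the function names `md5_hex` and `matches_metadata` are hypothetical helpers introduced here for illustration and are not part of the data package.

```python
import hashlib
from pathlib import Path


def md5_hex(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_metadata(path, expected_md5, expected_size):
    """Compare a local file with the Size and Authentication fields."""
    actual_size = Path(path).stat().st_size
    return actual_size == expected_size and md5_hex(path) == expected_md5.lower()
```

Assuming the table was saved under its Object Name in the working directory, a call such as `matches_metadata("initial_review_all_papers_EDI.csv", "4e6a396061a8049c01359536fcd30265", 663983)` would return `True` only when both the byte count and the checksum agree with this record.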

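Table Column Descriptions

Column Name:Title
Definition:Paper title
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Definition: Paper title
Missing Value Code:NA (No data)

Column Name:doi
Definition:doi from Web of Science
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Definition: doi from Web of Science
Missing Value Code:NA (No data)

Column Name:Authors
Definition:Author list
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Definition: Author list
Missing Value Code:NA (No data)

Column Name:Year
Definition:Year of publication
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit: nominalYear; Type: natural; Min: 1932; Max: 2020
Missing Value Code:NA (No data)

Column Name:Round
Definition:We performed our literature search process in two steps. Round 1 = Web of Science search, round 2 = search of citing and cited papers
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit: dimensionless; Type: natural; Min: not specified; Max: not specified
Missing Value Code:NA (No data)

Column Name:NearTerm
Definition:Paper predicts ≤ 10 years into the future from the forecast date. 1 = yes, 0 = no. Marked NA when the paper does not include a forecast
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit: dimensionless; Type: whole; Min: not specified; Max: not specified
Missing Value Code:NA (No data)

Column Name:Ecological
Definition:Focal variable is ecological (organismal or biogeochemical). 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit: dimensionless; Type: whole; Min: not specified; Max: not specified
Missing Value Code:NA (No data)

Column Name:Forecast
Definition:Paper includes at least one prediction of future conditions from the perspective of the model; forecasts could be developed retroactively (i.e., “hindcasts”) but could only use driver data that was available before the forecast date (e.g., forecasted or time-lagged driver variables). 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit: dimensionless; Type: whole; Min: not specified; Max: not specified
Missing Value Code:NA (No data)
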
Accuracy Report:                
Accuracy Assessment:                
Coverage:                
Methods:                
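
The table structure and column descriptions above (comma delimiter, double-quote character, one header line, `NA` as the missing value code, and 1/0 flags) suggest how the file might be read. The sketch below is illustrative only: the helper names are hypothetical, the sample rows are made up and are not records from the dataset, and the filter is simply one plausible reading of the three flag columns.

```python
import csv
import io

FLAG_COLUMNS = ("NearTerm", "Ecological", "Forecast")


def parse_flag(value):
    """Convert a 1/0/NA cell to True/False/None."""
    if value in ("NA", ""):
        return None
    return float(value) == 1.0


def near_term_ecological_forecasts(handle):
    """Yield rows whose three screening flags are all 1."""
    reader = csv.DictReader(handle, delimiter=",", quotechar='"')
    for row in reader:
        if all(parse_flag(row[name]) is True for name in FLAG_COLUMNS):
            yield row


# Made-up rows for illustration only; these are not records from the dataset.
sample = io.StringIO(
    'Title,doi,Authors,Year,Round,NearTerm,Ecological,Forecast\r\n'
    '"Example paper A",NA,"Doe, J.",2015,1,1,1,1\r\n'
    '"Example paper B, with comma",NA,"Roe, R.",2018,2,NA,1,0\r\n'
    '"Example paper C",NA,"Poe, P.",2019,1,0,1,1\r\n',
    newline="",
)
selected = list(near_term_ecological_forecasts(sample))
print([row["Title"] for row in selected])
```

For the real file, the same function could be given a handle opened with `open(path, newline="")`, which leaves the `\r\n` record delimiter for the `csv` module to handle. The file's text encoding is not stated in this record, so it would need to be checked.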

Data Table

Data:https://pasta-s.lternet.edu/package/data/eml/edi/187/1/d69f677e0810818d17d1d4b807c86fdf
Name:Dataset 3: R2 values
Description:Dataset 3: R2 values
Number of Records:975
Number of Columns:10

Table Structure
Object Name:R2_dataset_for_EDI.csv
Size:270204 bytes
Authentication:c30aab747ed1ba08899dfdea79f5dfe9 Calculated By MD5
Text Format:
Number of Header Lines:1
Record Delimiter:\r\n
Orientation:column
Simple Delimited:
Field Delimiter:,
Quote Character:"

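Table Column Descriptions

Column Name:Title
Definition:Paper title
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Definition: Paper title
Missing Value Code:NA (No data)

Column Name:doi
Definition:doi from Web of Science
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Definition: doi from Web of Science
Missing Value Code:NA (No data)

Column Name:Authors
Definition:Author list
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Definition: Author list
Missing Value Code:NA (No data)

Column Name:Year
Definition:Year of publication
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit: nominalYear; Type: natural; Min: 1986; Max: 2019
Missing Value Code:NA (No data)

Column Name:Horiz_days
Definition:Number of days into the future forecasted
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit: dimensionless; Type: real; Min: not specified; Max: 5475
Missing Value Code:NA (No data)

Column Name:Var
Definition:Variable forecasted
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Definition: Variable forecasted
Missing Value Code:NA (No data)

Column Name:Notes
Definition:Notes about the R2 values reported
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Definition: Notes about the R2 values reported
Missing Value Code:NA (No data)

Column Name:R2_value
Definition:Value of R2
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit: dimensionless; Type: real; Min: not specified; Max: not specified
Missing Value Code:NA (No data)

Column Name:Model_group
Definition:Integer values to separate different models within a paper
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit: dimensionless; Type: natural; Min: not specified; Max: 105
Missing Value Code:NA (No data)

Column Name:Site_or_year_group
Definition:Integer values to identify forecasts at distinct sites or for distinct years within one paper
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit: dimensionless; Type: natural; Min: not specified; Max: not specified
Missing Value Code:NA (No data)
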
Accuracy Report:                    
Accuracy Assessment:                    
Coverage:                    
Methods:                    
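
The Model_group and Site_or_year_group columns described above separate the R2 values reported within a single paper. One way they might be used is sketched below: rows are grouped into per-paper, per-variable, per-model, per-site series ordered by forecast horizon. This grouping is an assumption made for illustration and is not necessarily how the authors' R2 analysis script treats the data; `to_float` and `group_series` are hypothetical names, and the sample rows are made up.

```python
import csv
import io
from collections import defaultdict


def to_float(value):
    """Convert a cell to float, treating the NA code as missing."""
    return None if value in ("NA", "") else float(value)


def group_series(handle):
    """Group (horizon, R2) pairs by paper, variable, model, and site/year."""
    series = defaultdict(list)
    for row in csv.DictReader(handle):
        horizon = to_float(row["Horiz_days"])
        r2 = to_float(row["R2_value"])
        if horizon is None or r2 is None:
            continue  # skip rows with a missing horizon or R2 value
        key = (row["Title"], row["Var"], row["Model_group"], row["Site_or_year_group"])
        series[key].append((horizon, r2))
    for points in series.values():
        points.sort()  # order each series by forecast horizon
    return dict(series)


# Made-up rows for illustration only; these are not records from the dataset.
sample = io.StringIO(
    'Title,doi,Authors,Year,Horiz_days,Var,Notes,R2_value,Model_group,Site_or_year_group\r\n'
    '"Example paper",NA,"Doe, J.",2015,7,chlorophyll,NA,0.6,1,1\r\n'
    '"Example paper",NA,"Doe, J.",2015,1,chlorophyll,NA,0.8,1,1\r\n'
    '"Example paper",NA,"Doe, J.",2015,1,chlorophyll,NA,NA,2,1\r\n',
    newline="",
)
grouped = group_series(sample)
print(grouped)
```

In this sketch the group identifiers are kept as the strings read from the file, which is sufficient for use as dictionary keys.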

Non-Categorized Data Resource

Name:General data analysis
Entity Type:unknown
Description:General data analysis
Physical Structure Description:
Object Name:Final import and analysis - for EDI.Rmd
Size:24803 bytes
Authentication:bdf1f0aba50b59e9be9b9b75cd5d6edb Calculated By MD5
Externally Defined Format:
Format Name:text/x-markdown
Data:https://pasta-s.lternet.edu/package/data/eml/edi/187/1/216ee52714803afb8b0e2f3c773d554e

Non-Categorized Data Resource

Name:R2 analysis
Entity Type:unknown
Description:R2 analysis
Physical Structure Description:
Object Name:R2_analysis_EDI.Rmd
Size:9306 bytes
Authentication:5152e4d7ebc85750218d82bd1a9a6b59 Calculated By MD5
Externally Defined Format:
Format Name:text/x-markdown
Data:https://pasta-s.lternet.edu/package/data/eml/edi/187/1/3c29d650c9da6cf17d72a7284c1fb978

Data Package Usage Rights

This information is released under the Creative Commons license - Attribution - CC BY (https://creativecommons.org/licenses/by/4.0/). The consumer of these data ("Data User" herein) is required to cite it appropriately in any publication that results from its use. The Data User should realize that these data may be actively used by others for ongoing research and that coordination may be necessary to prevent duplicate publication. The Data User is urged to contact the authors of these data if any questions about methodology or results occur. Where appropriate, the Data User is encouraged to consider collaboration or co-authorship with the authors. The Data User should realize that misinterpretation of data may occur if used out of context of the original study. While substantial efforts are made to ensure the accuracy of data and associated documentation, complete accuracy of data sets cannot be guaranteed. All data are made available "as is." The Data User should be aware, however, that data are updated periodically and it is the responsibility of the Data User to check for new versions of the data. The data authors and the repository where these data were obtained shall not be liable for damages resulting from any use or misinterpretation of the data. Thank you.

Keywords

By Thesaurus:
(No thesaurus): Near-term ecological forecasts, best practices, forecastability, ecological predictability, forecast uncertainty, forecast skill, open science, null model, decision support, systematic review, meta-analysis

Methods and Protocols

These methods, instrumentation and/or protocols apply to all data in this dataset:

Methods and protocols used in the collection of this data package
Description:

We systematically reviewed literature on near-term ecological forecasting to both determine how best practices have been implemented over time and compare forecastability across scales and variables. First, we used Web of Science searches and abstract review to identify papers that report near-term ecological forecasts (described in Literature search below). Two reviewers then independently read and analyzed each selected paper using a standardized matrix of criteria (Matrix analysis) and we recorded forecast skill when reported (Forecast skill).

Literature search (Dataset 1)

We began by querying Web of Science Core Collection [v.5.34] for “forecast*” in the title, abstract, or keywords of papers published in 301 ecological journals, then manually screened abstracts of all resulting papers. We conducted the Web of Science search on 18 May 2020 and limited the search to articles and proceedings papers (hereafter, ‘papers’) published in English. This yielded 2711 results.

We screened the abstracts of all 2711 papers and selected those that met three criteria: (1) Papers had to include at least one forecast, which we defined as a prediction of future conditions from the perspective of the model; forecasts could be developed retroactively (i.e., “hindcasts”) but could only use driver data that were available before the forecast date (e.g., forecasted or time-lagged driver variables). (2) The forecast had to be near-term, which we defined as predicting ≤ 10 years into the future. (3) The forecast had to be ecological, which we defined as predicting a biogeochemical, population, or community response variable. This definition therefore excludes physical (e.g., streamflow or water temperature) and meteorological forecasts. Forecasts of human disease were only included if there was an animal vector. If the abstract indicated that the paper met all three criteria, it was moved to a second round of screening. Here, a second reviewer read the full paper to ensure that at least one forecast in the paper met all three criteria. Through this screening process, we identified 142 near-term ecological forecasting papers out of the 2711 Web of Science results.

Because ecological forecasts may be published in journals that are not categorized as “ecological” by Web of Science, we then searched all papers that were cited by the near-term ecological forecast papers we identified, as well as all papers that cited these studies. We selected those that were published in English and included “forecast*” in the title, abstract, or keywords, then screened the abstracts to ensure they met our three criteria. Finally, we read the papers themselves for confirmation.

Matrix analysis (Dataset 2)

We analyzed each of the 252 papers selected in our systematic search using a standardized matrix of questions (SI Table 1). This matrix was co-developed over several months of iteration and discussion by all authors within an Ecological Forecasting graduate seminar at Virginia Tech (January–May 2020). The final matrix used for this study included 65 questions about the model, evaluation, cyberinfrastructure, archiving, and decision support (SI Table 1). Throughout the graduate seminar, we read and analyzed 10 papers as a group, ensuring that all reviewers understood how to interpret and answer questions in a consistent manner. Reviewers also screened several papers individually and checked their responses with another reviewer prior to the start of this analysis, helping to ensure consistency between reviewers. For the analysis described in this paper, all 252 papers were read and analyzed independently by two reviewers, and reviewers then compared any differing answers to reach consensus on a final set of responses for each paper.

During the intensive matrix analysis, 74 papers were determined to not meet our criteria of being near-term ecological forecasts, despite having passed the initial rounds of screening. These papers typically used one or more data sources that would not have been available before the forecast date, and they were difficult to identify without reading the entire paper in detail. These papers were excluded from the analysis, leaving 178 papers in the final dataset.
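
The paper counts reported in the screening steps above can be checked with simple arithmetic. In the sketch below, the four input counts are taken from the methods text. The number of papers added by the citing and cited search is not reported directly and is inferred here by subtraction, so it should be read as an assumption.

```python
# Counts stated in the methods text.
web_of_science_results = 2711
round_one_selected = 142
papers_entering_matrix = 252
excluded_during_matrix = 74

# Inferred, not reported: papers contributed by the citing/cited search.
added_in_round_two = papers_entering_matrix - round_one_selected

# Matches the reported size of the final dataset.
final_dataset = papers_entering_matrix - excluded_during_matrix

print(added_in_round_two)  # 110
print(final_dataset)       # 178
```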

Forecast skill (Dataset 3)

We gathered all Pearson’s r and R2 values reported in papers in the dataset. Pearson’s r values were squared to yield R2.
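
The conversion from Pearson's r to R2 described above is a simple squaring. The sketch below shows that step together with a plain-Python Pearson's r for context; the function names and the example numbers are hypothetical and are not taken from the dataset or from the authors' scripts.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def r_to_r2(r):
    """Square a reported Pearson's r to obtain R2."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson's r must lie in [-1, 1]")
    return r * r


# Made-up observed and forecasted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
forecasted = [2.0, 4.0, 6.0, 8.0]
print(r_to_r2(pearson_r(observed, forecasted)))  # 1.0
print(r_to_r2(-0.5))  # 0.25
```

Squaring discards the sign of r, so a negatively correlated forecast and a positively correlated one with the same magnitude yield the same R2.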

People and Organizations

Publishers:
Organization:Environmental Data Initiative
Email Address:
info@environmentaldatainitiative.org
Web Address:
https://environmentaldatainitiative.org
Creators:
Individual: Abigail S. L. Lewis
Organization:Virginia Tech
Email Address:
aslewis@vt.edu
Id:https://orcid.org/0000-0001-9933-4542
Individual: Whitney M. Woelmer
Organization:Virginia Tech
Email Address:
wwoelmer@vt.edu
Id:https://orcid.org/0000-0001-5147-3877
Individual: Heather L. Wander
Organization:Virginia Tech
Email Address:
hwander@vt.edu
Individual: Dexter W. Howard
Organization:Virginia Tech
Email Address:
dwh1998@vt.edu
Individual: John Smith
Organization:Virginia Tech
Email Address:
wsjohn2@vt.edu
Individual: Ryan P. McClure
Organization:Virginia Tech
Email Address:
Ryan333@vt.edu
Id:https://orcid.org/0000-0001-6370-3852
Individual: Mary E. Lofton
Organization:Virginia Tech
Email Address:
melofton@vt.edu
Id:https://orcid.org/0000-0003-3270-1330
Individual: Nicholas W. Hammond
Organization:Virginia Tech
Email Address:
hammondnw@vt.edu
Individual: Rachel Corrigan
Organization:Virginia Tech
Email Address:
rachelc1@vt.edu
Individual: R. Quinn Thomas
Organization:Virginia Tech
Email Address:
rqthomas@vt.edu
Individual: Cayelan C. Carey
Organization:Virginia Tech
Email Address:
cayelan@vt.edu
Id:https://orcid.org/0000-0001-8835-4476
Contacts:
Individual: Abigail S. L. Lewis
Organization:Virginia Tech
Email Address:
aslewis@vt.edu
Id:https://orcid.org/0000-0001-9933-4542

Temporal, Geographic and Taxonomic Coverage

Temporal, Geographic and/or Taxonomic information that applies to all data in this dataset:

Time Period
Begin:
1932-11-01
End:
2020-05-18

Project

Parent Project Information:

Title:Collaborative Research: Consequences of changing oxygen availability for carbon cycling in freshwater ecosystems
Personnel:
Individual: Cayelan C. Carey
Organization:Virginia Tech
Email Address:
cayelan@vt.edu
Id:https://orcid.org/0000-0001-8835-4476
Role:Principal Investigator
Funding: National Science Foundation 1753639
Related Project:
Title:SCC-IRG Track 2: Resilient Water Systems: Integrating Environmental Sensor Networks and Real-Time Forecasting to Adaptively Manage Drinking Water Quality and Build Social Trust
Personnel:
Individual: Cayelan C. Carey
Organization:Virginia Tech
Email Address:
cayelan@vt.edu
Id:https://orcid.org/0000-0001-8835-4476
Role:Principal Investigator
Funding: National Science Foundation 1737424
Related Project:
Title:MSA: Macrosystems EDDIE: An undergraduate training program in macrosystems science and ecological forecasting
Personnel:
Individual: Cayelan C. Carey
Organization:Virginia Tech
Email Address:
cayelan@vt.edu
Id:https://orcid.org/0000-0001-8835-4476
Role:Principal Investigator
Funding: National Science Foundation 1926050
Related Project:
Title:Collaborative Research: CIBR: Cyberinfrastructure Enabling End-to-End Workflows for Aquatic Ecosystem Forecasting
Personnel:
Individual: Cayelan C. Carey
Organization:Virginia Tech
Email Address:
cayelan@vt.edu
Id:https://orcid.org/0000-0001-8835-4476
Role:Principal Investigator
Funding: National Science Foundation 1933016
Related Project:
Title:Collaborative Research: CIBR: Cyberinfrastructure Enabling End-to-End Workflows for Aquatic Ecosystem Forecasting
Personnel:
Individual: Renato Figueiredo
Organization:University of Florida
Email Address:
renato@acis.ufl.edu
Id:https://orcid.org/0000-0001-8835-4476
Role:Principal Investigator
Funding: National Science Foundation 1933102
Related Project:
Title:Graduate Research Fellowship
Personnel:
Individual: Abigail S. L. Lewis
Organization:Virginia Tech
Email Address:
aslewis@vt.edu
Role:Principal Investigator
Funding: National Science Foundation DGE-1651272
Related Project:
Title:Graduate Research Fellowship
Personnel:
Individual: Whitney M. Woelmer
Organization:Virginia Tech
Email Address:
wwoelmer@vt.edu
Role:Principal Investigator
Funding: National Science Foundation DGE-1651272

Maintenance

Maintenance:
Description:complete
Frequency:

Other Metadata

EDI is a collaboration between the University of New Mexico and the University of Wisconsin – Madison, Center for Limnology.