Data Package Metadata

Systematic review of near-term ecological forecasting literature published between 1932 and 2020

General Information
Data Package:
Local Identifier:edi.368.1
Title:Systematic review of near-term ecological forecasting literature published between 1932 and 2020
Alternate Identifier:DOI PLACE HOLDER
Abstract:

This data publication includes results and code from a systematic review of near-term ecological forecasting literature. The study had two primary goals: (1) analyze the state of near-term ecological forecasting literature, and (2) compare forecast skill across ecosystems and variables. We began by conducting a Web of Science search for “forecast*” in the title, abstract, and keywords of all papers published in ecological journals, then screened all papers from this search to identify near-term ecological forecasts. We defined a near-term ecological forecast as a prediction of community, population, or biogeochemical variables ≤ 10 years into the future from the forecast date. To survey the literature more broadly, we then searched all papers that cited or were cited by the near-term ecological forecasts we identified. We performed an in-depth review of all near-term ecological forecasting papers identified through this search process, and recorded forecast skill data for all papers that reported Pearson's r or R2. Our results indicate that the rate of publication of near-term ecological forecasts is increasing over time and that the field is becoming increasingly open and automated. Across published forecasts, we find that forecast skill decreases in predictable patterns and that these patterns differ between forecast variables. This data publication includes three products from this analysis: (1) a database of all papers identified in the two searches, including our assessment of whether they included an ecological focal variable, whether they included a forecast, and whether the forecast was near-term (≤ 10 years), (2) a matrix of all data collected on the near-term ecological forecasts we identified, and (3) a database of R2 values for papers that reported Pearson's r or R2.

Publication Date:2021-07-13

Time Period
Begin:
1932-11-01
End:
2020-05-18

People and Organizations
Contact:Lewis, Abigail S. L. (Virginia Tech)
Creator:Lewis, Abigail S. L. (Virginia Tech)
Creator:Woelmer, Whitney M. (Virginia Tech)
Creator:Wander, Heather L. (Virginia Tech)
Creator:Howard, Dexter W. (Virginia Tech)
Creator:Smith, John W. (Virginia Tech)
Creator:McClure, Ryan P. (Virginia Tech)
Creator:Lofton, Mary E. (Virginia Tech)
Creator:Hammond, Nicholas W. (Virginia Tech)
Creator:Corrigan, Rachel S. (Virginia Tech)
Creator:Thomas, R. Quinn (Virginia Tech)
Creator:Carey, Cayelan C. (Virginia Tech)

Data Entities
Data Table Name:Dataset 1: Abstract review
Description:Dataset 1: Abstract review
Data Table Name:Dataset 2: Matrix review
Description:Dataset 2: Matrix review
Data Table Name:Dataset 3: R2 values
Description:Dataset 3: R2 values
Other Name:General data analysis
Description:General data analysis
Other Name:R2 analysis
Description:R2 analysis
Other Name:Abstract review
Description:Abstract review

Detailed Metadata

Data Entities


Data Table

Data:https://pasta-s.lternet.edu/package/data/eml/edi/368/1/842ecb2a5c35c5b1a75227a7ec96be45
Name:Dataset 1: Abstract review
Description:Dataset 1: Abstract review
Number of Records:3183
Number of Columns:8

Table Structure
Object Name:initial_review_all_papers_EDI.csv
Size:660799 bytes
Authentication:896ef19f8f0cd705ce758fd15dcc676d Calculated By MD5
Text Format:
Number of Header Lines:1
Record Delimiter:\n
Orientation:column
Simple Delimited:
Field Delimiter:,
Quote Character:"
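
The table structure listed here (one header line, comma field delimiter, double-quote character, MD5 authentication value) can be checked and parsed with Python's standard library. The following is a minimal sketch under stated assumptions, not the authors' workflow: the helper names `md5_hex` and `parse_table` are hypothetical, UTF-8 encoding is assumed because the record does not state an encoding, and the sample bytes are made up for illustration rather than taken from the data file.

```python
import csv
import hashlib
import io


def md5_hex(raw: bytes) -> str:
    """Return the MD5 hex digest of the raw file bytes."""
    return hashlib.md5(raw).hexdigest()


def parse_table(raw: bytes) -> list:
    """Parse comma-delimited, double-quoted text with one header line.

    Assumes UTF-8 encoding (not stated in the metadata record).
    """
    text = io.StringIO(raw.decode("utf-8"), newline="")
    reader = csv.DictReader(text, delimiter=",", quotechar='"')
    return list(reader)


# Made-up sample in the documented layout (not real records).
sample = b'Title,Year\n"A title, with a comma",2019\n'
rows = parse_table(sample)
print(rows[0]["Title"], rows[0]["Year"])
print(md5_hex(sample))
```

To verify a downloaded copy, one would compare `md5_hex` of the file's bytes against the Authentication value listed for that object.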

Table Column Descriptions

Column Name:Title
Definition:Paper title
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:doi
Definition:Digital object identifier (doi) from Web of Science
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Authors
Definition:Author list
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Year
Definition:Year of publication
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit nominalYear; Type natural; Min 1932; Max 2020
Missing Value Code:NA (No data)

Column Name:Round
Definition:We performed our literature search process in two steps. Round 1 = Web of Science search, round 2 = search of citing and cited papers
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type natural
Missing Value Code:NA (No data)

Column Name:NearTerm
Definition:Paper predicts less than or equal to 10 years into the future from the forecast date. Marked NA when the paper does not include a forecast. 1 = near term, 0 = not near term
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Ecological
Definition:Focal variable is ecological (organismal or biogeochemical). 1 = ecological, 0 = not ecological
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Forecast
Definition:Paper includes at least one prediction of future conditions from the perspective of the model; forecasts could be developed retroactively (i.e., hindcasts) but could only use driver data that were available before the forecast date (e.g., forecasted or time-lagged driver variables). 1 = a forecast, 0 = not a forecast
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)
Accuracy Report:                
Accuracy Assessment:                
Coverage:                
Methods:                
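
The three screening flags in Dataset 1 (Forecast, Ecological, NearTerm) combine into a single selection rule: a paper counts as a near-term ecological forecast only when all three equal 1. The sketch below illustrates that rule on made-up rows. The function names are hypothetical, and treating an empty string like the documented NA code is an assumption.

```python
def flag(value: str):
    """Map '1'/'0'/'NA' text to True/False/None (None = missing).

    Treating '' as missing is an assumption; the record documents only 'NA'.
    """
    if value in ("NA", ""):
        return None
    return float(value) == 1.0


def is_near_term_ecological_forecast(row: dict) -> bool:
    """True only when all three screening flags are 1."""
    keys = ("Forecast", "Ecological", "NearTerm")
    return all(flag(row[k]) is True for k in keys)


# Made-up rows for illustration (not real records).
rows = [
    {"Title": "A", "Forecast": "1", "Ecological": "1", "NearTerm": "1"},
    {"Title": "B", "Forecast": "0", "Ecological": "1", "NearTerm": "NA"},
    {"Title": "C", "Forecast": "1", "Ecological": "0", "NearTerm": "1"},
]
selected = [r["Title"] for r in rows if is_near_term_ecological_forecast(r)]
print(selected)  # ['A']
```

Row B shows why NA needs explicit handling: NearTerm is marked NA whenever the paper contains no forecast, so a missing value must never be read as 1.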

Data Table

Data:https://pasta-s.lternet.edu/package/data/eml/edi/368/1/300731a015dd3147aec7738a66258bb3
Name:Dataset 2: Matrix review
Description:Dataset 2: Matrix review
Number of Records:178
Number of Columns:58

Table Structure
Object Name:complete_dataset_with_source.csv
Size:96810 bytes
Authentication:ab1dfa3af45e74a017ac2f043e6940f3 Calculated By MD5
Text Format:
Number of Header Lines:1
Record Delimiter:\r\n
Orientation:column
Simple Delimited:
Field Delimiter:,
Quote Character:"

Table Column Descriptions

Column Name:Title
Definition:Paper title
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:doi
Definition:Digital object identifier (doi) from Web of Science
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Authors
Definition:Author list
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Year
Definition:Year of publication
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit nominalYear; Type natural; Min 1932; Max 2020
Missing Value Code:NA (No data)

Column Name:Source
Definition:Journal or conference in which the paper was published
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Spat_scale
Definition:Forecast spatial scale, classified into five categories
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Allowed Values and Definitions
Code:global; Definition:e.g. coral bleaching stress in world oceans
Code:multipoint; Definition:several distinct forecast locations, such as three different lakes
Code:national; Definition:spanning all of one nation, such as nationwide production of an agricultural crop
Code:point; Definition:localized to one discrete site, such as pollen forecasts for a city or algal forecasts for a lake
Code:regional; Definition:localized to a broad geographic region, such as coral bleaching forecasts that span a sea
Missing Value Code:NA (No data)

Column Name:Coords
Definition:Geographic coordinates of the forecast site using decimal degrees. For papers with multiple locations, locations are separated using a semicolon. Locations for regional and national forecasts are approximately the center of the forecast area
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Ecosystem
Definition:Forecast ecosystem: forest, grassland, freshwater, marine, desert, tundra, atmosphere, agricultural, urban, global, other
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Allowed Values and Definitions
Code:agricultural; Definition:Agricultural forecast ecosystem
Code:atmosphere; Definition:Atmospheric forecast ecosystem
Code:desert; Definition:Desert forecast ecosystem
Code:forest; Definition:Forest forecast ecosystem
Code:freshwater; Definition:Freshwater forecast ecosystem
Code:grassland; Definition:Grassland forecast ecosystem
Code:marine; Definition:Marine forecast ecosystem
Code:other; Definition:Forecast ecosystem does not fit within the other predefined ecosystem types (e.g. a forecast for bird migration across North America)
Code:tundra; Definition:Tundra forecast ecosystem
Code:urban; Definition:Urban forecast ecosystem
Missing Value Code:NA (No data)

Column Name:Class
Definition:Forecast class: biogeochemical or organismal (which includes population or community)
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Allowed Values and Definitions
Code:biogeochemical; Definition:Forecast variable is biogeochemical, but not organismal
Code:both; Definition:both organismal and biogeochemical forecasts presented in text
Code:organismal; Definition:forecast variable relates to a population or community of organisms
Missing Value Code:NA (No data)

Column Name:Vars_ident
Definition:Identity of forecast variables
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Model_dim
Definition:Model dimension: 0D, 1D, 2D, 3D
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Allowed Values and Definitions
Code:0D; Definition:Zero dimensional model
Code:1D; Definition:One dimensional model
Code:2D; Definition:Two dimensional model
Code:3D; Definition:Three dimensional model
Missing Value Code:NA (No data)

Column Name:Model_type
Definition:Model type: empirical (dependent on correlative or statistical relationships) or process-based (explicitly simulating ecological processes). For forecasting workflows that involve a pipeline of multiple models, this refers to the final model that forecasts the forecast variable of interest
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Allowed Values and Definitions
Code:both; Definition:Both empirical and process-based models are used
Code:empirical; Definition:Final model used to generate forecasts is empirical
Code:process-based; Definition:Final model used to generate forecasts is process-based
Missing Value Code:NA (No data)

Column Name:Model_desc
Definition:If specified: more detailed description of model: for example, Bayesian hierarchical, machine learning, named model (e.g., PROTECH), etc.
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Met_covar_yn
Definition:Are meteorological covariates used in this forecast? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Phys_covar_yn
Definition:Are physical covariates (e.g., streamflow) used in this forecast? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Bio_covar_yn
Definition:Are biological covariates used in this forecast? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Chem_covar_yn
Definition:Are chemical covariates used in this forecast? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Ens_within_model_yn
Definition:Does the paper include an ensemble forecast (ensemble within model)? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Ens_members_n
Definition:Number of ensemble members
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Ens_of_models_yn
Definition:Does the paper use an ensemble of models to produce one output? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Models_n
Definition:How many models in the ensemble model
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type natural; Max 10
Missing Value Code:NA (No data)

Column Name:Multiple_approaches_yn
Definition:Are multiple models with different model structures compared (NOT including null models)? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Approach_n
Definition:How many models with different structures are compared?
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole; Max 49
Missing Value Code:NA (No data)

Column Name:Null_yn
Definition:Was a forecast null model (persistence or climatology) included? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Null_n
Definition:How many null models?
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Null_type
Definition:What type of null model (climatology or persistence)?
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Horiz_days
Definition:Maximum time into the future that the forecast predicts in this paper, described in days
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Time_step_days
Definition:Time step of forecast output. For example, a forecast that gives predictions for the next 16 days but was only run once a week would have a time step of one day (not one week)
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Iterative_yn
Definition:Are the forecasts described in the papers iterative (i.e., data updating forecasts iteratively)? Any form of iteration counts here: updating initial conditions with new data, refitting the model to incorporate new data, updating parameter values, etc. State updating via the autoregressive term counts as data assimilation for autoregressive models
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:DA_type
Definition:What technique of data assimilation was used? For example, KF, enKF, refit, update IC, etc.
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Uncert_category
Definition:Extent to which uncertainty is included in the forecast, classified within 5 categories: no (this model does not contain uncertainty), contains (the model contains uncertainty, but uncertainty is not derived from data; e.g. uncertainty comes from spin-up initial conditions or hand-tuned parameters), data_driven (the model contains data-driven uncertainty; e.g. uncertainty in meteorological drivers), propagates (the model propagates some source of uncertainty), assimilates (the model iteratively updates uncertainty through data assimilation). NOTE: this is assumed to be a hierarchy (e.g. if the forecast contains data driven uncertainty and propagates that uncertainty, it would be marked propagates)
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Allowed Values and Definitions
Code:assimilates; Definition:the model iteratively updates uncertainty through data assimilation
Code:contains; Definition:the model contains uncertainty, but uncertainty is not derived from data; e.g. uncertainty comes from spin-up initial conditions or hand-tuned parameters
Code:data_driven; Definition:the model contains data-driven uncertainty; e.g. uncertainty in meteorological drivers
Code:no; Definition:this model does not contain uncertainty
Code:propagates; Definition:the model propagates some source of uncertainty
Missing Value Code:NA (No data)

Column Name:Uncert_source
Definition:What sources of uncertainty were incorporated?
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Uncert_obs_yn
Definition:Was observation uncertainty included? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Uncert_partition_yn
Definition:Are at least two different sources of uncertainty quantified and compared? 1 = yes, 0 = no. NOTE: the two sources may be in the same category of uncertainty (e.g. two forms of driver data)
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Uncert_ic_yn
Definition:Initial condition uncertainty partitioned? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Uncert_driver_yn
Definition:Driver uncertainty partitioned? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Uncert_param_yn
Definition:Parameter uncertainty partitioned? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Uncert_process_yn
Definition:Process uncertainty partitioned? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Uncert_other_yn
Definition:Other partitioned sources of uncertainty? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Uncert_dom
Definition:If at least two categories of uncertainty were partitioned, what was the dominant source of uncertainty?
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Uncert_describe
Definition:If the dominant source varies by forecast horizon, season, etc. please describe here
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Forecast_eval
Definition:Paper states that forecast was evaluated? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Forecast_eval_shown
Definition:Forecast evaluation results reported in paper? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Eval_metrics
Definition:List all skill metrics used (e.g. R2, RMSE, bias, MAE). SD and Bayesian credible intervals are not skill metrics
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Eval_mult_horiz_yn
Definition:Is forecast performance assessed at multiple forecast horizons (results must be reported in paper/supplemental info)? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Forecast_horiz_null_days
Definition:Maximum forecast horizon such that the forecast was better than the null model (out of any models used)
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit nominalDay; Type whole; Max 1865
Missing Value Code:NA (No data)

Column Name:Data_coverage
Definition:Temporal coverage of data used to create this forecasting paper
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit nominalDay; Type natural; Min 17; Max 52925
Missing Value Code:NA (No data)

Column Name:Automated_yn
Definition:Was new data (driver and/or observations) available to the model in real time (<24 hours from collection) without any manual effort when the system was working as intended? 1 = yes, 0 = no
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Allowed Values and Definitions
Code:At least one data stream; Definition:At least one stream of data used to make forecasts is available to the model within 24 hours when the system is working as intended
Code:No data streams; Definition:No data streams are available to the model within 24 hours even when the system is working as intended
Code:UNK; Definition:unknown (not specified in text)
Code:Yes all data streams; Definition:All data streams used to make forecasts are available to the model within 24 hours when the system is working as intended
Missing Value Code:NA (No data)

Column Name:Archiving_yn
Definition:Forecast archiving described in text? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Repository
Definition:Repository in which forecasts are archived
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Link_works
Definition:Archiving website is still accessible via the link in the paper as of 14 Jun 2021? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Drivers_published
Definition:Text specifies that driver data are publicly available to reproduce the forecasts? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:End_user_yn
Definition:Specific end user identified (proper noun)? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:End_user_partnership
Definition:Partnership with the end user in forecast development mentioned in paper? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Used_by_stakeholder_yn
Definition:Forecast being used by the end user according to paper? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Delivery_method_yn
Definition:Forecast delivery method identified? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)

Column Name:Delivery_method
Definition:Forecast delivery method?
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Ethical_considerations
Definition:Any ethical considerations mentioned? 1 = yes, 0 = no
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type whole
Missing Value Code:NA (No data)
Accuracy Report:                                                                                                                    
Accuracy Assessment:                                                                                                                    
Coverage:                                                                                                                    
Methods:                                                                                                                    
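
The Uncert_category column is described as a hierarchy, so a forecast that fits several categories is recorded under the highest one. The sketch below shows one way to encode that ordering. It assumes the hierarchy runs in the order the categories are listed in the column definition (no, contains, data_driven, propagates, assimilates), and the names `UNCERT_ORDER`, `uncert_rank`, and `highest_category` are hypothetical.

```python
# Assumed order, lowest to highest, following the listing in the
# Uncert_category definition.
UNCERT_ORDER = ["no", "contains", "data_driven", "propagates", "assimilates"]


def uncert_rank(category: str) -> int:
    """Position of a category in the assumed hierarchy (0 = lowest)."""
    return UNCERT_ORDER.index(category)


def highest_category(categories) -> str:
    """Pick the category that would be recorded when several apply."""
    return max(categories, key=uncert_rank)


# The example from the column definition: data-driven uncertainty that is
# also propagated is marked 'propagates'.
print(highest_category(["data_driven", "propagates"]))  # propagates
```

An explicit rank also makes threshold questions easy to express, for instance selecting rows where `uncert_rank(value) >= uncert_rank("propagates")`.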

Data Table

Data:https://pasta-s.lternet.edu/package/data/eml/edi/368/1/d69f677e0810818d17d1d4b807c86fdf
Name:Dataset 3: R2 values
Description:Dataset 3: R2 values
Number of Records:975
Number of Columns:10

Table Structure
Object Name:R2_dataset_for_EDI.csv
Size:261112 bytes
Authentication:f607d26c423017a6552639d976e5f88d Calculated By MD5
Text Format:
Number of Header Lines:1
Record Delimiter:\r\n
Orientation:column
Simple Delimited:
Field Delimiter:,
Quote Character:"

Table Column Descriptions

Column Name:Title
Definition:Paper title
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:doi
Definition:Digital object identifier (doi) from Web of Science
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Authors
Definition:Author list
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Year
Definition:Year of publication
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit nominalYear; Type natural; Min 1986; Max 2019
Missing Value Code:NA (No data)

Column Name:Horiz_days
Definition:Number of days into the future forecasted
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type real; Max 5475
Missing Value Code:NA (No data)

Column Name:Var
Definition:Variable forecasted
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:Notes
Definition:Notes about the R2 values reported
Storage Type:string
Measurement Type:nominal
Missing Value Code:NA (No data)

Column Name:R2_value
Definition:Value of R2
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type real
Missing Value Code:NA (No data)

Column Name:Model_group
Definition:Integer values to separate different models within a paper
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type natural; Max 105
Missing Value Code:NA (No data)

Column Name:Site_or_year_group
Definition:Integer values to identify forecasts at distinct sites or for distinct years within one paper
Storage Type:float
Measurement Type:ratio
Measurement Values Domain:Unit dimensionless; Type natural
Missing Value Code:NA (No data)
Accuracy Report:                    
Accuracy Assessment:                    
Coverage:                    
Methods:                    

Non-Categorized Data Resource

Name:General data analysis
Entity Type:unknown
Description:General data analysis
Physical Structure Description:
Object Name:Final import and analysis - for EDI.Rmd
Size:42109 bytes
Authentication:8bc5962cf3b6c3163e52b845aa3bb959 Calculated By MD5
Externally Defined Format:
Format Name:text/x-markdown
Data:https://pasta-s.lternet.edu/package/data/eml/edi/368/1/216ee52714803afb8b0e2f3c773d554e

Non-Categorized Data Resource

Name:R2 analysis
Entity Type:unknown
Description:R2 analysis
Physical Structure Description:
Object Name:R2_analysis_EDI.Rmd
Size:10852 bytes
Authentication:10dcf23491f6140dfd9d154207734e1d Calculated By MD5
Externally Defined Format:
Format Name:text/x-markdown
Data:https://pasta-s.lternet.edu/package/data/eml/edi/368/1/3c29d650c9da6cf17d72a7284c1fb978

Non-Categorized Data Resource

Name:Abstract review
Entity Type:unknown
Description:Abstract review
Physical Structure Description:
Object Name:Venn diagrams.Rmd
Size:3691 bytes
Authentication:7533273d63236f1591d9086d57f600dd Calculated By MD5
Externally Defined Format:
Format Name:text/x-markdown
Data:https://pasta-s.lternet.edu/package/data/eml/edi/368/1/c1ee84c9caf108bae15915aee9b6f61a

Data Package Usage Rights

This information is released under the Creative Commons license - Attribution - CC BY (https://creativecommons.org/licenses/by/4.0/). The consumer of these data ("Data User" herein) is required to cite it appropriately in any publication that results from its use. The Data User should realize that these data may be actively used by others for ongoing research and that coordination may be necessary to prevent duplicate publication. The Data User is urged to contact the authors of these data if any questions about methodology or results occur. Where appropriate, the Data User is encouraged to consider collaboration or co-authorship with the authors. The Data User should realize that misinterpretation of data may occur if used out of context of the original study. While substantial efforts are made to ensure the accuracy of data and associated documentation, complete accuracy of data sets cannot be guaranteed. All data are made available "as is." The Data User should be aware, however, that data are updated periodically and it is the responsibility of the Data User to check for new versions of the data. The data authors and the repository where these data were obtained shall not be liable for damages resulting from any use or misinterpretation of the data. Thank you.

Keywords

By Thesaurus:
(No thesaurus) data assimilation, decision support, ecological predictability, forecast automation, forecast horizon, forecast skill, forecast uncertainty, iterative forecasting, near-term forecast, null model, open science, systematic review, uncertainty partitioning

Methods and Protocols

These methods, instrumentation and/or protocols apply to all data in this dataset:

Methods and protocols used in the collection of this data package
Description:

We systematically reviewed literature on near-term ecological forecasting to both determine how best practices have been implemented over time and compare forecastability across scales and variables. First, we searched the Web of Science™ Core Collection [v.5.34] database (Clarivate Analytics, Philadelphia, USA) and reviewed abstracts to identify papers that report near-term ecological forecasts (described in Literature search below). Two reviewers then independently read and analyzed each selected paper using a standardized matrix of criteria (Matrix analysis) and we recorded forecast skill when reported (Forecast skill).

Literature search (Dataset 1) We began by querying Web of Science Core Collection [v.5.34] for “forecast*” in the title, abstract, or keywords of papers published in 301 ecological journals, then manually screened abstracts of all resulting papers. We conducted the Web of Science search on 18 May 2020 and limited the search to articles and proceedings papers (hereafter, ‘papers’) published in English. This yielded 2711 results.

We screened the abstracts of all 2711 papers and selected those that met three criteria: (1) Papers had to include at least one forecast, which we defined as a prediction of future conditions from the perspective of the model; forecasts could be developed retroactively (i.e., “hindcasts”) but could only use driver data that were available before the forecast date (e.g., forecasted or time-lagged driver variables). (2) The forecast had to be near-term, which we defined as predicting ≤ 10 years into the future. (3) The forecast had to be ecological, which we defined as predicting a biogeochemical, population, or community response variable. This definition therefore excludes physical (e.g., streamflow or water temperature) and meteorological forecasts. Forecasts of human disease were only included if there was an animal vector. If the abstract indicated that the paper met all three criteria, it was moved to a second round of screening. Here, a second reviewer read the full paper to ensure that at least one forecast in the paper met all three criteria. Through this screening process, we identified 142 near-term ecological forecasting papers out of the 2711 Web of Science results.

Because ecological forecasts may be published in journals that are not categorized as “ecological” by Web of Science, we then searched all papers that were cited by the near-term ecological forecast papers we identified, as well as all papers that cited these studies. We selected those that were published in English and included “forecast*” in the title, abstract, or keywords, then screened the abstracts to ensure they met our three criteria. Finally, we read the papers themselves for confirmation. Through this second screening we identified an additional 110 near-term ecological forecasting papers.

Matrix analysis (Dataset 2) We analyzed each of the 252 papers selected in our systematic search using a standardized matrix of questions. This matrix was co-developed over several months of iteration and discussion by all authors within an Ecological Forecasting graduate seminar at Virginia Tech (January–May 2020). The final matrix used for this study included 65 questions about the model, evaluation, cyberinfrastructure, archiving, and decision support: descriptions of all questions and results are presented in dataset 2. Throughout the graduate seminar, we read and analyzed 10 papers as a group, ensuring that all reviewers understood how to interpret and answer questions in a consistent manner. Reviewers also screened several papers individually and checked their responses with another reviewer prior to the start of this analysis, helping to ensure consistency between reviewers. For the analysis described in this paper, all 252 papers were read and analyzed independently by two reviewers, and reviewers then compared any differing answers to reach consensus on a final set of responses for each paper.

During the intensive matrix analysis, 74 papers were determined to not meet our criteria of being near-term ecological forecasts, despite having passed the initial rounds of screening—these papers typically used one or more data sources that would not have been possible to know before the forecast date, and were difficult to identify without reading the entire paper and its supplement in detail. These papers were excluded from the analysis, leaving 178 papers in the final dataset (dataset 2).
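
The paper counts reported across the screening steps can be tallied as a quick consistency check. This is only an arithmetic sketch using the numbers stated in the text above; the variable names are hypothetical.

```python
# Counts as stated in the methods text.
web_of_science_results = 2711   # papers returned by the initial search
round1_selected = 142           # near-term ecological forecasts from round 1
round2_selected = 110           # additional papers from citing/cited search
excluded_in_matrix_review = 74  # removed during the intensive matrix analysis

# Papers entering the matrix analysis, then papers kept in Dataset 2.
reviewed = round1_selected + round2_selected
final = reviewed - excluded_in_matrix_review

print(reviewed, final)  # 252 178
```

The final count of 178 agrees with the Number of Records listed for Dataset 2.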

Forecast skill (Dataset 3) We gathered all Pearson's r and R2 data reported in papers in the matrix analysis dataset. Pearson’s r values were squared to yield R2. We recorded the number of days into the future forecasted for each value of R2, the specific forecast variable associated with that value, and whether different values of R2 came from the same model, site, or year.
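
The conversion from Pearson's r to R2 described above is a simple squaring step. The sketch below applies it to made-up records. The function name `r_to_r2`, the `r` field, and the example variable name are hypothetical; only `Horiz_days`, `Var`, and `R2_value` correspond to Dataset 3 columns.

```python
def r_to_r2(r: float) -> float:
    """Square Pearson's r to obtain R2."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson's r must lie in [-1, 1]")
    return r * r


# Made-up records for illustration (not real data).
records = [
    {"Horiz_days": 1, "Var": "example_variable", "r": 0.5},
    {"Horiz_days": 7, "Var": "example_variable", "r": -0.25},
]
for rec in records:
    rec["R2_value"] = r_to_r2(rec["r"])

print([rec["R2_value"] for rec in records])  # [0.25, 0.0625]
```

Squaring discards the sign of r, so a negative correlation and a positive one of the same magnitude yield the same R2 value.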

People and Organizations

Publishers:
Organization:Environmental Data Initiative
Email Address:
info@environmentaldatainitiative.org
Web Address:
https://environmentaldatainitiative.org
Creators:
Individual: Abigail S. L. Lewis
Organization:Virginia Tech
Email Address:
aslewis@vt.edu
Id:https://orcid.org/0000-0001-9933-4542
Individual: Whitney M. Woelmer
Organization:Virginia Tech
Email Address:
wwoelmer@vt.edu
Id:https://orcid.org/0000-0001-5147-3877
Individual: Heather L. Wander
Organization:Virginia Tech
Email Address:
hwander@vt.edu
Id:https://orcid.org/0000-0002-3762-6045
Individual: Dexter W. Howard
Organization:Virginia Tech
Email Address:
dwh1998@vt.edu
Id:https://orcid.org/0000-0002-6118-2149
Individual: John W. Smith
Organization:Virginia Tech
Email Address:
wsjohn2@vt.edu
Id:https://orcid.org/0000-0002-1564-3290
Individual: Ryan P. McClure
Organization:Virginia Tech
Email Address:
Ryan333@vt.edu
Id:https://orcid.org/0000-0001-6370-3852
Individual: Mary E. Lofton
Organization:Virginia Tech
Email Address:
melofton@vt.edu
Id:https://orcid.org/0000-0003-3270-1330
Individual: Nicholas W. Hammond
Organization:Virginia Tech
Email Address:
hammondnw@vt.edu
Id:https://orcid.org/0000-0003-2975-8280
Individual: Rachel S. Corrigan
Organization:Virginia Tech
Email Address:
rachelc1@vt.edu
Id:https://orcid.org/0000-0001-6101-8085
Individual: R. Quinn Thomas
Organization:Virginia Tech
Email Address:
rqthomas@vt.edu
Id:https://orcid.org/0000-0003-1282-7825
Individual: Cayelan C. Carey
Organization:Virginia Tech
Email Address:
cayelan@vt.edu
Id:https://orcid.org/0000-0001-8835-4476
Contacts:
Individual: Abigail S. L. Lewis
Organization:Virginia Tech
Email Address:
aslewis@vt.edu
Id:https://orcid.org/0000-0001-9933-4542

Temporal, Geographic and Taxonomic Coverage

Temporal, Geographic and/or Taxonomic information that applies to all data in this dataset:

Time Period
Begin:
1932-11-01
End:
2020-05-18
Geographic Region:
Description:Dataset extent
Bounding Coordinates:
Northern:  70.45276
Southern:  -69.57416
Western:  -160
Eastern:  159

Project

Parent Project Information:

Title:Collaborative Research: Consequences of changing oxygen availability for carbon cycling in freshwater ecosystems
Personnel:
Individual: Cayelan C. Carey
Organization:Virginia Tech
Email Address:
cayelan@vt.edu
Id:https://orcid.org/0000-0001-8835-4476
Role:Principal Investigator
Funding: National Science Foundation 1753639
Related Project:
Title:SCC-IRG Track 2: Resilient Water Systems: Integrating Environmental Sensor Networks and Real-Time Forecasting to Adaptively Manage Drinking Water Quality and Build Social Trust
Personnel:
Individual: Cayelan C. Carey
Organization:Virginia Tech
Email Address:
cayelan@vt.edu
Id:https://orcid.org/0000-0001-8835-4476
Role:Principal Investigator
Funding: National Science Foundation 1737424
Related Project:
Title:MSA: Macrosystems EDDIE: An undergraduate training program in macrosystems science and ecological forecasting
Personnel:
Individual: Cayelan C. Carey
Organization:Virginia Tech
Email Address:
cayelan@vt.edu
Id:https://orcid.org/0000-0001-8835-4476
Role:Principal Investigator
Funding: National Science Foundation 1926050
Related Project:
Title:Collaborative Research: CIBR: Cyberinfrastructure Enabling End-to-End Workflows for Aquatic Ecosystem Forecasting
Personnel:
Individual: Cayelan C. Carey
Organization:Virginia Tech
Email Address:
cayelan@vt.edu
Id:https://orcid.org/0000-0001-8835-4476
Role:Principal Investigator
Funding: National Science Foundation 1933016
Related Project:
Title:Collaborative Research: CIBR: Cyberinfrastructure Enabling End-to-End Workflows for Aquatic Ecosystem Forecasting
Personnel:
Individual: Renato Figueiredo
Organization:University of Florida
Email Address:
renato@acis.ufl.edu
Id:https://orcid.org/0000-0001-8835-4476
Role:Principal Investigator
Funding: National Science Foundation 1933102
Related Project:
Title:Graduate Research Fellowship
Personnel:
Individual: Abigail S. L. Lewis
Organization:Virginia Tech
Email Address:
aslewis@vt.edu
Id:https://orcid.org/0000-0001-9933-4542
Role:Principal Investigator
Funding: National Science Foundation DGE-1651272
Related Project:
Title:Graduate Research Fellowship
Personnel:
Individual: Whitney M. Woelmer
Organization:Virginia Tech
Email Address:
wwoelmer@vt.edu
Id:https://orcid.org/0000-0001-5147-3877
Role:Principal Investigator
Funding: National Science Foundation DGE-1651272

Maintenance

Maintenance:
Description:complete
Frequency:

Other Metadata

EDI is a collaboration between the University of New Mexico and the University of Wisconsin – Madison, Center for Limnology.