Data Package Metadata View Summary

Data and code for EDI overview paper, data collection characteristics, FAIR evaluation, downloads, and citations

General Information

Data Package:

Local Identifier:

edi.1175.1

Title:

Data and code for EDI overview paper, data collection characteristics, FAIR evaluation, downloads, and citations

Alternate Identifier:

DOI PLACE HOLDER

Abstract:

The Environmental Data Initiative (EDI) is a trustworthy, stable data repository and data management support organization for environmental scientists. EDI provides tools and support that allow environmental researchers to integrate data publishing easily into the research workflow. Almost ten years after EDI went into production, these data and code were used to provide a general description of EDI’s data collection, its data management philosophy, and its placement in the repository landscape. They show how comprehensive metadata and the repository infrastructure lead to highly findable, accessible, interoperable, and reusable (FAIR) data by evaluating compliance with specific community-proposed FAIR criteria. Finally, they provide measures and patterns of data (re)use, providing assurance that EDI is fulfilling its stated premise.

Publication Date:

2022-07-21

For more information:
Visit:	DOI PLACE HOLDER

Time Period

Begin:

2022

End:

2022

People and Organizations
Contact:	Gries, Corinna (Environmental Data Initiative)
Creator:	Gries, Corinna (Environmental Data Initiative)
Creator:	Servilla, Mark (Environmental Data Initiative)

Data Entities
Data Table Name:	datasetDurationKeywords
Description:	Length of observation as indicated by begin and end dates, and list of keywords, for every dataset in EDI
Data Table Name:	dl_cit_meta_package
Description:	Number of downloads and citations per dataset, linked to subjects and length of observation
Data Table Name:	edi_eml_content_long
Description:	FAIR criteria parsed from EML metadata for each data package
Data Table Name:	geog_distribution
Description:	Bounding box and centroid information for each data package
Data Table Name:	keyword_count_word_edit
Description:	Most frequently used keywords and the number of datasets they are used to describe
Data Table Name:	keyword_pairs
Description:	Most commonly used keywords as they are used together to describe datasets
Other Name:	r_code
Description:	R code scripts used for analysis
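
The keyword_count_word_edit and keyword_pairs tables summarize how often keywords occur and co-occur. The package's own analysis lives in the r_code entity and is not reproduced here. The following is only a minimal Python sketch of one way such counts could be derived from comma-separated keyword lists; the function name, the normalization (trimming and lowercasing), and the sample rows are assumptions for illustration, not the authors' method.

```python
from collections import Counter
from itertools import combinations


def count_keywords_and_pairs(keyword_lists):
    """Count single keywords and unordered keyword pairs across datasets.

    keyword_lists: iterable of comma-separated keyword strings, one per dataset.
    Each keyword (or pair) is counted at most once per dataset.
    """
    keyword_counts = Counter()
    pair_counts = Counter()
    for raw in keyword_lists:
        # Split on commas, trim whitespace, lowercase, drop empties and repeats.
        keywords = sorted({k.strip().lower() for k in raw.split(",") if k.strip()})
        keyword_counts.update(keywords)
        # Sorted input means each pair appears in one canonical order.
        pair_counts.update(combinations(keywords, 2))
    return keyword_counts, pair_counts


# Hypothetical sample rows, not taken from the data package.
sample = [
    "temperature, precipitation, lakes",
    "lakes, temperature",
    "Precipitation, soil",
]
counts, pairs = count_keywords_and_pairs(sample)
print(counts["lakes"])                   # datasets using "lakes"
print(pairs[("lakes", "temperature")])   # datasets using both keywords
```

A keyword that itself contains a comma would be split by this sketch, so real keyword lists may need a more careful parser.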

Detailed Metadata

Data Entities

Data Table


Data:	https://pasta-s.lternet.edu/package/data/eml/edi/1175/1/7898abeea2417283215c7dc0aabba356
Name:	datasetDurationKeywords
Description:	Length of observation as indicated by begin and end dates, list of keywords for every dataset in EDI
Number of Records:	8605
Number of Columns:	4

Table Structure

Object Name:

datasetDurationKeywords.csv

Size:

1853054 bytes

Authentication:

08173bf6f5e0c958bd6cb0a2bda807aa Calculated By MD5
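
The MD5 value listed under Authentication can be used to check that a downloaded copy of the file is intact. The sketch below uses Python's standard hashlib module; the helper names and the local file path in the usage note are hypothetical, and the demonstration runs on a throwaway file rather than on the actual data.

```python
import hashlib
import os
import tempfile


def md5_hex(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's MD5 digest with the value listed in the metadata."""
    return md5_hex(path) == expected_hex.strip().lower()


# Demonstration on a temporary file containing the bytes b"hello".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name
demo_digest = md5_hex(demo_path)
demo_matches = matches_checksum(demo_path, demo_digest.upper())
os.remove(demo_path)
print(demo_digest, demo_matches)
```

For a local download saved as datasetDurationKeywords.csv (an assumed path), the check would be `matches_checksum("datasetDurationKeywords.csv", "08173bf6f5e0c958bd6cb0a2bda807aa")`.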

Text Format:

Number of Header Lines:

Record Delimiter:

\r\n

Orientation:

column

Simple Delimited:

Field Delimiter:	,
Quote Character:	"
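
The text format above (comma field delimiter, double-quote quote character, \r\n record delimiter) maps directly onto the options of Python's standard csv module. The sketch below reads a small in-memory sample in that layout. The number of header lines is blank in the metadata, so a single header line is an assumption here, and the sample rows are hypothetical rather than taken from the data package.

```python
import csv
import io

# Hypothetical sample in the documented layout; not actual package data.
sample_text = (
    "dl_dataset_id,duration,endYear,keywords\r\n"
    'example.1,3,2020,"lakes, temperature"\r\n'
    "example.2,NA,NA,NA\r\n"
)


def read_rows(text_stream):
    """Read delimited text into a list of dicts keyed by the header line.

    Assumes exactly one header line (not stated in the metadata).
    """
    reader = csv.DictReader(text_stream, delimiter=",", quotechar='"')
    return list(reader)


# newline="" leaves \r\n untouched so the csv module handles record ends itself.
rows = read_rows(io.StringIO(sample_text, newline=""))
print(rows[0]["keywords"])   # quoted field keeps its embedded comma
print(rows[1]["duration"])   # missing value code is still the raw string here
```

For a file on disk, the same function can be given a handle opened with `open(path, newline="", encoding="utf-8")`; the encoding is likewise an assumption, since the metadata does not state it.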

Table Column Descriptions

Column Name:	dl_dataset_id
Definition:	Basic dataset ID from EDI without version

Column Name:	duration
Definition:	Difference between begin date and end date from metadata in years

Column Name:	endYear
Definition:	End year of observation from EML metadata

Column Name:	keywords
Definition:	Comma separated list of keywords from EML metadata

Storage Type:

string

float

string

Measurement Type:

nominal

ratio

nominal

Measurement Values Domain:

dl_dataset_id:	Definition: Basic dataset ID from EDI without version
duration:	Unit: number; Type: integer
endYear:	Unit: nominalYear; Type: integer
keywords:	Definition: Comma separated list of keywords from EML metadata

Missing Value Code:

Code	NA
Expl	not available

Code	NA
Expl	not available

Code	NA
Expl	not available
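
The column descriptions above imply a simple conversion step after reading the file: treat the missing value code NA as absent, read duration and endYear as numbers, and split the keywords field into a list. The following is a minimal sketch under stated assumptions. The listing gives three missing value codes for four columns without saying which column lacks one, so the sketch conservatively checks every column for NA; all function names are hypothetical.

```python
from typing import List, Optional

MISSING_CODE = "NA"  # missing value code from the metadata ("not available")


def to_optional_float(value: str) -> Optional[float]:
    """Convert text to float, mapping the missing value code to None."""
    value = value.strip()
    return None if value == MISSING_CODE else float(value)


def to_optional_int(value: str) -> Optional[int]:
    """Convert text to int, tolerating forms such as '2020.0'."""
    number = to_optional_float(value)
    return None if number is None else int(number)


def split_keywords(value: str) -> List[str]:
    """Split a comma-separated keyword field; NA or empty gives an empty list."""
    value = value.strip()
    if value in ("", MISSING_CODE):
        return []
    return [k.strip() for k in value.split(",") if k.strip()]


def convert_row(row: dict) -> dict:
    """Apply the per-column conversions to one row of string values."""
    dataset_id = row["dl_dataset_id"].strip()
    return {
        "dl_dataset_id": None if dataset_id == MISSING_CODE else dataset_id,
        "duration": to_optional_float(row["duration"]),
        "endYear": to_optional_int(row["endYear"]),
        "keywords": split_keywords(row["keywords"]),
    }


# Hypothetical rows, not taken from the data package.
complete = convert_row({"dl_dataset_id": "example.1", "duration": "3",
                        "endYear": "2020", "keywords": "lakes, temperature"})
sparse = convert_row({"dl_dataset_id": "example.2", "duration": "NA",
                      "endYear": "NA", "keywords": "NA"})
print(complete)
print(sparse)
```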

Accuracy Report:

Accuracy Assessment:

Coverage:

Methods:

Data Table


Data:	https://pasta-s.lternet.edu/package/data/eml/edi/1175/1/f8c2bcc6588e4ac9949249ba5c4296ad
Name:	dl_cit_meta_package
Description:	Number of downloads and citations per dataset, linked to subjects and length of observation
Number of Records:	8605
Number of Columns:	14

Table Structure

Object Name:

dl_cit_meta_package.csv

Size:

527196 bytes

Authentication:

480b81319d044d4542bc139462604fa8 Calculated By MD5

Text Format:

Number of Header Lines:

Record Delimiter:

\r\n

Orientation:

column

Simple Delimited:

Field Delimiter:	,
Quote Character:	"

Table Column Descriptions

Column Name:	scope
Definition:	Scope of data package in EDI

Column Name:	dataset_id
Definition:	Basic dataset ID without version

Column Name:	web_download
Definition:	Number of manual web downloads

Column Name:	script_download
Definition:	Number of downloads initiated by a script or program

Column Name:	num_citations
Definition:	Number of journal articles, theses, or reports citing this dataset

Column Name:	duration
Definition:	Length of observation in years

Column Name:	endYear
Definition:	End year of observations from metadata

Column Name:	biodiversity
Definition:	Whether or not a keyword in the biodiversity group was found

Column Name:	disturbance
Definition:	Whether or not a keyword in the disturbance group was found

Column Name:	primaryProd
Definition:	Whether or not a keyword in the primary production group was found

Column Name:	orgMatter
Definition:	Whether or not a keyword in the organic matter group was found

Column Name:	inorgNutr
Definition:	Whether or not a keyword in the inorganic nutrients group was found

Column Name:	abiotic
Definition:	Whether or not a keyword in the abiotic conditions group was found

Column Name:	allClass
Definition:	Whether or not the dataset was classified into the main categories
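
The scope and dataset_id columns relate to the package identifier used throughout this record: the local identifier edi.1175.1 and the data URL path edi/1175/1 both follow the order scope, identifier, revision. The sketch below splits such an identifier and derives an ID without the version. Whether the dataset_id column itself includes the scope is not stated in the metadata, so the "scope.identifier" form produced here is an assumption, and the function names are hypothetical.

```python
from typing import NamedTuple


class PackageId(NamedTuple):
    scope: str
    identifier: int
    revision: int


def parse_package_id(package_id: str) -> PackageId:
    """Split 'scope.identifier.revision' into its three parts.

    rsplit from the right keeps any dots inside the scope intact.
    """
    scope, identifier, revision = package_id.strip().rsplit(".", 2)
    return PackageId(scope, int(identifier), int(revision))


def dataset_id_without_version(package_id: str) -> str:
    """Return 'scope.identifier' (assumed form of an ID without version)."""
    parsed = parse_package_id(package_id)
    return f"{parsed.scope}.{parsed.identifier}"


parsed = parse_package_id("edi.1175.1")
print(parsed)
print(dataset_id_without_version("edi.1175.1"))
```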

Storage Type:

string

float

string

Measurement Type:

nominal

ratio

nominal

Measurement Values Domain:

scope:	Definition: scope of data package in EDI
dataset_id:	Definition: basic dataset Id without version
web_download:	Unit: number; Type: integer
script_download:	Unit: number; Type: integer
num_citations:	Unit: number; Type: integer
duration:	Unit: number; Type: integer
endYear:	Unit: nominalYear; Type: integer