Data Package Metadata

Supervised land cover classification using Google Earth Engine in Córdoba, Argentina, 2018-2020

General Information
Data Package:
Local Identifier:edi.1540.1
Title:Supervised land cover classification using Google Earth Engine in Córdoba, Argentina, 2018-2020
Alternate Identifier:DOI PLACE HOLDER
Abstract:

Land cover information is critical to scientific, economic, and public policy-making. Accurate and timely land cover information is in high demand because its quality affects the accuracy of all subsequent applications. Google Earth Engine (GEE) supports temporal aggregation of time-series images (i.e., metrics such as the mean or median) and reduces computation time when handling large amounts of data, which helps to obtain more accurate results. Our objective was to obtain a land cover map for the northwest of the province of Córdoba, Argentina. The study was carried out in rural communities in the departments of Cruz del Eje and Ischilín, northwest of Córdoba, which have different degrees of intervention in the land cover. Sentinel 2 Level 2A images available for the study area from January 1, 2018, to December 31, 2020, were sampled. To create a thematic map, the median value was calculated for the sample of images from the selected time interval. The Normalized Difference Vegetation Index (NDVI) was then calculated and added to the bands of the median image. Training polygons were placed on the median image according to its visual features. The Random Forest algorithm was used as the classification method. To verify the quality of the classified map, a set of 97,753 verification pixels was obtained. In addition, a confusion matrix was created to record the confusion between categories, and the precision and kappa coefficient were calculated to define the quality of the map obtained. Image acquisition, preprocessing, and analysis were performed on the Google Earth Engine platform. Thematic maps with eight classes were obtained, with a total area of 719,880 ha. The confusion matrix showed an overall precision of 99.26% and a corrected kappa index of 0.99, indicating that the classes were correctly classified by the algorithm.

Publication Date:2023-12-06
For more information:
Visit: DOI PLACE HOLDER

Time Period
Begin:
2018-01-01
End:
2020-12-31

People and Organizations
Contact:Fiad, Federico Gastón (Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET))
Creator:Fiad, Federico Gastón (Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET))
Creator:Insaurralde, Juan Ariel (Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET))
Creator:Cardozo, Miriam (Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET))
Creator:Rodríguez, Claudia Susana (Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET))
Creator:Gorla, David Eladio (Instituto de diversidad y ecología animal (IDEA/CONICET))

Data Entities
Data Table Name:
Landcover_Classes
Description:
Define the land cover classes and the approximate area covered by each class for the map creation.
Other Name:
Scrip1_Study_Area
Description:
Selection of the images needed to construct the median images for the whole period and for the wet-hot and dry-cold periods.
Other Name:
Script2_Supervised_landcover_classification
Description:
The following script presents the supervised classification analysis for the study area, the ground truth points used to train the Random Forest algorithm, the results of the Random Forest algorithm analysis, and the download of the thematic map with eight land cover classes for the landscape with the associated confusion matrix. It also includes a detailed description of each step performed so that the user can replicate the analysis or use it for their work.
Other Name:
AOI_gee
Description:
Use the vector as a layer mask to define the study area. Remember to upload this file to Assets for running the script in Google Earth Engine.
Detailed Metadata

Data Entities


Data Table

Data:https://pasta-s.lternet.edu/package/data/eml/edi/1540/1/90d39e3fb8e1273995c377dcbd2e8f21
Name:Landcover_Classes
Description:Define the land cover classes and the approximate area covered by each class for the map creation.
Number of Records:8
Number of Columns:5

Table Structure
Object Name:Classes.csv
Size:1567 byte
Authentication:b7c343c8e21d253c813e1d4c242ca753 Calculated By MD5
Text Format:
Number of Header Lines:1
Record Delimiter:\r\n
Orientation:column
Simple Delimited:
Field Delimiter:;
Quote Character:"
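
The text format listed above (one header line, `\r\n` record delimiter, `;` field delimiter, `"` quote character) can be read with a standard CSV parser. The following is a minimal sketch, not part of the data package: the function name and the sample row are hypothetical placeholders, and only the column names and delimiters are taken from this record.

```python
import csv
import io


def read_landcover_table(text):
    """Parse semicolon-delimited text with one header line into a list of dicts."""
    # newline="" leaves the \r\n record delimiter for the csv module to handle.
    stream = io.StringIO(text, newline="")
    reader = csv.DictReader(stream, delimiter=";", quotechar='"')
    return list(reader)


# Placeholder content for illustration only; these are NOT the values in Classes.csv.
SAMPLE_TEXT = (
    "Macro-class;Class;Value;Description;Covered area (Ha)\r\n"
    'Forest;Closed_Forest;1;"placeholder; quoted text may contain the delimiter";0\r\n'
)

if __name__ == "__main__":
    for row in read_landcover_table(SAMPLE_TEXT):
        print(row["Value"], row["Class"], row["Covered area (Ha)"])
```

Quoting matters here because the Description column holds free text that may itself contain semicolons.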

Table Column Descriptions
Column Name:Macro-class
Definition:Macroclasses that group microclasses of land cover
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Enumerated Domain (Code: Definition)
Anthropic: Coverage class of anthropic origin
Forest: Coverage class characterized by high tree density
Shrubland: Coverage class characterized by high bush density
Water: Water bodies

Column Name:Class
Definition:Microclasses identified in the land cover map
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:text

Column Name:Value
Definition:Code with which the microclass is recognized within the Google Earth Engine algorithm
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:Enumerated Domain (Code: Definition)
1: Closed_Forest
2: Open_Forest
3: Closed_Shrubland
4: Open_Shrubland
5: No vegetable area
6: Crops
7: Cattle pastures
8: Water

Column Name:Description
Definition:Categorical definition that represents each microclass and how to identify it in the satellite image
Storage Type:string
Measurement Type:nominal
Measurement Values Domain:text

Column Name:Covered area (Ha)
Definition:Area covered by each microclass in the thematic land cover map expressed in hectares
Storage Type:float
Measurement Type:ratio
Unit:hectare
Type:integer
Missing Value Code:          
Accuracy Report:          
Accuracy Assessment:          
Coverage:          
Methods:          

Non-Categorized Data Resource

Name:Scrip1_Study_Area
Entity Type:text/plain
Description:Selection of the images needed to construct the median images for the whole period and for the wet-hot and dry-cold periods.
Physical Structure Description:
Object Name:Scrip1_Study_Area.txt
Size:5742 byte
Authentication:b0e66e90abfca1da216920668b88c778 Calculated By MD5
Externally Defined Format:
Format Name:text/plain
Data:https://pasta-s.lternet.edu/package/data/eml/edi/1540/1/ace5421baa2d6885931b750a8bad141d

Non-Categorized Data Resource

Name:Script2_Supervised_landcover_classification
Entity Type:text/plain
Description:The following script presents the supervised classification analysis for the study area, the ground truth points used to train the Random Forest algorithm, the results of the Random Forest algorithm analysis, and the download of the thematic map with eight land cover classes for the landscape with the associated confusion matrix. It also includes a detailed description of each step performed so that the user can replicate the analysis or use it for their work.
Physical Structure Description:
Object Name:Script2_Supervised_landcover_classification.txt
Size:29908 byte
Authentication:66cbdc5b1359d9d612b7329fc9261619 Calculated By MD5
Externally Defined Format:
Format Name:text/plain
Data:https://pasta-s.lternet.edu/package/data/eml/edi/1540/1/401e30a364b22b48a4286a38ea5bf1ae

Non-Categorized Data Resource

Name:AOI_gee
Entity Type:application/zip
Description:Use the vector as a layer mask to define the study area. Remember to upload this file to Assets for running the script in Google Earth Engine.
Physical Structure Description:
Object Name:AOI_gee.zip
Size:161214 byte
Authentication:becd9380653c3419796fa0a72948844f Calculated By MD5
Externally Defined Format:
Format Name:application/zip
Data:https://pasta-s.lternet.edu/package/data/eml/edi/1540/1/1636619a6cda2492a7b5f8d559430ef4
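
Each entity above lists an MD5 value under "Authentication". A downloaded file can be checked against that value. The sketch below uses only the Python standard library; the function names are illustrative assumptions and do not come from the data package.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a binary file-like object."""
    digest = hashlib.md5()
    # Read in chunks so large files do not have to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(stream, listed_md5):
    """Compare a stream's digest with the value listed in the metadata."""
    return md5_of_stream(stream) == listed_md5.strip().lower()


if __name__ == "__main__":
    # In-memory demo; for a real file, pass open(path, "rb") instead.
    demo = io.BytesIO(b"hello")
    print(md5_of_stream(demo))
```

For example, a local copy of Classes.csv would be opened in binary mode and compared with the value listed for that entity (b7c343c8e21d253c813e1d4c242ca753).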

Data Package Usage Rights

This information is released under the Creative Commons license - Attribution - CC BY (https://creativecommons.org/licenses/by/4.0/). The consumer of these data ("Data User" herein) is required to cite it appropriately in any publication that results from its use. The Data User should realize that these data may be actively used by others for ongoing research and that coordination may be necessary to prevent duplicate publication. The Data User is urged to contact the authors of these data if any questions about methodology or results occur. Where appropriate, the Data User is encouraged to consider collaboration or co-authorship with the authors. The Data User should realize that misinterpretation of data may occur if used out of context of the original study. While substantial efforts are made to ensure the accuracy of data and associated documentation, complete accuracy of data sets cannot be guaranteed. All data are made available "as is." The Data User should be aware, however, that data are updated periodically and it is the responsibility of the Data User to check for new versions of the data. The data authors and the repository where these data were obtained shall not be liable for damages resulting from any use or misinterpretation of the data. Thank you.

Keywords

By Thesaurus:
LTER Controlled Vocabulary: land cover, conservation, disturbance, ecology, human disturbance

Methods and Protocols

These methods, instrumentation and/or protocols apply to all data in this dataset:

Methods and protocols used in the collection of this data package
Description:

Script 1: Reflectance data were obtained from Sentinel 2 Level 2A imagery (atmospherically corrected surface reflectance) to create a thematic land cover map of the study area. The images were selected from the Copernicus image repository (ESA) and were acquired from January 1, 2018, to December 31, 2020, at a spatial resolution of 20 meters. The QA60 layer was used to remove pixels flagged as opaque or cirrus clouds. Sampling yielded 410 images for analysis.

To produce thematic maps, a median image was calculated for the sample of images captured during the selected time period. The Normalized Difference Vegetation Index (NDVI) was later calculated and added to the total bands of the median image.
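
The per-pixel logic of these steps (cloud masking with QA60, median compositing, NDVI) can be illustrated in plain Python. This is a sketch for a single pixel, not the Earth Engine script in this package. It assumes the usual Sentinel 2 convention that QA60 bit 10 flags opaque clouds and bit 11 flags cirrus, and it assumes band B4 as red and B8 as near infrared; the band choice, function names, and reflectance values are all illustrative.

```python
from statistics import median

OPAQUE_CLOUD_BIT = 1 << 10  # QA60 bit 10: opaque clouds
CIRRUS_BIT = 1 << 11        # QA60 bit 11: cirrus clouds


def is_clear(qa60):
    """True when neither cloud flag is set in the QA60 value."""
    return (qa60 & (OPAQUE_CLOUD_BIT | CIRRUS_BIT)) == 0


def median_composite(observations):
    """Median of each band over the clear observations of one pixel.

    observations: list of (bands, qa60) pairs, where bands maps band name to value.
    Returns None when every observation is cloudy.
    """
    clear = [bands for bands, qa60 in observations if is_clear(qa60)]
    if not clear:
        return None
    return {name: median(obs[name] for obs in clear) for name in clear[0]}


def add_ndvi(pixel, red="B4", nir="B8"):
    """Return a copy of the pixel with NDVI = (NIR - Red) / (NIR + Red) added."""
    total = pixel[nir] + pixel[red]
    ndvi = (pixel[nir] - pixel[red]) / total if total else 0.0
    return {**pixel, "NDVI": ndvi}


if __name__ == "__main__":
    # Made-up time series for one pixel; two observations are cloud-flagged.
    series = [
        ({"B4": 1000, "B8": 5000}, 0),
        ({"B4": 1200, "B8": 5400}, 0),
        ({"B4": 9000, "B8": 9500}, OPAQUE_CLOUD_BIT),
        ({"B4": 800, "B8": 4000}, 0),
        ({"B4": 5000, "B8": 5000}, CIRRUS_BIT),
    ]
    print(add_ndvi(median_composite(series)))
```

Because the median is taken band by band, bright cloud-contaminated values that escape the mask have less influence than they would on a mean.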

Description:

Script 2: For the supervised classification, the ground truth sites were 36 transects laid out in the field at 18 homes, which were selected because their landscape characteristics cover those found in the study area. At these sites, training areas were established based on vegetation height measurements taken using the pin sampling method. The Random Forest algorithm was implemented as the classification technique.

To assess the accuracy of the classified map, a total of 97,753 verification pixels were obtained. To validate the thematic map, data collected in the field and thematic maps produced by MapBiomas (https://chaco.mapbiomas.org/) were used for visual interpretation. Subsequently, a confusion matrix was constructed to identify confusion between categories, calculate the precision, and determine the Kappa coefficient, which together define the quality of the map.
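
As an illustration of how these quality measures follow from a confusion matrix, the sketch below computes the overall precision (overall accuracy) and Cohen's kappa in plain Python. It assumes rows are reference classes and columns are mapped classes. The two-class matrix is invented for the example and is not the matrix produced by Script 2.

```python
def overall_accuracy(matrix):
    """Share of verification pixels on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Expected agreement if reference and mapped labels were independent.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Invented 2 x 2 matrix (rows: reference class, columns: mapped class).
    toy_matrix = [
        [45, 5],
        [10, 40],
    ]
    print(overall_accuracy(toy_matrix), cohen_kappa(toy_matrix))
```

Kappa is lower than the overall accuracy whenever some agreement would be expected by chance, which is why both values are usually reported together.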

People and Organizations

Publishers:
Organization:Environmental Data Initiative
Email Address:
info@edirepository.org
Web Address:
https://edirepository.org
Id:https://ror.org/0330j0z60
Creators:
Individual: Federico Gastón Fiad
Organization:Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET)
Address:
Vélez Sarsfield 299,
Córdoba, Córdoba 5000 Argentina
Phone:
03512533998 (voice)
Email Address:
federico.fiad@mi.unc.edu.ar
Id:https://orcid.org/0000-0003-2173-4305
Individual: Juan Ariel Insaurralde
Organization:Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET)
Address:
Vélez Sarsfield 1611,
Córdoba, Córdoba 5000 Argentina
Email Address:
insaurralde.iibyt@gmail.com
Id:https://orcid.org/0000-0002-1674-4421
Individual: Miriam Cardozo
Organization:Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET)
Address:
Vélez Sarsfield 299,
Córdoba, Córdoba 5000 Argentina
Email Address:
cardozo.miri@gmail.com
Id:https://orcid.org/0000-0002-4339-974X
Individual: Claudia Susana Rodríguez
Organization:Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET)
Address:
Vélez Sarsfield 299,
Córdoba, Córdoba 5000 Argentina
Email Address:
claudia.rodriguez@unc.edu.ar
Id:https://orcid.org/0000-0002-0096-995X
Individual: David Eladio Gorla
Organization:Instituto de diversidad y ecología animal (IDEA/CONICET)
Address:
Rondeau 798,
Córdoba, Córdoba 5000 Argentina
Email Address:
david.gorla@conicet.gov.ar
Id:https://orcid.org/0000-0002-9323-9365
Contacts:
Individual: Federico Gastón Fiad
Organization:Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET)
Address:
Vélez Sarsfield 299,
Córdoba, Córdoba 5000 Argentina
Phone:
03512533998 (voice)
Email Address:
federico.fiad@mi.unc.edu.ar
Id:https://orcid.org/0000-0003-2173-4305

Temporal, Geographic and Taxonomic Coverage

Temporal, Geographic and/or Taxonomic information that applies to all data in this dataset:

Time Period
Begin:
2018-01-01
End:
2020-12-31
Geographic Region:
Description:The study was conducted in 14 rural communities in the Cruz del Eje and Ischilín departments, in the northwest of Córdoba province, central Argentina. This area belongs to the Dry Chaco Ecoregion. The region's topography comprises rolling hills that descend towards the Salinas Grandes, with elevations ranging from 450 to 200 meters above sea level. The area does not have any visible water bodies. The basin is characterized by Aridisol and Haplustoll soils with saline-alkaline properties, which result from low precipitation, high evaporation rates, and highly porous colluvial materials. The region is semi-arid, and water resources are critically scarce throughout the year. On average, the region gets 400 mm of rainfall in the east and 480 mm in the west. The annual temperature is about 18 °C, with a thermal amplitude of 14 °C and 244 frost-free days. Rainfall ranges from 550 mm to the west and 650 mm to the east, following a seasonal monsoon distribution. The native vegetation is open forest of Aspidosperma quebracho-blanco. The trees in this region typically grow to a height of 6-8 meters. Notable species include black carob (Prosopis flexuosa), mistol (Ziziphus mistol), tintitaco (P. torquata), and a high occurrence of cardón (Stetsonia coryne). The shrub layer covers 40-70% of the ground and reaches a height of up to 4 meters. This layer is dominated by Mimozyganthus carinatus (lata), Larrea divaricata (jarilla), Acacia furcatispina (garabato), and Cercidium australe (pitch). During the early and late 1900s, the natural flora of the region underwent significant changes in appearance and composition.
Currently, the primary land use involves extensive grazing of livestock, primarily cattle, sheep, and goats, using low-intensity management methods, and some poultry farming. Notable agricultural activities in the region involve horticulture and fruit production, alongside cultivation of various plants, such as aromatic, medicinal, and condiment plants, garlic, olives, cotton, and oregano. Additionally, soybean, corn, and wheat production are prominent.
Bounding Coordinates:
Northern:  -30.3318
Southern:  -30.412
Western:  -64.974
Eastern:  -64.9494

Project

Parent Project Information:

Title:Invasion of rural homes due to active dispersal of Triatominae
Personnel:
Individual: David Eladio Gorla
Organization:Instituto de diversidad y ecología animal (IDEA/CONICET)
Address:
Rondeau 798,
Córdoba, Córdoba 5000 Argentina
Email Address:
david.gorla@conicet.gov.ar
Id:https://orcid.org/0000-0002-9323-9365
Role:Director
Individual: Liliana Beatríz Crocco
Organization:Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET)
Address:
Vélez Sarsfield 299,
Córdoba, Córdoba 5000 Argentina
Email Address:
liliana.crocco@unc.edu.ar
Id:https://orcid.org/0000-0002-1277-8157
Role:Co-Director
Individual: Federico Gastón Fiad
Organization:Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET)
Address:
Vélez Sarsfield 299,
Córdoba, Córdoba 5000 Argentina
Phone:
03512533998 (voice)
Email Address:
federico.fiad@mi.unc.edu.ar
Id:https://orcid.org/0000-0003-2173-4305
Role:Doctoral fellow
Individual: Miriam Cardozo
Organization:Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET)
Address:
Vélez Sarsfield 299,
Córdoba, Córdoba 5000 Argentina
Email Address:
cardozo.miri@gmail.com
Id:https://orcid.org/0000-0002-4339-974X
Role:Associated fellow
Individual: Claudia Susana Rodríguez
Organization:Instituto de investigaciones biológicas y tecnológicas (IIBYT/CONICET)
Address:
Vélez Sarsfield 299,
Córdoba, Córdoba 5000 Argentina
Email Address:
claudia.rodriguez@unc.edu.ar
Id:https://orcid.org/0000-0002-0096-995X
Role:Associated fellow
Abstract:

The northwestern region of the province of Córdoba, located in the southernmost part of the Argentine Gran Chaco, has historically been endemic for Chagas disease and is currently at moderate risk of vector transmission of Trypanosoma cruzi because vector control is inadequate and has low coverage. In recent decades, as a result of the expansion of the agricultural frontier, the region has undergone significant changes in land use and land cover. The establishment of large and medium-sized livestock companies has increased the deforestation rate in the Chaco forest, resulting in significant disruptions to the local wildlife. These disruptions could have an unquantifiable impact on triatomines that are associated with Chaco mammals and birds. Across various regions and vectors, environmental changes lead to an increase in the rate of contact between humans and vectors, resulting in a higher risk of vector-borne pathogen transmission. In this situation, where there is no information system for monitoring and controlling Triatoma infestans populations and the conditions of wild triatomine populations are changing quickly, this study aims to assess how often actively dispersing triatomines reach homes in rural areas in the northwest and west of Córdoba province.

Additional Award Information:
Funder:FONDO PARA LA INVESTIGACION CIENT Y TECNOLOGICA (FONCYT)
Number:PICT 2016-2527
Title: AGENCIA NACIONAL DE PROMOCION CIENT Y TECNOLOGICA

Maintenance

Maintenance:
Description:

The maintenance of the data table as well as the supervised classification is subject to updates to the Google Earth Engine software.

Frequency:asNeeded
Other Metadata

Additional Metadata

EML Editor:ezEML (release 2023.11.29)

EDI is a collaboration between the University of New Mexico and the University of Wisconsin – Madison, Center for Limnology.