Spatial distribution data set of tea plantations with 10 m resolution from 2000 to 2020 in Fujian Province
DataCite Commons | Updated 2025-04-27 | Indexed 2025-04-16
Download link:
https://www.scidb.cn/detail?dataSetId=7f70b85d28194e7da8bca5ca3f1b7731
Resource description:
This dataset maps the spatial distribution of tea plantations in Fujian Province from 2000 to 2020 at a spatial resolution of 10 m, in the projected coordinate system WGS_1984_UTM_Zone_50N. A time node is set every five years, giving tea plantation data for five years: 2000, 2005, 2010, 2015, and 2020. All results are stored in a folder named "FJ_tea_10m". The files in the package are polygon data in shp format, named "tea_date.shp"; for example, the 2020 tea plantation distribution is named "tea_2020.shp".
The processing workflow has five main parts: data preprocessing, feature extraction, feature selection, tea plantation classification, and the interference data removal method used to obtain the time-series dataset:
1. Data preprocessing
Sentinel-1 (S1) data are preprocessed with the Sentinel-1 toolbox, which includes correcting orbit parameters, removing border and thermal noise, and radiometric calibration; the images are then composited. Sentinel-2 (S2) data are processed mainly for cloud removal, yielding 112 cloud-free S2 images that are then composited. Terrain data with a spatial resolution of 30 m are resampled to 10 m using the relevant functions in Google Earth Engine (GEE) and then clipped to the administrative boundary of Fujian Province. Landsat long time-series images from July to October for the years 2000 to 2020 are selected on the GEE cloud platform; the QA bands are used for cloud masking, and cloud-covered data are replaced with data from neighboring months.
2. Feature extraction
The different data sources were analyzed and three types of feature variables were constructed: 26 spectral features, 68 texture features, and 4 terrain features.
3. Feature selection
To obtain more accurate tea plantation extraction results, four experimental schemes were designed: spectral features only; spectral features + texture features; spectral features + texture features + terrain features; and SVM_RFE feature selection. In Scheme 4, the SVM_RFE feature selection algorithm selects the feature variables most important for tea plantation extraction, which avoids the loss of extraction accuracy and efficiency caused by feature redundancy.
4. Tea plantation classification
A support vector machine classifier is used to classify tea plantations, and a confusion matrix is then used to evaluate the accuracy of the four classification schemes. The main reference metrics are producer's accuracy, user's accuracy, overall accuracy, and the Kappa coefficient. The accuracy verification shows that Scheme 4, which applies feature selection, has the highest extraction accuracy. The result is a 10 m resolution thematic map of the spatial distribution of tea plantations in Fujian Province for 2020.
5. Obtaining the time-series dataset with the interference data removal method
Field investigations found that the area of tea plantations in Fujian Province has increased continuously over the past 20 years and that their distribution has kept expanding. The dataset therefore adopts the interference data removal method: the vegetation interference information is used to mask the tea plantation distribution results for 2020 and earlier, yielding in sequence the spatial distribution of tea plantations in 2015, 2010, 2005, and 2000.
The method is implemented as follows. Based on long-term Landsat series satellite data, the LandTrendr algorithm is run on the GEE cloud platform to detect changes in the Landsat time-series images, obtain the time nodes of vegetation disturbance and recovery, and divide the vegetation of Fujian Province into interference and non-interference areas. Interference information is merged at five-year nodes, giving vegetation interference information for 2000 to 2004, 2005 to 2009, 2010 to 2014, and 2015 to 2019. Taking the 2015 tea plantation distribution as an example: based on the prior knowledge that tea plantations have expanded, the vegetation interference information for 2015 to 2019 is overlaid on the 2020 tea plantation thematic map, and patches within the thematic map that overlap the interference information are removed, giving the spatial distribution of tea plantations in 2015. Applying the same overlay analysis and removal to the earlier tea plantation data gives the spatial distribution for 2010, 2005, and 2000.
On-site investigation and verification found that, although tea plantations show a gradual expansion trend, the annual changes are very small, and only a few counties and cities show relatively large changes. The dataset uses the 2020 tea plantation data, at 10 m resolution, as the object to be masked. After masking, the results were imported into Google Earth for verification, which showed that the tea plantation data before 2020 are basically close to 10 m resolution. The 2000 to 2015 data obtained with the interference data removal method can therefore be regarded as having 10 m resolution.
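The backward masking described in step 5 can be illustrated with a minimal sketch. It is not the dataset authors' implementation: the actual workflow overlays shp polygons with LandTrendr-derived interference layers, whereas this toy version represents each map as a set of 10 m grid cells. The function name `backcast_tea_maps`, the `Cell` type, and the sample cells are hypothetical, and the sketch assumes that each earlier map is derived from the next later one (2020 to 2015, 2015 to 2010, and so on).

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: one (row, col) index per 10 m grid cell.
Cell = Tuple[int, int]


def backcast_tea_maps(
    tea_2020: Set[Cell],
    disturbance_by_period: Dict[int, Set[Cell]],
) -> Dict[int, Set[Cell]]:
    """Derive earlier tea maps from the 2020 map by removing disturbed cells.

    disturbance_by_period maps the first year of each five-year period
    (e.g. 2015 for 2015-2019) to the cells flagged as disturbed in it.
    Assumption: a tea cell disturbed during a period was not yet tea
    at the start of that period, so it is dropped from the earlier map.
    """
    maps = {2020: set(tea_2020)}
    current = set(tea_2020)
    # Walk backwards in time: 2015-2019 first, then 2010-2014, and so on.
    for start_year in sorted(disturbance_by_period, reverse=True):
        current = current - disturbance_by_period[start_year]
        maps[start_year] = current
    return maps


# Made-up sample data, for illustration only.
tea_2020 = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)}
disturbance = {
    2015: {(2, 2)},          # disturbed in 2015-2019
    2010: {(1, 1)},          # disturbed in 2010-2014
    2005: set(),             # no disturbance in 2005-2009
    2000: {(0, 1), (5, 5)},  # (5, 5) lies outside the tea map and has no effect
}

maps = backcast_tea_maps(tea_2020, disturbance)
for year in sorted(maps):
    # Each cell covers 10 m x 10 m = 100 square metres.
    print(year, "tea area (m^2):", len(maps[year]) * 100)
```

Because cells are only ever removed, every earlier map is a subset of the later ones, which mirrors the prior knowledge of continuous expansion that the method relies on. A plantation that was abandoned before 2020 cannot be recovered this way.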
Provider:
Science Data Bank
Created:
2023-06-13
Collected summary
Dataset introduction

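Background and challenges
Background overview
This dataset provides the spatial distribution of tea plantations in Fujian Province at five-year intervals from 2000 to 2020 (2000, 2005, 2010, 2015, and 2020), at a spatial resolution of 10 m, in the WGS_1984_UTM_Zone_50N projected coordinate system, as shp polygon data. It was generated by combining multi-source remote sensing data (such as Sentinel-1, Sentinel-2, and Landsat) with a machine learning method (a support vector machine classifier), and it uses the interference data removal method to handle changes across the time series, an approach intended to ensure high accuracy and timeliness. The dataset is suited to tea plantation monitoring, forestry planning, and spatial analysis research.
The content above was collected and summarized by 遇见数据集.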



