China Internet Typhoon Raw Dataset (2019)

National Earth System Science Data Center. Updated 2022-06-10; indexed 2024-03-04.
Download link:
https://www.geodata.cn/data/datadetails.html?dataguid=24190953572457&docId=8302
Resource description:
This is the raw China internet typhoon dataset. It collects text related to typhoons that affected China and its adjacent seas during the 2019 Pacific typhoon season, including Francisco, Lekima, Krosa, and Bailu. The text was gathered from internet portals and platforms such as government websites, news reports, and social media. A focused crawler fetched the target text and recorded basic metadata for each item, such as crawl time and source website. The data were then cleaned by removing duplicate entries and correcting erroneous records to produce the raw internet typhoon dataset.
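
The description mentions removing duplicate entries while keeping crawl time and source website. The listing does not publish the cleaning code or the record schema, so the following is only a minimal sketch of one way such a step could look. The `Record` class, its field names, and the rule of keeping the earliest-crawled copy are all assumptions made for illustration, not the providers' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical record layout; the real dataset schema is not documented here."""
    text: str
    source: str      # assumed field: source website
    crawled_at: str  # assumed field: ISO 8601 crawl timestamp, same format for all rows


def normalize(text: str) -> str:
    """Collapse runs of whitespace so trivially different copies compare equal."""
    return " ".join(text.split())


def deduplicate(records):
    """Keep the earliest-crawled record for each distinct normalized text."""
    earliest = {}
    for rec in records:
        key = normalize(rec.text)
        if not key:
            continue  # drop records whose text is empty after normalization
        kept = earliest.get(key)
        # ISO 8601 strings in one fixed format sort chronologically as plain strings.
        if kept is None or rec.crawled_at < kept.crawled_at:
            earliest[key] = rec
    return sorted(earliest.values(), key=lambda r: r.crawled_at)
```

Exact matching after whitespace normalization only catches verbatim reposts. Near-duplicates, such as the same news item with an edited headline, would need a fuzzier comparison, which this sketch does not attempt.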
Provider:
Nanjing Normal University
Created:
2022-06-10