
Waterhackweek2019 Data Access and Time-series Statistics Cyberseminar

Source: doi.org | Updated: 2019-04-01 | Indexed: 2025-03-26
Download link:
https://doi.org/10.4211/hs.9985b3cb38c94cee872b28f6dcdef739
Resource description:
Water data come in many formats, are distributed by many different sources, and use different spatial representations such as points, polygons, and grids. How do we find and explore the data we need for our specific research or application? This seminar will present common challenges and strategies for finding and accessing relevant datasets, focusing on time series data from sites commonly represented as fixed geographical points. This type of data may come from automated monitoring stations such as river gauges and weather stations, from repeated in-person field observations and samples, or from model output and processed data products. We will present and explore useful data catalogs, including the CUAHSI HIS catalog accessible via HydroClient, CUAHSI HydroShare, the EarthCube Data Discovery Studio, Google Dataset Search, and agency-specific catalogs. We will also discuss programmatic data access approaches and tools in Python, particularly the ulmo data access package, touching on the role of community standards for data formats and data access protocols. Once we have accessed the datasets we are interested in, the next steps are typically exploratory, focusing on visualization and statistical summaries. This seminar will illustrate useful approaches and Python libraries for processing and exploring time series data, with an emphasis on the distinctive needs posed by temporal data. Core Python packages used include Pandas, GeoPandas, Matplotlib, and the geospatial visualization tools introduced at the last seminar. The approaches presented can be applied to other data types that can be summarized as single time series, such as averages over a watershed or data extracts from a single cell in a gridded dataset, which is the topic for the next seminar. The cyberseminar recording is available on YouTube at https://youtu.be/uQXuS1AB2M0
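
As a rough illustration of the exploratory step described above, the following minimal sketch computes statistical summaries of a single-site time series with Pandas. It is not taken from the seminar materials: the data are synthetic, and the names `make_synthetic_series`, `monthly_summary`, and `discharge` are hypothetical, chosen only for this example. It shows two operations that are common with temporal data, resampling to a coarser time step and smoothing with a time-based rolling window, both of which tolerate gaps in the record.

```python
import numpy as np
import pandas as pd


def make_synthetic_series() -> pd.Series:
    """Build a synthetic daily series (assumption: stands in for gauge data)."""
    index = pd.date_range("2019-01-01", "2019-03-31", freq="D")
    rng = np.random.default_rng(42)
    values = 10.0 + np.sin(np.arange(len(index)) / 10.0) + rng.normal(0, 0.2, len(index))
    series = pd.Series(values, index=index, name="discharge")
    # Simulate a three-day sensor outage, a common feature of station records.
    series.iloc[10:13] = np.nan
    return series


def monthly_summary(series: pd.Series) -> pd.DataFrame:
    """Count of valid observations, mean, min, and max per calendar month."""
    return series.resample("MS").agg(["count", "mean", "min", "max"])


series = make_synthetic_series()
monthly = monthly_summary(series)

# 7-day rolling mean; require at least 4 valid days per window so that
# windows dominated by missing values yield NaN instead of a misleading mean.
rolling = series.rolling("7D", min_periods=4).mean()

print(monthly)
print(rolling.tail())
```

The `count` column makes the gap visible in the monthly summary, which is worth checking before interpreting the other statistics. With real data, the synthetic series would be replaced by values retrieved from one of the catalogs or data access tools mentioned above.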

Provider: doi.org