BigQuery Public Datasets | Public Datasets | Data Analysis Datasets
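- Google first launches the BigQuery service as part of its cloud platform, aiming to provide fast analysis of large-scale datasets.
- The BigQuery Public Datasets project launches, and Google begins offering a range of public datasets free of charge to researchers and developers.
- BigQuery Public Datasets grows markedly in scale and variety, covering domains such as climate, finance, and transportation.
- Google announces BigQuery support for real-time data analysis, further strengthening its competitiveness in big-data processing.
- BigQuery Public Datasets continues to expand, adding more high-quality datasets to support a wider range of research and application scenarios.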
- 1. BigQuery Public Datasets: A Treasure Trove for Data Scientists. Google AI · 2018
- 2. Exploring BigQuery Public Datasets for COVID-19 Research. Google Cloud · 2020
- 3. BigQuery Public Datasets: A Comprehensive Analysis of Usage and Impact. Stanford University · 2021
- 4. Leveraging BigQuery Public Datasets for Financial Market Analysis. University of Chicago · 2022
- 5. BigQuery Public Datasets: A Review of Recent Advances and Future Directions. Massachusetts Institute of Technology · 2023
Asteroids by the Minor Planet Center
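Contains orbital and observational data for all known asteroids. The data come from the Minor Planet Center and are provided in Fortran (.DAT) and JSON formats. The dataset is 81 MB compressed and 450 MB uncompressed, holds about 750,000 records, and is updated daily.
Indexed from GitHub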
Tropicos
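Tropicos is a global plant name database containing names, classification information, distribution data, images, and references for more than 1.3 million plants. The database is maintained by the Missouri Botanical Garden and aims to give botanists, ecologists, and researchers in related fields comprehensive plant information.
Indexed from www.tropicos.org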
HazyDet
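HazyDet is a large-scale dataset created by the Army Engineering University of PLA and other institutions, built specifically for object detection from a drone's viewpoint in hazy scenes. It contains 383,000 real-world instances, collected both from naturally hazy environments and from normal scenes with synthetically added haze that simulates adverse weather. The construction process combines depth estimation with an atmospheric scattering model to keep the data realistic and diverse. HazyDet is mainly used for drone object detection in adverse weather and aims to improve drone perception in complex environments.
Indexed from arXiv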
VoxBox
VoxBox is a large-scale speech corpus built from a diverse set of open-source datasets and used to train text-to-speech (TTS) systems.
Indexed from GitHub
Infrared Thermal Image Dataset of High Voltage Electrical Power Equipment under Different Operating Conditions
Recognizing high-voltage power equipment in electrical substations is fundamental to effective condition monitoring of the electrical power system. It enables proper identification and analysis of anomalies within the equipment, especially while it is in operation. The results of such an investigation can be applied to effective real-time measurement, control, and protection schemes in the network. Visual images are of limited use for this purpose in poor lighting, whereas infrared (IR) images of the equipment are invariant to poor illumination. Hence, we acquired thermographic images of high-voltage power equipment with the portable professional FLIR C5 infrared camera at different times of day and under different load conditions. The dataset contains 5 categories of high-voltage equipment common to most air-insulated electrical power substations at the 132 kV level, namely circuit breakers, power transformers, surge arresters, disconnectors, and wave traps. The number of IR images in each class is: circuit breakers 203, power transformers 178, surge arresters 181, disconnectors 180, and wave traps 153. The IR images are 640 x 480 pixel RGB images captured with the rainbow color palette and sorted into labeled folders. The color bar in each IR image identifies the thermal range used during its acquisition. The dataset can be used for new research on deep-learning computer vision models, especially object recognition, identification, and fault classification or detection algorithms. The thermal profiles of the equipment in the dataset could be applied to the detection of hotspots and other related anomalies.
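Because the images are described as sorted into labeled folders, a downloaded copy could be checked against the stated per-class counts with a few lines of Python. This is only a minimal sketch: the folder names, the file extensions, and the helper names `count_images_per_class` and `total_images` are assumptions for illustration, not part of the dataset's documentation.

```python
from pathlib import Path

# Per-class image counts as stated in the dataset description.
STATED_COUNTS = {
    "circuit breakers": 203,
    "power transformers": 178,
    "surge arresters": 181,
    "disconnectors": 180,
    "wave traps": 153,
}

# Assumption: the description does not state the image file type.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Count image files in each immediate subfolder of `root`.

    Each subfolder is treated as one equipment class (hypothetical layout).
    """
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def total_images(counts):
    """Sum the per-class counts."""
    return sum(counts.values())


# The stated per-class figures add up to 895 images in total.
print(total_images(STATED_COUNTS))  # 895
```

Comparing `count_images_per_class("path/to/dataset")` with `STATED_COUNTS` would show whether a local copy is complete, once the real folder names are known and mapped to the class names above.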
Indexed from DataCite Commons