Artur-B/SpatioTemporal-News-Corpus
Source: Hugging Face | Updated: 2025-08-27 | Indexed: 2025-10-25
Download link:
https://hf-mirror.com/datasets/Artur-B/SpatioTemporal-News-Corpus
Description:
The SpatioTemporal News Corpus contains approximately 1.2 million English-language news headlines and articles from major outlets in the United States, United Kingdom, Canada, and Australia. Each entry is annotated with spatial context (country of origin) and temporal context (date of publication). The dataset is designed for training spatiotemporally aware sentence embeddings, specifically our Space-Time-MiniLM-v0 model. It covers January 2017 through December 2021 (60 months).
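The 60-month figure can be checked with a few lines of arithmetic. The sketch below also maps a publication date to a 0-based month index within the covered period, which is one plausible way to turn the temporal annotation into a discrete feature. The names `START`, `END`, `NUM_MONTHS`, and `month_index` are hypothetical and chosen for illustration only; this listing does not say how the dataset or the Space-Time-MiniLM-v0 model actually encodes time.

```python
from datetime import date

# Coverage stated in the description: January 2017 through December 2021.
START = date(2017, 1, 1)
END = date(2021, 12, 31)

# Number of calendar months in the covered span, counting both end months.
NUM_MONTHS = (END.year - START.year) * 12 + (END.month - START.month) + 1


def month_index(published: date) -> int:
    """Map a publication date to a 0-based month index within the span.

    Hypothetical helper: the dataset's real temporal encoding is not
    specified in this listing.
    """
    if not (START <= published <= END):
        raise ValueError(f"{published} is outside the covered period")
    return (published.year - START.year) * 12 + (published.month - START.month)


if __name__ == "__main__":
    print(NUM_MONTHS)                        # months in the covered span
    print(month_index(date(2017, 1, 15)))    # first covered month
    print(month_index(date(2021, 12, 31)))   # last covered month
```

Under these assumptions, the span evaluates to 60 months, with indices running from 0 (January 2017) to 59 (December 2021). Dates outside the covered period raise `ValueError` rather than being silently clamped.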
Provider:
Artur-B



