
illuin-conteb/geography

Source: Hugging Face. Updated 2025-05-30; indexed 2025-10-18.
Download link:
https://hf-mirror.com/datasets/illuin-conteb/geography
Description:
The ConTEB Geography dataset is part of ConTEB (Context-aware Text Embedding Benchmark), a benchmark designed to evaluate contextual embedding models. It focuses on geography and is drawn from the Wikipedia pages of major cities around the world. The dataset is designed to elicit contextual information and comprises a curated set of original documents, chunks derived from those documents, and generated queries. In total it contains 530 documents, 2,291 chunks, and 5,283 queries. It is organized into three configurations (documents, queries, and queries-filtered), each with its own features and train-split information.
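As a minimal sketch of how the three configurations named in the description might be loaded, the snippet below uses the Hugging Face `datasets` library. The repository id is taken from the download link; the helper name `load_config` and the `"train"` split name are assumptions for illustration (the description mentions training-set information but does not spell out split names), so check the dataset card before relying on them.

```python
# Configuration names as listed in the dataset description.
CONFIGS = ("documents", "queries", "queries-filtered")

# Repository id taken from the download link on this page.
REPO_ID = "illuin-conteb/geography"


def load_config(name, split="train"):
    """Load one configuration of the dataset (hypothetical helper).

    Assumes the `datasets` package is installed and the Hub is reachable.
    The default split name "train" is an assumption, not confirmed here.
    """
    if name not in CONFIGS:
        raise ValueError(f"unknown configuration {name!r}; expected one of {CONFIGS}")
    # Imported lazily so the name check above works without the dependency.
    from datasets import load_dataset

    return load_dataset(REPO_ID, name, split=split)
```

Calling `load_config("documents")` would then return a `Dataset` whose `column_names` and `num_rows` can be inspected; the column names themselves are not given on this page, so they are best read from the loaded object rather than assumed.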
Provider:
illuin-conteb