
jhu-clsp/core17-instructions-mteb

Source: Hugging Face. Updated 2024-11-05; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/jhu-clsp/core17-instructions-mteb
Resource description:
---
configs:
- config_name: corpus
  data_files:
  - path: corpus/corpus-*
    split: corpus
- config_name: queries
  data_files:
  - path: queries/queries-*
    split: queries
- config_name: instruction
  data_files:
  - path: instruction/instruction-*
    split: instruction
- config_name: default
  data_files:
  - path: data/default-*
    split: test
- config_name: qrel_diff
  data_files:
  - path: qrel_diff/qrel_diff-*
    split: qrel_diff
- config_name: top_ranked
  data_files:
  - path: top_ranked/top_ranked-*
    split: top_ranked
dataset_info:
- config_name: corpus
  features:
  - dtype: string
    name: _id
  - dtype: string
    name: title
  - dtype: string
    name: text
  splits:
  - name: corpus
    num_examples: 19899
- config_name: queries
  features:
  - dtype: string
    name: _id
  - dtype: string
    name: text
  splits:
  - name: queries
    num_examples: 40
- config_name: instruction
  features:
  - dtype: string
    name: query-id
  - dtype: string
    name: instruction
  splits:
  - name: instruction
    num_examples: 40
- config_name: default
  features:
  - dtype: string
    name: query-id
  - dtype: string
    name: corpus-id
  - dtype: float64
    name: score
  splits:
  - name: test
    num_examples: 9480
- config_name: qrel_diff
  features:
  - dtype: string
    name: query-id
  - list: string
    name: corpus-ids
  splits:
  - name: qrel_diff
    num_examples: 20
- config_name: top_ranked
  features:
  - dtype: string
    name: query-id
  - list: string
    name: corpus-ids
  splits:
  - name: top_ranked
    num_examples: 40
language:
- en
multilinguality:
- monolingual
tags:
- text-retrieval
- instruction-retrieval
task_categories:
- text-retrieval
task_ids:
- document-retrieval
---

# core17-instructions-mteb

This is a new version of the core17-instructions dataset modified to fit the new MTEB format.

1. Restructured queries to include both original and changed versions
2. Separated instructions into a dedicated configuration
3. Reorganized qrels into default (original) and qrel_diff configurations

## Dataset Structure

The dataset contains the following configurations:

- corpus: Original corpus documents
- queries: Queries with both original and changed versions
- instruction: Instructions for both original and changed queries
- default: Original relevance judgments
- qrel_diff: Changes in relevance judgments
- top_ranked: Top ranked documents for each query
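The configurations listed above could be loaded and combined roughly as follows. This is a minimal sketch, not an official loader: the helper names (`load_config`, `qrels_to_dict`, `attach_instructions`) are hypothetical, the toy rows use made-up ids, and joining `queries._id` to `instruction.query-id` is an assumption based on the field names in the card rather than something the card states. The config and split names are copied from the YAML header.

```python
from collections import defaultdict

REPO_ID = "jhu-clsp/core17-instructions-mteb"

# Config name -> split name, as declared in the dataset card's YAML header.
CONFIG_SPLITS = {
    "corpus": "corpus",
    "queries": "queries",
    "instruction": "instruction",
    "default": "test",
    "qrel_diff": "qrel_diff",
    "top_ranked": "top_ranked",
}


def load_config(name):
    """Load one configuration from the Hub (needs `datasets` and network access)."""
    # Imported lazily so the pure-Python helpers below also work offline.
    from datasets import load_dataset
    return load_dataset(REPO_ID, name, split=CONFIG_SPLITS[name])


def qrels_to_dict(rows):
    """Group (query-id, corpus-id, score) rows into {query_id: {corpus_id: score}}."""
    qrels = defaultdict(dict)
    for row in rows:
        qrels[row["query-id"]][row["corpus-id"]] = row["score"]
    return dict(qrels)


def attach_instructions(query_rows, instruction_rows):
    """Pair each query text with its instruction.

    Assumption: `_id` in `queries` matches `query-id` in `instruction`.
    Queries without a matching instruction get None.
    """
    by_id = {r["query-id"]: r["instruction"] for r in instruction_rows}
    return {q["_id"]: (q["text"], by_id.get(q["_id"])) for q in query_rows}


if __name__ == "__main__":
    # Toy rows with made-up ids, shaped like the features in the card.
    toy_qrels = [
        {"query-id": "q1", "corpus-id": "d1", "score": 1.0},
        {"query-id": "q1", "corpus-id": "d2", "score": 0.0},
    ]
    print(qrels_to_dict(toy_qrels))
```

With network access, `qrels_to_dict(load_config("default"))` would apply the same grouping to the real relevance judgments.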
Provided by:
jhu-clsp