
mFollowIR-parquet

ModelScope (魔搭社区). Updated 2025-10-03; indexed 2025-09-13.
Download link:
https://modelscope.cn/datasets/jhu-clsp/mFollowIR-parquet
Description:
# mFollowIR-parquet

This is a Parquet version of the mFollowIR dataset that can be loaded directly with `load_dataset()`. The original dataset can be found at [jhu-clsp/mFollowIR](https://huggingface.co/datasets/jhu-clsp/mFollowIR).

## Dataset Structure

The dataset provides the following configurations for each language (`fas`, `rus`, `zho`), where `[lang]` stands for one of those three codes:

### Configurations

- `qrels_og_[lang]`: Original relevance judgments (test split)
- `qrels_changed_[lang]`: Modified relevance judgments (test split)
- `corpus_[lang]`: Document collection
- `queries_[lang]`: Query set with instructions
- `top_ranked_[lang]`: Top-ranked documents

## Usage

```python
from datasets import load_dataset

# Load a specific configuration (any other config name works the same way)
dataset = load_dataset("jhu-clsp/mFollowIR-parquet", "queries_fas")

# Load multiple configurations: the config name must be a single string,
# so call load_dataset() once per configuration
datasets_by_config = {
    name: load_dataset("jhu-clsp/mFollowIR-parquet", name)
    for name in ["queries_fas", "corpus_fas"]
}
```

## Citation

```bibtex
@article{weller2024mfollowir,
  title="mFollowIR: a Multilingual Benchmark for Instruction Following in Information Retrieval",
  author="Weller, Orion and Chang, Benjamin and Yang, Eugene and Yarmohammadi, Mahsa and Barham, Sam and MacAvaney, Sean and Cohan, Arman and Soldaini, Luca and Van Durme, Benjamin and Lawrie, Dawn",
  journal="arXiv preprint TODO",
  year="2024"
}
```
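The `[type]_[lang]` naming scheme listed under Configurations can be enumerated programmatically. The following is a minimal sketch, not part of the dataset's own tooling: the helper names `config_names` and `load_language` and the optional `loader` parameter are hypothetical, introduced here only for illustration. It assumes every configuration type exists for every language, as the list above states.

```python
from typing import Callable, Dict, List, Optional, Sequence

REPO_ID = "jhu-clsp/mFollowIR-parquet"
LANGS = ("fas", "rus", "zho")
# Configuration prefixes as listed in the dataset card.
KINDS = ("qrels_og", "qrels_changed", "corpus", "queries", "top_ranked")


def config_names(langs: Sequence[str] = LANGS,
                 kinds: Sequence[str] = KINDS) -> List[str]:
    """Build every '<kind>_<lang>' configuration name, grouped by language."""
    return [f"{kind}_{lang}" for lang in langs for kind in kinds]


def load_language(lang: str,
                  loader: Optional[Callable[[str, str], object]] = None
                  ) -> Dict[str, object]:
    """Load all configurations for one language, keyed by configuration prefix.

    `loader` is a hypothetical hook for swapping in a stub; by default the
    real `datasets.load_dataset` is used, which downloads data.
    """
    if lang not in LANGS:
        raise ValueError(f"unknown language code: {lang!r}")
    if loader is None:
        from datasets import load_dataset  # imported lazily; needs network access
        loader = load_dataset
    return {kind: loader(REPO_ID, f"{kind}_{lang}") for kind in KINDS}
```

Calling `load_language("zho")` would then return a dictionary with the keys `qrels_og`, `qrels_changed`, `corpus`, `queries`, and `top_ranked`, each holding whatever `load_dataset()` returns for that configuration.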

Provider:
maas
Created:
2025-09-10