
yourbench-testing/yourbench-basic-test

Source: Hugging Face. Updated: 2025-12-14. Indexed: 2025-12-20.
Download link:
https://hf-mirror.com/datasets/yourbench-testing/yourbench-basic-test
Description:
The Yourbench Basic Test dataset is a domain-specific benchmark generated from a document collection with the open-source YourBench framework (v0.6.0). It is published as several configurations, each holding the output of one pipeline stage: ingested (raw documents, ingested and normalized), summarized (document summaries), chunked (documents split into chunks), single_shot_questions (single-shot, i.e. single-hop, question generation), multi_hop_questions (multi-hop question generation), and prepared_lighteval (data prepared for evaluation). The generation pipeline runs these stages in order: document ingestion, summarization, chunking, then single-shot and multi-hop question generation. The result is intended as a test benchmark for natural language processing tasks.
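Since each configuration corresponds to one pipeline stage, a single stage can be loaded on its own. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the six configuration names listed in the description are published under this repository ID. The helper `load_config` is a hypothetical name introduced here for illustration, and no split names are assumed, since the listing does not state them.

```python
"""Sketch: load one YourBench Basic Test configuration from the Hugging Face Hub."""

REPO_ID = "yourbench-testing/yourbench-basic-test"

# Configuration names as listed in the description, in pipeline order.
# Assumption: each name is published as a config of this repository.
CONFIGS = (
    "ingested",
    "summarized",
    "chunked",
    "single_shot_questions",
    "multi_hop_questions",
    "prepared_lighteval",
)


def load_config(name: str):
    """Return the requested configuration with whatever splits it provides."""
    if name not in CONFIGS:
        raise ValueError(f"unknown config {name!r}; expected one of {CONFIGS}")
    # Imported lazily so the constants above can be used without `datasets` installed.
    from datasets import load_dataset

    # The second positional argument of load_dataset selects the configuration.
    return load_dataset(REPO_ID, name)
```

Calling `load_config("single_shot_questions")` needs network access and should return the splits of that configuration; printing the result shows the split names and columns, which is a reasonable first check before relying on any particular field.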
Provided by:
yourbench-testing