Arch223/MMLU_Test_Run_first_three_subjects
Source: Hugging Face | Updated: 2025-04-06 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/Arch223/MMLU_Test_Run_first_three_subjects
Description:
The dataset consists of multiple configurations, each with different feature fields. It includes fields like document ID, text content, filename, metadata (e.g., file size), text summary, summarization model, text chunks (including chunk ID and text), multi-hop text chunks, chunk information metrics, and chunking model. Specific configurations are as follows:
chunked configuration: Includes document ID, text content, filename, metadata, raw text summary, text summary, summarization model, text chunks with IDs and text, multi-hop text chunks, chunk information metrics, and chunking model.
ingested configuration: Includes document ID, text content, filename, and metadata.
lighteval configuration: Includes question, ground truth answer, question category, type, estimated difficulty, citations, document ID, chunk IDs, question generating model, chunks, and document.
multi_hop_questions configuration: Includes document ID, source chunk IDs, question, self-answer, estimated difficulty, self-assessed question type, generating model, thought process, citations, and raw response.
single_shot_questions configuration: Includes chunk ID, document ID, question, self-answer, estimated difficulty, self-assessed question type, generating model, thought process, raw response, and citations.
summarized configuration: Includes document ID, text content, filename, metadata, raw text summary, text summary, and summarization model.
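The configuration list above can be sketched as a small lookup table, which makes it easy to see which configurations share a field. This is a minimal sketch, not the dataset's actual schema: the field labels are paraphrased from the description and the real column names may differ. The helpers `configs_with` and `load_config` are hypothetical names introduced for illustration. `load_config` assumes the Hugging Face `datasets` library is installed and that the repository exposes these configuration names.

```python
# Field labels are paraphrased from the description above; they are NOT
# guaranteed to match the dataset's real column names.
INGESTED = {"document ID", "text content", "filename", "metadata"}
SUMMARIZED = INGESTED | {"raw text summary", "text summary", "summarization model"}
CHUNKED = SUMMARIZED | {
    "text chunks", "multi-hop text chunks",
    "chunk information metrics", "chunking model",
}
QUESTION_COMMON = {
    "document ID", "question", "self-answer", "estimated difficulty",
    "self-assessed question type", "generating model", "thought process",
    "citations", "raw response",
}

CONFIG_FIELDS = {
    "ingested": INGESTED,
    "summarized": SUMMARIZED,
    "chunked": CHUNKED,
    "lighteval": {
        "question", "ground truth answer", "question category", "type",
        "estimated difficulty", "citations", "document ID", "chunk IDs",
        "question generating model", "chunks", "document",
    },
    "multi_hop_questions": QUESTION_COMMON | {"source chunk IDs"},
    "single_shot_questions": QUESTION_COMMON | {"chunk ID"},
}


def configs_with(field):
    """Return the sorted names of configurations that list the given field."""
    return sorted(name for name, fields in CONFIG_FIELDS.items() if field in fields)


def load_config(config_name, repo_id="Arch223/MMLU_Test_Run_first_three_subjects"):
    """Load one configuration from the Hub (hypothetical helper, needs network)."""
    if config_name not in CONFIG_FIELDS:
        raise ValueError(f"unknown configuration: {config_name}")
    # Imported lazily so the lookup table works without the library installed.
    from datasets import load_dataset
    return load_dataset(repo_id, name=config_name)
```

For example, `configs_with("thought process")` returns only the two question-generation configurations, while `configs_with("document ID")` returns all six.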
Provider:
Arch223



