happyme531/classical-chinese-poetry-benchmark-70

Name: happyme531/classical-chinese-poetry-benchmark-70
Creator: happyme531
Published: 2024-12-06 16:46:58
License: No description available

Source: Hugging Face | Updated: 2024-12-06 | Indexed: 2024-12-14

Download link:

https://hf-mirror.com/datasets/happyme531/classical-chinese-poetry-benchmark-70

Description:

This benchmark evaluates the ability of large language models to understand and generate Classical Chinese poetry. It includes a diverse test dataset and a complete evaluation framework for systematically assessing and comparing models in this domain. The dataset (`poetry_benchmark.jsonl`) contains 70 test samples that vary along three dimensions: question type (couplet completion, poetry line filling, poetry identification, hint-based completion, first-last word completion), difficulty level (easy, medium, hard), and dynasty (from Pre-Qin to the modern era, including the Tang, Song, Yuan, Ming, and Qing dynasties). The framework reports overall accuracy, performance by question type, performance by difficulty level, and mastery of poetry from different dynasties.
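The per-dimension breakdown described above can be sketched in a few lines of Python. This is only an illustration, not the benchmark's own evaluation code: the listing does not give the schema of `poetry_benchmark.jsonl`, so the field names `type`, `difficulty`, `dynasty`, and `correct` are assumptions, and the sample records are invented to make the example run.

```python
import json
from collections import defaultdict


def load_jsonl(path):
    """Read one JSON object per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def accuracy_by(records, key):
    """Return {group value: fraction of records in that group marked correct}.

    `key` and the "correct" flag are hypothetical field names; adjust them
    to match the real schema of the graded results.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for rec in records:
        group = rec.get(key, "unknown")
        totals[group] += 1
        hits[group] += bool(rec.get("correct"))
    return {group: hits[group] / totals[group] for group in totals}


def overall_accuracy(records):
    """Fraction of all records marked correct (0.0 for an empty list)."""
    if not records:
        return 0.0
    return sum(bool(rec.get("correct")) for rec in records) / len(records)


# Invented graded records, standing in for real model results.
sample = [
    {"type": "couplet_completion", "difficulty": "easy", "dynasty": "Tang", "correct": True},
    {"type": "line_filling", "difficulty": "easy", "dynasty": "Song", "correct": False},
    {"type": "line_filling", "difficulty": "hard", "dynasty": "Tang", "correct": True},
    {"type": "identification", "difficulty": "hard", "dynasty": "Qing", "correct": False},
]

print("overall:", overall_accuracy(sample))
for dimension in ("type", "difficulty", "dynasty"):
    print(dimension, accuracy_by(sample, dimension))
```

With real data, `load_jsonl` would supply the records once each model answer has been graded. With only 70 samples, some groups (a single dynasty, for instance) may contain very few items, so per-group accuracies are best read alongside their counts.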

Provided by:

happyme531
