nguyenthanhasia/sino-xenic-reasoning-gap-dataset
Source: Hugging Face | Updated: 2025-12-12 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/nguyenthanhasia/sino-xenic-reasoning-gap-dataset
Description:
A comprehensive evaluation dataset for testing large language models' understanding of Sino-Xenic linguistic phenomena across Chinese, Japanese, Korean, and Vietnamese. The dataset contains 297 samples covering 11 categories, including Chinese idioms, Chinese characters, Japanese kanji, Sino-Korean words, and Sino-Vietnamese words. Each sample includes a unique identifier, a prompt (question or task), linguistic data, an expected explanation, linguistic phenomena tags, a category, a language, and a task type. The dataset can be loaded via Hugging Face Datasets or from CSV files, and it has passed quality assurance checks.
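Since the description says the dataset can be loaded via Hugging Face Datasets, a minimal sketch of loading it and tallying samples per category might look like the following. The split name `train` and the column names `category` and `language` are assumptions inferred from the field list above, not confirmed by the listing; `load_rows` and `count_by` are hypothetical helper names.

```python
from collections import Counter

# Repository ID as shown in the listing title.
DATASET_ID = "nguyenthanhasia/sino-xenic-reasoning-gap-dataset"


def load_rows(split="train"):
    """Load the dataset from the Hugging Face Hub.

    Requires the `datasets` package and network access.
    The split name "train" is an assumption; check the dataset card.
    """
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)


def count_by(rows, field):
    """Count samples per value of `field` (e.g. an assumed "category" column)."""
    return Counter(row[field] for row in rows)
```

Under those assumptions, `count_by(load_rows(), "category")` would return one count per category, which can be checked against the 297 samples and 11 categories stated in the description.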
Provided by:
nguyenthanhasia



