CLEVA | Chinese NLP dataset | Multi-task evaluation dataset
CLEVA: Chinese Language Models EVAluation Platform
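Dataset overview
- Name: CLEVA (Chinese Language Models EVAluation Platform)
- Developers: developed jointly by LaVi Lab at The Chinese University of Hong Kong and Shanghai AI Laboratory
- Paper: EMNLP 2023 Demo
- License: CC BY-NC-ND 4.0
- Latest news: the bilingual benchmark C²LEVA was released on 2024-12-06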
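Core features
- Chinese benchmark
  - Covers 31 tasks (11 application assessments + 20 ability evaluations)
  - 370K Chinese test samples in total, 33.98% of which are newly collected
  - Helps mitigate data contamination
- Standardized evaluation methodology
  - A unified data preprocessing pipeline
  - A consistent set of Chinese prompt templates
- Trustworthy leaderboard
  - Evaluates models on new test data
  - Model evaluations are organized regularly
  - Data from past evaluations is available for download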
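Technical implementation
- Integration: CLEVA has been integrated into the HELM evaluation framework
- Evaluation parameters:
  - task: any one of the 31 tasks
  - subtask: the task's subcategory
  - prompt_id: index of the prompt template
  - version: dataset version (currently only v1)
  - data_augmentation: data augmentation strategy (cleva / cleva_robustness / cleva_fairness)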
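The evaluation parameters above can be collected into a single run description. The following is a minimal sketch, not HELM's own API: the helper name build_run_entry, the "cleva:key=value,..." string layout, and the task and subtask names in the usage line are all assumptions made for illustration. Check the HELM documentation for the exact syntax it accepts.

```python
# Allowed augmentation strategies, as listed in the parameter description.
ALLOWED_AUGMENTATIONS = {"cleva", "cleva_robustness", "cleva_fairness"}


def build_run_entry(task, version="v1", subtask=None, prompt_id=0,
                    data_augmentation=None):
    """Join the CLEVA parameters into one 'cleva:key=value,...' string.

    The string layout is an assumption for illustration, not a confirmed
    HELM format.
    """
    params = {"task": task, "version": version}
    if subtask is not None:
        params["subtask"] = subtask
    params["prompt_id"] = prompt_id
    if data_augmentation is not None:
        if data_augmentation not in ALLOWED_AUGMENTATIONS:
            raise ValueError(f"unknown data_augmentation: {data_augmentation}")
        params["data_augmentation"] = data_augmentation
    return "cleva:" + ",".join(f"{key}={value}" for key, value in params.items())


# "example_task" and "example_subtask" are hypothetical placeholder names.
print(build_run_entry("example_task", subtask="example_subtask"))
```

Validating data_augmentation up front makes a typo fail immediately rather than partway through an evaluation run.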
Data access
- Download: bash download_data.sh
- Default version: v1
- Output: a version directory containing the data for each task
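After the download finishes, a quick check of what arrived can be useful. The sketch below assumes the version directory (for example v1) sits under a data root and holds one subdirectory per task. That layout, the helper name list_task_dirs, and the "data" root in the usage line are assumptions, not something the source specifies.

```python
from pathlib import Path


def list_task_dirs(data_root, version="v1"):
    """Return the sorted names of subdirectories under <data_root>/<version>.

    Assumes one subdirectory per task; returns an empty list when the
    version directory does not exist.
    """
    version_dir = Path(data_root) / version
    if not version_dir.is_dir():
        return []
    return sorted(p.name for p in version_dir.iterdir() if p.is_dir())


# "data" is a hypothetical download location; adjust to the actual path.
for name in list_task_dirs("data"):
    print(name)
```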
Citation
@misc{li2023cleva,
  title={CLEVA: Chinese Language Models EVAluation Platform},
  author={Yanyang Li and Jianqiao Zhao and Duo Zheng and Zi-Yuan Hu and Zhi Chen and Xiaohui Su and Yongfeng Huang and Shijia Huang and Dahua Lin and Michael R. Lyu and Liwei Wang},
  year={2023},
  eprint={2308.04813},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
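Notes
- For online evaluation, contact: clevaplat@gmail.com
- For local evaluation, the HELM framework is recommended
- See the HELM documentation for the full parameter reference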
