CohereForAI/include-base-44
Source: Hugging Face. Updated: 2025-03-05. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/CohereForAI/include-base-44
Description:
This multilingual dataset is designed for text generation and multiple-choice tasks. It covers 44 languages: Albanian, Arabic, Armenian, Azerbaijani, Belarusian, Bengali, Basque, Bulgarian, Turkish, Croatian, Dutch (with Dutch-Flemish listed separately), Persian, Spanish, Estonian, Finnish, French, German, Greek, Georgian, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Kazakh, Korean, Lithuanian, Malayalam, Malay, Nepali, Polish, Portuguese, Russian, Tamil, Tagalog, Telugu, Ukrainian, Urdu, Uzbek, Vietnamese, Chinese, Serbian, and Macedonian. The dataset is licensed under Apache 2.0, and its listed size category is 100K to 1M. Each language has its own configuration with the following features: language, country, domain, subject, regional feature, level, question, choices, and answer. For each configuration the dataset also reports test-split details: number of bytes, number of examples, download size, and dataset size.
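To make the feature list in the description concrete, here is a minimal sketch of how one record could be turned into a multiple-choice prompt. The key names (`question`, `choices`, `answer`), the assumption that `answer` is a zero-based index into `choices`, and the sample record itself are all illustrative guesses, not taken from the dataset; check the actual schema before relying on them.

```python
from string import ascii_uppercase

# Invented sample record. Keys mirror the features named in the description,
# but the exact key names and value types are assumptions.
sample = {
    "language": "Spanish",
    "country": "Spain",
    "subject": "Chemistry",
    "question": "¿Cuál es el símbolo químico del oro?",
    "choices": ["Ag", "Au", "Fe", "Pb"],
    "answer": 1,  # assumed: zero-based index into "choices"
}


def format_prompt(record):
    """Render the question and its lettered options as a plain-text prompt."""
    lines = [record["question"]]
    for letter, choice in zip(ascii_uppercase, record["choices"]):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)


def answer_letter(record):
    """Map the assumed integer answer index to its option letter."""
    return ascii_uppercase[record["answer"]]


if __name__ == "__main__":
    print(format_prompt(sample))
    print("Gold label:", answer_letter(sample))
```

If the real records store the answer as a letter or as the option text rather than an index, only `answer_letter` would need to change.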
Provider:
CohereForAI



