large-traversaal/urdu_datasets

Source: Hugging Face. Updated 2025-11-13. Indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/large-traversaal/urdu_datasets
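
One way to fetch the data behind this link is the Hugging Face `datasets` library. The sketch below rests on several assumptions: `datasets` is installed, the mirror is reached by setting the `HF_ENDPOINT` environment variable, and the Hub repository ID equals the path in the URL. The helper name `load_train_splits` is hypothetical. Configuration names are discovered at runtime rather than hard-coded, because this page lists display names only.

```python
import os

# Assumption: route Hub traffic through the mirror named in the download link.
# HF_ENDPOINT has to be set before huggingface_hub / datasets are imported.
os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")

REPO_ID = "large-traversaal/urdu_datasets"  # taken from the URL path


def load_train_splits(repo_id: str = REPO_ID) -> dict:
    """Return {config_name: train split} for every configuration (hypothetical helper)."""
    # Imported lazily so the endpoint override above is already in place.
    from datasets import get_dataset_config_names, load_dataset

    splits = {}
    for config in get_dataset_config_names(repo_id):
        # Each configuration is described as having a training split only.
        splits[config] = load_dataset(repo_id, config, split="train")
    return splits
```

Calling `load_train_splits()` needs network access and downloads every configuration. Inspecting `.features` on a returned split shows the actual schema.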
Description:
The dataset has six configurations: Urdu Alpaca, Urdu Chat Alpaca, Urdu CommonsenseQA, Urdu GSM8k, Urdu Instruct, and Urdu OpenBookQA. Each configuration consists of a training split whose records hold two kinds of string information: conversation content (content) and role (role). The number of examples and size of each configuration are:

Urdu Alpaca: 28,910 examples, 52,406,517 bytes
Urdu Chat Alpaca: 19,997 examples, 119,128,301 bytes
Urdu CommonsenseQA: 9,741 examples, 2,746,558 bytes
Urdu GSM8k: 7,473 examples, 6,713,749 bytes
Urdu Instruct: 51,686 examples, 15,365,764 bytes
Urdu OpenBookQA: 4,957 examples, 1,424,798 bytes
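
The per-configuration figures in the description can be tallied to get overall totals and a rough average record size. The following is a minimal sketch that uses only the numbers listed on this page. The dictionary keys are the display names from the description, not confirmed configuration identifiers, and `CONFIGS` and `summarize` are hypothetical names.

```python
# (examples, bytes) per configuration, copied from the description.
# Keys are display names, not confirmed Hub configuration IDs.
CONFIGS = {
    "Urdu Alpaca": (28_910, 52_406_517),
    "Urdu Chat Alpaca": (19_997, 119_128_301),
    "Urdu CommonsenseQA": (9_741, 2_746_558),
    "Urdu GSM8k": (7_473, 6_713_749),
    "Urdu Instruct": (51_686, 15_365_764),
    "Urdu OpenBookQA": (4_957, 1_424_798),
}


def summarize(configs):
    """Return (name, examples, bytes, average bytes per example) rows."""
    return [(name, n, size, size / n) for name, (n, size) in configs.items()]


total_examples = sum(n for n, _ in CONFIGS.values())
total_bytes = sum(size for _, size in CONFIGS.values())
largest_by_bytes = max(CONFIGS, key=lambda name: CONFIGS[name][1])

for name, n, size, avg in summarize(CONFIGS):
    print(f"{name:<20} {n:>8,} {size:>13,} {avg:>10.1f}")
print(f"{'Total':<20} {total_examples:>8,} {total_bytes:>13,}")
```

Across the six configurations this comes to 122,764 examples and 197,785,687 bytes, with Urdu Chat Alpaca the largest by size even though Urdu Instruct has the most examples.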
Provider:
large-traversaal