Tensoic/gooftagoo
Source: Hugging Face. Updated 2024-03-16. Indexed 2024-06-11.
Download link:
https://hf-mirror.com/datasets/Tensoic/gooftagoo
Resource description:
---
license: apache-2.0
task_categories:
- text-generation
language:
- hi
- en
tags:
- hinglish
- conversation
- hindi
---
## Hindi/Hinglish Conversation Dataset
This repository contains a dataset of conversational text in Hindi and Hinglish (a mix of Hindi and English).
The dataset contains multi-turn conversations on a range of topics, usually revolving around everyday real-life experiences.
A small number of reasoning tasks (specifically CoT-style reasoning and coding) have also been added, with about 1k samples from OpenHermes 2.5.
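
A minimal loading sketch, assuming the Hugging Face `datasets` library is installed and that the repository exposes a `train` split (the card does not list splits or column names, so both are assumptions). `load_conversations` and `peek` are hypothetical helper names used only for illustration:

```python
from itertools import islice
from typing import Any, Iterable

REPO_ID = "Tensoic/gooftagoo"  # repository id taken from the page title


def load_conversations(split: str = "train"):
    """Download the dataset from the Hugging Face Hub.

    Assumption: a "train" split exists; the card does not list splits.
    """
    # Imported lazily so that peek() below works without the library installed.
    from datasets import load_dataset  # pip install datasets

    return load_dataset(REPO_ID, split=split)


def peek(records: Iterable[Any], n: int = 3) -> list:
    """Return the first n records without assuming any column names."""
    return list(islice(records, n))
```

Calling `peek(load_conversations())` should return the first few rows as they are stored, which is a quick way to check the actual schema before writing any preprocessing code.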
## Caution
This dataset was generated, so some content may not be entirely precise or reflect expert consensus.
Users are encouraged to verify information independently for scholarly or critical purposes.
## Author
Adithya Kamath (https://twitter.com/Adi_kmt)
Provided by:
Tensoic
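Summary of original information
Dataset overview
Dataset name
Hindi/Hinglish Conversation Dataset
Dataset contents
- Conversational text, mainly in Hindi and Hinglish (a mix of Hindi and English).
- Multi-turn conversations on topics revolving around everyday real-life experiences.
- A small number of reasoning tasks, about 1,000 samples from OpenHermes 2.5, covering CoT-style reasoning and coding.
Languages
- Hindi (hi)
- English (en)
Task categories
- Text generation
Tags
- Hinglish
- Conversation
- Hindi
License
Apache-2.0
Notes
- The dataset is generated; some content may not be entirely accurate or reflect expert consensus.
- Users are advised to verify information independently for scholarly or critical purposes.
Author
Adithya Kamath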



