
jayshah5696/alpaca-small-gujarati

Source: Hugging Face. Updated 2023-12-25; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/jayshah5696/alpaca-small-gujarati
Resource overview:
---
license: cc-by-nc-4.0
---

Original data source - [https://huggingface.co/datasets/tatsu-lab/alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca)

Used Google Translate API to translate the dataset into Gujarati.

### Data Instances

An example of "train" looks as follows:

```json
{
    "instruction": "Identify the odd one out.",
    "input": "Twitter, Instagram, Telegram",
    "output": "Telegram",
    "text": "Below is an instruction that describes a task...",
    "gujarati_instruction": "વિષમને ઓળખો.",
    "gujarati_input": "ટ્વિટર, ઇન્સ્ટાગ્રામ, ટેલિગ્રામ",
    "gujarati_output": "ટેલિગ્રામ"
}
```

### Data Fields

The data fields are as follows:

* `instruction`: describes the task the model should perform. Each of the 52K instructions is unique.
* `input`: optional context or input for the task. For example, when the instruction is "Summarize the following article", the input is the article. Around 40% of the examples have an input.
* `output`: the answer to the instruction as generated by `text-davinci-003`.
* `text`: the `instruction`, `input` and `output` formatted with the [prompt template](https://github.com/tatsu-lab/stanford_alpaca#data-release) used by the authors for fine-tuning their models.
* `gujarati_instruction`: Gujarati translation of the instruction
* `gujarati_input`: Gujarati translation of the input
* `gujarati_output`: Gujarati translation of the output

### Data Splits

|               | train |
|---------------|------:|
| alpaca        |    88 |
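The card lists an English `text` field but no Gujarati counterpart, so a fine-tuning prompt in Gujarati would have to be assembled from the three `gujarati_*` fields. The following is a minimal sketch of one way to do that, not something the dataset author describes. It assumes the English scaffold wording from the Stanford Alpaca prompt template is kept as-is and only the field values are swapped; the function name `build_gujarati_text` is hypothetical.

```python
from typing import Mapping

# Scaffold wording follows the Stanford Alpaca prompt template.
# Assumption: the English scaffold is kept and only the field values are Gujarati.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n{output}"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n{output}"
)


def build_gujarati_text(record: Mapping[str, str]) -> str:
    """Format the gujarati_* fields of one record into a single prompt string.

    Hypothetical helper: the dataset itself does not ship such a field.
    """
    fields = {
        "instruction": record["gujarati_instruction"],
        # `input` is optional in Alpaca, so treat a missing or empty value as absent.
        "input": record.get("gujarati_input") or "",
        "output": record["gujarati_output"],
    }
    template = PROMPT_WITH_INPUT if fields["input"].strip() else PROMPT_NO_INPUT
    # str.format ignores the unused `input` key when the no-input template is chosen.
    return template.format(**fields)


# The example record shown in the dataset card (Gujarati fields only).
example = {
    "gujarati_instruction": "વિષમને ઓળખો.",
    "gujarati_input": "ટ્વિટર, ઇન્સ્ટાગ્રામ, ટેલિગ્રામ",
    "gujarati_output": "ટેલિગ્રામ",
}
text = build_gujarati_text(example)
print(text)
```

With the Hugging Face `datasets` library installed, the same function could presumably be applied to every row after `load_dataset("jayshah5696/alpaca-small-gujarati", split="train")`. Whether an English or a translated scaffold works better for Gujarati fine-tuning is left open here.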
Provider:
jayshah5696