jayshah5696/alpaca-small-gujarati
Source: Hugging Face. Updated 2023-12-25; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/jayshah5696/alpaca-small-gujarati
Resource description:
---
license: cc-by-nc-4.0
---
Original data source: [https://huggingface.co/datasets/tatsu-lab/alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca).
The dataset was translated into Gujarati with the Google Translate API.
### Data Instances
An example of "train" looks as follows:
```json
{
    "instruction": "Identify the odd one out.",
    "input": "Twitter, Instagram, Telegram",
    "output": "Telegram",
    "text": "Below is an instruction that describes a task...",
    "gujarati_instruction": "વિષમને ઓળખો.",
    "gujarati_input": "ટ્વિટર, ઇન્સ્ટાગ્રામ, ટેલિગ્રામ",
    "gujarati_output": "ટેલિગ્રામ"
}
```
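As a minimal sketch of how a record like the one above might be loaded and checked, the following assumes the Hugging Face `datasets` library is installed and that the repository ID matches the page title. The helper names `load_train` and `missing_fields` are hypothetical and not part of the dataset.

```python
# Field names taken from the example record on this card.
EXPECTED_FIELDS = (
    "instruction", "input", "output", "text",
    "gujarati_instruction", "gujarati_input", "gujarati_output",
)


def load_train():
    """Download the train split from the Hugging Face Hub (needs network access)."""
    # Imported lazily so the rest of this sketch runs without the library installed.
    from datasets import load_dataset
    return load_dataset("jayshah5696/alpaca-small-gujarati", split="train")


def missing_fields(record):
    """Return the expected field names that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]


# The example record from this card, used here instead of a live download.
sample = {
    "instruction": "Identify the odd one out.",
    "input": "Twitter, Instagram, Telegram",
    "output": "Telegram",
    "text": "Below is an instruction that describes a task...",
    "gujarati_instruction": "વિષમને ઓળખો.",
    "gujarati_input": "ટ્વિટર, ઇન્સ્ટાગ્રામ, ટેલિગ્રામ",
    "gujarati_output": "ટેલિગ્રામ",
}

print(missing_fields(sample))  # → []
```

Calling `load_train()` should return the full train split; iterating over it and passing each row to `missing_fields` is one way to confirm the schema before fine-tuning.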
### Data Fields
The data fields are as follows:
* `instruction`: describes the task the model should perform. In the original Alpaca dataset, each of the 52K instructions is unique.
* `input`: optional context or input for the task. For example, when the instruction is "Summarize the following article", the input is the article. In the original Alpaca dataset, around 40% of the examples have an input.
* `output`: the answer to the instruction as generated by `text-davinci-003`.
* `text`: the `instruction`, `input` and `output` formatted with the [prompt template](https://github.com/tatsu-lab/stanford_alpaca#data-release) used by the authors for fine-tuning their models.
* `gujarati_instruction`: Gujarati translation of `instruction`.
* `gujarati_input`: Gujarati translation of `input`.
* `gujarati_output`: Gujarati translation of `output`.
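The card provides a formatted `text` field only for the English columns. The sketch below shows one way the Gujarati fields could be assembled into a comparable training string. The section headers follow an Alpaca-style layout, which is an assumption here; the English preamble is omitted, and the exact template is the one in the linked repository. The function name `build_gujarati_prompt` is hypothetical.

```python
def build_gujarati_prompt(record):
    """Join the Gujarati fields into one string, skipping the input section when it is empty."""
    instruction = record["gujarati_instruction"].strip()
    # `gujarati_input` is optional, so treat a missing or None value as empty.
    context = (record.get("gujarati_input") or "").strip()
    output = record["gujarati_output"].strip()

    # Assumed Alpaca-style section headers; not confirmed by this card.
    parts = [f"### Instruction:\n{instruction}"]
    if context:
        parts.append(f"### Input:\n{context}")
    parts.append(f"### Response:\n{output}")
    return "\n\n".join(parts)


# Gujarati fields of the example record from this card.
record = {
    "gujarati_instruction": "વિષમને ઓળખો.",
    "gujarati_input": "ટ્વિટર, ઇન્સ્ટાગ્રામ, ટેલિગ્રામ",
    "gujarati_output": "ટેલિગ્રામ",
}

print(build_gujarati_prompt(record))
```

Dropping the input section for records without context mirrors the fact that `input` is optional, so the model is not trained on an empty placeholder.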
### Data Splits
| | train |
|---------------|------:|
| alpaca | 88 |
Provider:
jayshah5696