hon9kon9ize/yue-alpaca
Source: Hugging Face · Updated 2024-04-20 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/hon9kon9ize/yue-alpaca
Resource description:
---
language: yue
license: cc-by-nc-4.0
size_categories:
- 1K<n<10K
tags:
- sft
- alpaca
dataset_info:
  features:
  - name: instruction
    dtype: string
  - name: input
    dtype: string
  - name: output
    dtype: string
  splits:
  - name: train
    num_bytes: 6906174
    num_examples: 18649
  download_size: 4606648
  dataset_size: 6906174
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
# Cantonese Alpaca (廣東話草泥馬)
## Dataset Card for Cantonese Alpaca

- Repository: [hon9kon9ize/yue-alpaca](https://github.com/hon9kon9ize/yue-alpaca)
## Dataset Description
This dataset contains Cantonese instruction-following data generated by Gemini Pro from [Stanford's Alpaca](https://github.com/tatsu-lab/stanford_alpaca) prompts, intended for fine-tuning LLMs.
Attention: This dataset was generated by Gemini Pro and has not been rigorously verified. It may contain errors, so keep this in mind when using it.
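
Each record has three string fields (`instruction`, `input`, `output`), so it has to be rendered into a single text before supervised fine-tuning. The sketch below shows one way to do that. It assumes the English prompt wording from the original Stanford Alpaca repository is acceptable; this card does not prescribe a template, and the helper name `build_training_text` and the sample record are hypothetical.

```python
# Assumption: these templates reuse the Stanford Alpaca prompt wording.
# The dataset card itself does not specify any template.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_training_text(record: dict) -> str:
    """Render one record (instruction, input, output) as a single SFT string."""
    # Treat a missing, None, or whitespace-only input as "no input".
    extra_input = (record.get("input") or "").strip()
    template = PROMPT_WITH_INPUT if extra_input else PROMPT_NO_INPUT
    prompt = template.format(instruction=record["instruction"], input=extra_input)
    return prompt + record["output"]


# Hypothetical record for illustration only; not taken from the dataset.
sample = {"instruction": "用廣東話打個招呼。", "input": "", "output": "你好呀!"}
print(build_training_text(sample))
```

Because the outputs are unverified model generations, it may be worth spot-checking a sample of rendered texts before training on them.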
## Licensing Information
The dataset is available under the [Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/legalcode) license.
## Citation Information
```
@misc{alpaca,
author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto },
title = {Stanford Alpaca: An Instruction-following LLaMA model},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
```
Provider:
hon9kon9ize
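Summary of original information
Cantonese Alpaca dataset
Dataset description
This dataset contains Cantonese instruction-following data generated by Gemini Pro from Stanford's Alpaca prompts, for fine-tuning large language models (LLMs).
Note: This dataset was generated by Gemini Pro and has not been rigorously verified, so it may contain errors. Keep this in mind when using it.
Dataset information
Features
- instruction: string
- input: string
- output: string
Splits
- train:
  - Size in bytes: 6906174
  - Number of examples: 18649
Size
- Download size: 4606648 bytes
- Dataset size: 6906174 bytes
Configuration
- default:
  - Data files:
    - Split: train
    - Path: data/train-*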
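
The schema and configuration above can be checked in code. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository id `hon9kon9ize/yue-alpaca` from the page title is reachable; the helper names `is_valid_record` and `load_train_split` are hypothetical.

```python
# The three string features listed in the dataset information above.
EXPECTED_FEATURES = ("instruction", "input", "output")


def is_valid_record(record: dict) -> bool:
    """Return True if every expected feature is present and is a string."""
    return all(isinstance(record.get(name), str) for name in EXPECTED_FEATURES)


def load_train_split():
    """Load the single `train` split of the default configuration.

    Assumption: requires network access and the `datasets` package, which is
    imported lazily so the validation helper works without it.
    """
    from datasets import load_dataset

    return load_dataset("hon9kon9ize/yue-alpaca", split="train")


# Example usage (not run here because it downloads the data):
#   train = load_train_split()
#   bad = [i for i, rec in enumerate(train) if not is_valid_record(rec)]
```

Collecting the indices of invalid records, rather than failing on the first one, makes it easier to inspect how many rows deviate from the declared schema.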
Licensing information
The dataset is available under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
Citation information
@misc{alpaca, author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto}, title = {Stanford Alpaca: An Instruction-following LLaMA model}, year = {2023}, publisher = {GitHub}, journal = {GitHub repository}, howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}}, }



