yentinglin/TaiwanChat
Source: Hugging Face. Updated 2024-05-16; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/yentinglin/TaiwanChat
Resource description:
---
language:
- zh
license: cc-by-nc-4.0
size_categories:
- 100K<n<1M
task_categories:
- conversational
- text-generation
- text2text-generation
pretty_name: Traditional Chinese Instruction-tuning Set
dataset_info:
  features:
  - name: conversations
    list:
    - name: from
      dtype: string
    - name: value
      dtype: string
  - name: id
    dtype: string
  - name: messages
    list:
    - name: content
      dtype: string
    - name: role
      dtype: string
  splits:
  - name: train
    num_bytes: 1252451454.8415947
    num_examples: 485432
  download_size: 677984544
  dataset_size: 1252451454.8415947
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
<img src="https://cdn-uploads.huggingface.co/production/uploads/5df9c78eda6d0311fd3d541f/CmusIT5OlSXvFrbTJ7l-C.png" alt="Taiwan LLM Logo" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
## Performance

## Citation
If you find Taiwan LLM useful in your work, please cite it with:
```
@misc{lin2023taiwan,
title={Taiwan LLM: Bridging the Linguistic Divide with a Culturally Aligned Language Model},
author={Yen-Ting Lin and Yun-Nung Chen},
year={2023},
eprint={2311.17487},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
Provided by:
yentinglin
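Summary of original information
Dataset overview
Basic information
- Language: Chinese (zh)
- License: CC-BY-NC-4.0
- Size category: 100K<n<1M
- Task categories:
  - Conversational
  - Text generation
  - Text-to-text generation
- Pretty name: Traditional Chinese Instruction-tuning Set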
Dataset details
- Features:
  - conversations (list):
    - from: string
    - value: string
  - id: string
  - messages (list):
    - content: string
    - role: string
- Splits:
  - train:
    - Bytes: 1252451454.8415947
    - Examples: 485432
- Download size: 677984544 bytes
- Dataset size: 1252451454.8415947 bytes
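The size figures above are easier to read as per-example averages. The short sketch below uses only the numbers reported on the card; the variable names are illustrative. It should show roughly 2.6 kB per example and a download a little over half the size of the prepared dataset.

```python
# Figures copied from the dataset card.
NUM_EXAMPLES = 485_432
DATASET_SIZE_BYTES = 1_252_451_454.8415947  # reported as a float on the card
DOWNLOAD_SIZE_BYTES = 677_984_544

# Average size of one example in the prepared dataset.
avg_bytes_per_example = DATASET_SIZE_BYTES / NUM_EXAMPLES

# Fraction of the prepared dataset size that has to be downloaded.
download_ratio = DOWNLOAD_SIZE_BYTES / DATASET_SIZE_BYTES

print(f"average size per example: {avg_bytes_per_example / 1000:.2f} kB")
print(f"download size / dataset size: {download_ratio:.1%}")
```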
Configuration
- Default config:
  - Data files:
    - Split: train
    - Path: data/train-*
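The card lists two parallel fields per example: `conversations` (`from`/`value` pairs) and `messages` (`role`/`content` pairs). The sketch below shows one way to load the train split with the Hugging Face `datasets` library and to normalize a `conversations` list into the `messages` layout. The speaker labels in `ROLE_MAP` (`human`, `gpt`, `system`) are an assumption borrowed from the common ShareGPT convention, not something the card states, and both helper names are hypothetical.

```python
from typing import Dict, List

# Assumed speaker labels (ShareGPT-style); the dataset card does not list them.
ROLE_MAP: Dict[str, str] = {"human": "user", "gpt": "assistant", "system": "system"}


def conversations_to_messages(conversations: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Convert [{'from': ..., 'value': ...}] turns to [{'role': ..., 'content': ...}]."""
    messages = []
    for turn in conversations:
        # Fall back to the raw label when it is not in the assumed mapping.
        role = ROLE_MAP.get(turn["from"], turn["from"])
        messages.append({"role": role, "content": turn["value"]})
    return messages


def load_train_split():
    """Download the train split (about 678 MB per the card); needs `pip install datasets`."""
    from datasets import load_dataset  # imported lazily so the helper above works offline

    return load_dataset("yentinglin/TaiwanChat", split="train")
```

After `ds = load_train_split()`, comparing `conversations_to_messages(ds[0]["conversations"])` with `ds[0]["messages"]` is a quick way to check whether the assumed label mapping actually holds for this dataset.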



