amkyawdev/myanmar-llm-data
Source: Hugging Face. Updated 2026-04-05; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/amkyawdev/myanmar-llm-data
Resource description:
---
title: Myanmar LLM Training Data
emoji: 📚
colorFrom: blue
colorTo: green
sdk: datasets
sdk_version: 2.0.0
---
# Myanmar LLM Training Data
Burmese (Myanmar) conversational dataset for training language models.
## Dataset Description
| Split | Samples |
|-------|---------|
| Train | 1000 |
| Test | 1000 |
| Validation | 1000 |
## Tags Distribution
| Tag | Count (Train) |
|-----|---------------|
| coding | 418 |
| translation | 215 |
| general | 193 |
| greeting | 174 |
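The counts above sum to the 1000 training samples, so each tag's share can be read off directly. A minimal sketch of that arithmetic follows; the numbers are copied from the table, and `tag_shares` is a hypothetical helper name, not part of the dataset or any library.

```python
# Train-split tag counts, copied from the table above.
TAG_COUNTS = {"coding": 418, "translation": 215, "general": 193, "greeting": 174}


def tag_shares(counts):
    """Return each tag's share of the total as a percentage, rounded to one decimal."""
    total = sum(counts.values())
    return {tag: round(100 * n / total, 1) for tag, n in counts.items()}


shares = tag_shares(TAG_COUNTS)
for tag, pct in shares.items():
    print(f"{tag:<12}{pct:>5.1f}%")
```

The split is uneven, with coding making up roughly two fifths of the training set, which may matter if a balanced mix of tasks is wanted.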
## Data Format
```json
{
  "messages": [
    {"role": "system", "content": "..."},
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."}
  ],
  "tags": "coding|translation|general|greeting"
}
```
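A record in this shape can be checked with a few lines of plain Python. The sketch below is illustrative only: `validate_record` is a hypothetical helper, the sample record is made up rather than taken from the dataset, and it assumes that `tags` holds exactly one of the four tag names (the pipe-separated string in the schema is read here as "one of").

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}
ALLOWED_TAGS = {"coding", "translation", "general", "greeting"}


def validate_record(record):
    """Return a list of problems found in one record; an empty list means it looks well-formed."""
    problems = []
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("'messages' must be a non-empty list")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict):
                problems.append(f"message {i} is not an object")
                continue
            role = msg.get("role")
            if not isinstance(role, str) or role not in ALLOWED_ROLES:
                problems.append(f"message {i} has unexpected role {role!r}")
            if not isinstance(msg.get("content"), str):
                problems.append(f"message {i} content is not a string")
    # Assumption: 'tags' is a single tag name, not a combined string or a list.
    tag = record.get("tags")
    if not isinstance(tag, str) or tag not in ALLOWED_TAGS:
        problems.append(f"unexpected tags value {tag!r}")
    return problems


# Made-up sample shaped like the schema above; not a row from the dataset.
sample = json.loads("""
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi, how can I help?"}
  ],
  "tags": "greeting"
}
""")

print(validate_record(sample))
```

If real rows turn out to carry several tags in one string, the tag check would need to split on `|` first.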
## Usage
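```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub; with no split given, all splits are returned.
dataset = load_dataset("amkyawdev/myanmar-llm-data")
print(dataset)  # shows the available splits and their sizes
```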
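For supervised fine-tuning, each conversation is often separated into the context messages and the assistant reply to be predicted. The following is a rough sketch under the assumption that the last message in every record is the assistant turn, as the schema suggests; `split_prompt_and_reply` and the sample row are hypothetical and not part of the dataset or the `datasets` library.

```python
def split_prompt_and_reply(record):
    """Split a record into (context messages, final assistant reply).

    Assumes the last message is the assistant turn, as in the documented schema.
    """
    messages = record["messages"]
    if not messages or messages[-1]["role"] != "assistant":
        raise ValueError("expected the final message to come from the assistant")
    return messages[:-1], messages[-1]["content"]


# Hypothetical row shaped like the schema; not taken from the dataset.
row = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hi, how can I help?"},
    ],
    "tags": "greeting",
}

context, reply = split_prompt_and_reply(row)
print(len(context), "context messages; reply:", reply)
```

On a loaded dataset the same function could be called on each row of a split, for example while iterating over `dataset["train"]`, assuming the split key is named `train`.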
## License
MIT
---
Built for Myanmar NLP by amkyawdev
Provider:
amkyawdev



