typhoon-ai/typhoon-s-sovereign-capability-dataset

Source: Hugging Face. Updated 2026-01-28. Indexed 2026-02-07.
Download link:
https://hf-mirror.com/datasets/typhoon-ai/typhoon-s-sovereign-capability-dataset
Resource description:
The Typhoon-S Instruct Post-Training dataset targets text-generation tasks in Thai and English and comprises two main subsets: NitiBench (legal domain) and MIRAGE (general domain). NitiBench includes multiple training and test set files for reinforcement learning, pretraining, and supervised fine-tuning. MIRAGE likewise contains training and test set files. The data is drawn from multiple publicly available source datasets and is used to train and evaluate Thai language models in the Typhoon-S project.
Provider:
typhoon-ai