
toksuite/toksuite_pretraining_data

Source: Hugging Face | Updated: 2026-04-05 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/toksuite/toksuite_pretraining_data
Resource description:
---
configs:
- config_name: fas_Arab
  data_files:
  - split: train
    path: "fas_Arab/*.chunk.*.jsonl"
  - split: validation
    path: "fas_Arab/*.val.jsonl"
- config_name: ita_Latn
  data_files:
  - split: train
    path: "ita_Latn/*.chunk.*.jsonl"
  - split: validation
    path: "ita_Latn/*.val.jsonl"
- config_name: tur_Latn
  data_files:
  - split: train
    path: "tur_Latn/*.chunk.*.jsonl"
  - split: validation
    path: "tur_Latn/*.val.jsonl"
- config_name: cmn_Hani
  data_files:
  - split: train
    path: "cmn_Hani/*.chunk.*.jsonl"
  - split: validation
    path: "cmn_Hani/*.val.jsonl"
- config_name: fw_edu
  data_files:
  - split: train
    path: "fw_edu/*chunk.*.jsonl"
---
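The configuration header above can be read as a mapping from config name to split-specific file patterns. The following is a minimal sketch of how one might load a single config with the Hugging Face `datasets` library, assuming the repository is reachable under the ID shown in the title and that `datasets` is installed. The helper names `available_splits` and `load_split` are hypothetical and introduced here for illustration only; the record fields inside the JSONL files are not documented on this page, so the sketch does not assume any.

```python
REPO_ID = "toksuite/toksuite_pretraining_data"

# Config name -> {split: glob pattern}, copied from the YAML header.
# Note that fw_edu declares only a train split.
CONFIGS = {
    "fas_Arab": {"train": "fas_Arab/*.chunk.*.jsonl",
                 "validation": "fas_Arab/*.val.jsonl"},
    "ita_Latn": {"train": "ita_Latn/*.chunk.*.jsonl",
                 "validation": "ita_Latn/*.val.jsonl"},
    "tur_Latn": {"train": "tur_Latn/*.chunk.*.jsonl",
                 "validation": "tur_Latn/*.val.jsonl"},
    "cmn_Hani": {"train": "cmn_Hani/*.chunk.*.jsonl",
                 "validation": "cmn_Hani/*.val.jsonl"},
    "fw_edu": {"train": "fw_edu/*chunk.*.jsonl"},
}


def available_splits(config_name):
    """Return the sorted split names declared for a config (hypothetical helper)."""
    return sorted(CONFIGS[config_name])


def load_split(config_name, split="train", streaming=True):
    """Load one split of one config (hypothetical helper).

    Streaming is the default here on the assumption that a pretraining
    corpus may be too large to download in full.
    """
    if config_name not in CONFIGS:
        raise ValueError(f"unknown config: {config_name!r}")
    if split not in CONFIGS[config_name]:
        raise ValueError(f"config {config_name!r} has no split {split!r}")
    # Imported lazily so the checks above work without the library installed.
    from datasets import load_dataset
    return load_dataset(REPO_ID, config_name, split=split, streaming=streaming)
```

A call such as `load_split("ita_Latn", "validation")` would then return an iterable dataset, while `load_split("fw_edu", "validation")` fails early with a `ValueError` because the header declares no validation files for `fw_edu`.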
Provider:
toksuite