
LSX-UniWue/LLaMmlein-Dataset-wo-HPLT

Source: Hugging Face. Updated 2025-12-04; indexed 2026-01-03.
Download link:
https://hf-mirror.com/datasets/LSX-UniWue/LLaMmlein-Dataset-wo-HPLT
Description:
---
task_categories:
- text-generation
language:
- de
---

This dataset is a strict subset of the [LLaMmlein-Dataset](https://huggingface.co/datasets/LSX-UniWue/LLaMmlein-Dataset) which in turn is a strict subset of the [RedPajama V2 dataset](https://huggingface.co/datasets/togethercomputer/RedPajama-Data-V2). Therefore, it retains all licenses from RedPajama V2.

This dataset has been deduplicated against the deu_Latn shard of the [HPLT3.0 dataset](https://huggingface.co/datasets/HPLT/HPLT3.0).

More details in our [preprint](https://arxiv.org/abs/2411.11171)!

[Data Take Down](https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/)
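
To illustrate what deduplicating one corpus against another can look like, here is a minimal sketch. It assumes exact matching on a hash of whitespace- and case-normalized text; the actual procedure used for this dataset is described in the preprint and may well differ (for example, fuzzy or paragraph-level matching). The function names `fingerprint` and `dedup_against` and the toy documents are hypothetical and chosen only for illustration.

```python
import hashlib
from typing import Iterable, Iterator


def fingerprint(text: str) -> str:
    """Hash a lowercased, whitespace-collapsed copy of the text."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedup_against(candidates: Iterable[str], reference: Iterable[str]) -> Iterator[str]:
    """Yield candidates whose fingerprint does not occur in the reference corpus."""
    # Assumption: the reference fingerprints fit in memory as a set.
    seen = {fingerprint(doc) for doc in reference}
    for doc in candidates:
        if fingerprint(doc) not in seen:
            yield doc


# Hypothetical toy documents standing in for the two corpora.
reference_docs = ["Würzburg liegt am Main.", "Das ist ein Test."]
candidate_docs = ["würzburg  liegt am main.", "Ein neuer Satz."]

# The first candidate matches a reference document after normalization
# and is dropped; the second is kept.
kept = list(dedup_against(candidate_docs, reference_docs))
```

For corpora at web scale, an in-memory set of hashes is unlikely to be sufficient; the sketch only shows the filtering logic on a small example.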
Provided by:
LSX-UniWue