
Fredithefish/Nemotron-CC-HQ-20B

Source: Hugging Face. Updated 2026-03-25. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Fredithefish/Nemotron-CC-HQ-20B
Resource description:
---
task_categories:
- text-generation
pretty_name: Coo
---

# Nemotron-CC-HQ-20B

This dataset contains approximately 20B tokens of Nemotron-CC-HQ, drawn as randomly sampled slices from crawls in the range `CC-MAIN-2013-20-part-00012` to `CC-MAIN-2019-04-part-00007`.

For more information about Nemotron-CC, see the [paper by NVIDIA](https://arxiv.org/abs/2412.02595).

<footer style="margin-top: 40px; padding-top: 10px; border-top: 1px solid #ccc; font-size: 0.9em; color: #666;">
<p>
<strong>Disclaimer:</strong> Derived from Nemotron-CC (Common Crawl). No ownership of underlying content is claimed. Data may be subject to third-party rights. Use at your own risk and in compliance with applicable laws and Common Crawl Terms of Use.
</p>
</footer>
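The slice labels quoted in the description follow the Common Crawl naming pattern `CC-MAIN-<year>-<week>` with a part suffix. As a minimal sketch, the snippet below parses such a label and checks whether it falls between the two endpoints given above. The names `SliceId`, `parse_slice`, and `in_card_range` are hypothetical helpers, and ordering slices by (year, week, part) is an assumption; the card does not say whether the dataset exposes these labels per record or how the range is ordered.

```python
import re
from typing import NamedTuple

# Pattern for labels like "CC-MAIN-2013-20-part-00012":
# crawl year, crawl week, then a five-digit part number.
_SLICE_RE = re.compile(r"^CC-MAIN-(\d{4})-(\d{2})-part-(\d{5})$")


class SliceId(NamedTuple):
    """Hypothetical container for one parsed slice label."""
    year: int
    week: int
    part: int


def parse_slice(label: str) -> SliceId:
    """Split a slice label into integer (year, week, part) fields."""
    m = _SLICE_RE.match(label)
    if m is None:
        raise ValueError(f"unrecognized slice label: {label!r}")
    return SliceId(*(int(g) for g in m.groups()))


# Range endpoints quoted from the dataset card.
FIRST = parse_slice("CC-MAIN-2013-20-part-00012")
LAST = parse_slice("CC-MAIN-2019-04-part-00007")


def in_card_range(label: str) -> bool:
    """True if the label sorts between the card's endpoints (inclusive).

    Assumes slices are ordered by (year, week, part); tuple comparison
    on SliceId implements exactly that ordering.
    """
    return FIRST <= parse_slice(label) <= LAST
```

For example, `in_card_range("CC-MAIN-2016-30-part-00000")` evaluates to `True`, while a label just past the upper endpoint, such as `"CC-MAIN-2019-04-part-00008"`, evaluates to `False`.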
Provider:
Fredithefish