Eric-Valyu/Test-Prompt

Name: Eric-Valyu/Test-Prompt
Creator: Eric-Valyu
Published: 2024-07-26 12:45:02
License: No description provided

Source: Hugging Face
Updated: 2024-07-26
Indexed: 2024-12-14

Download link:

https://hf-mirror.com/datasets/Eric-Valyu/Test-Prompt

Description:

MAP-CC is an open-source Chinese pretraining dataset of 800 billion tokens, intended to give the natural language processing (NLP) community high-quality Chinese pretraining data. It is intended for academic research only, and its data undergoes strict compliance checks to ensure integrity and compliance. Use of the dataset is governed by the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0), which permits sharing for non-commercial purposes but prohibits modifications and derivative works.

Provided by:

Eric-Valyu
