yhavinga/Leesplank_NL_wikipedia_simplifications_preprocessed_chatml_format
收藏Hugging Face2025-02-21 更新2025-04-12 收录
下载链接:
https://hf-mirror.com/datasets/yhavinga/Leesplank_NL_wikipedia_simplifications_preprocessed_chatml_format
下载链接
链接失效反馈官方服务:
资源简介:
该数据集包含四个字段:指令(instruction)、提示(prompt)、结果(result)和消息列表(messages),其中消息列表包含内容和角色两个子字段。数据集分为训练集、验证集和测试集,共计约268万个示例。训练集大小为2.7GB,验证集大小为0.8GB,测试集大小为0.4GB。提供了默认配置,指定了训练、验证和测试数据文件的路径。
The dataset includes four fields: instruction, prompt, result, and a list of messages which contains content and role as sub-fields. It is divided into training, validation, and test sets, totaling approximately 2.68 million examples. The training set is 2.7GB in size, the validation set is 0.8GB, and the test set is 0.4GB. A default configuration is provided, specifying the file paths for the training, validation, and test data.
提供机构:
yhavinga



