PleIAs/gpt-oss20b-samples-dedup
Source: Hugging Face. Updated: 2025-08-09. Indexed: 2025-09-13.
Download link:
https://hf-mirror.com/datasets/PleIAs/gpt-oss20b-samples-dedup
Description:
A simplified, deduplicated dataset containing the unique combinations of the first ten and last ten words. It is derived from https://huggingface.co/datasets/jxm/gpt-oss20b-samples and is intended to reduce the predictability of synthetic data. The number of times each unique combination occurs is recorded in the occurrence_count column.
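The deduplication described above can be sketched roughly as follows. This is only an illustration, not the procedure the dataset authors actually used: it assumes whitespace tokenization, and the names `dedup_key`, `deduplicate`, `first_words` and `last_words` are hypothetical. Only `occurrence_count` comes from the description.

```python
from collections import Counter


def dedup_key(text, n=10):
    """Return (first n words, last n words) of a text as the dedup key.

    Assumption: words are separated by whitespace.
    """
    words = text.split()
    return (" ".join(words[:n]), " ".join(words[-n:]))


def deduplicate(samples, n=10):
    """Collapse samples that share the same first and last n words.

    Returns one row per unique key, with the number of samples that
    mapped to that key stored in `occurrence_count`.
    """
    counts = Counter(dedup_key(sample, n) for sample in samples)
    return [
        {
            "first_words": first,   # hypothetical column name
            "last_words": last,     # hypothetical column name
            "occurrence_count": count,
        }
        for (first, last), count in counts.items()
    ]


if __name__ == "__main__":
    opening = " ".join(f"start{i}" for i in range(10))
    closing = " ".join(f"end{i}" for i in range(10))
    samples = [
        f"{opening} middle one {closing}",
        f"{opening} a different middle {closing}",
        "a short unrelated sample",
    ]
    for row in deduplicate(samples):
        print(row["occurrence_count"], "|", row["first_words"])
```

Under these assumptions, two samples that differ only in the middle collapse into a single row, and a high occurrence_count flags an opening and closing pattern that the generator repeats often.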
Provider:
PleIAs



