wilstrup/fineweb-summaries
Source: Hugging Face. Updated 2025-10-30; indexed 2025-11-15.
Download link:
https://hf-mirror.com/datasets/wilstrup/fineweb-summaries
Description:
FineWeb-Summaries is a dataset consisting of educational web documents paired with AI-generated summaries. It comprises high-quality educational content, with summaries generated by the Gemini 2.5 Flash Lite model, each around 200 words. The dataset is suitable for training and testing text compression models and currently includes a large training split.
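As a rough illustration of the text compression use case, the sketch below measures how much shorter a summary is than its source document, counted in whitespace-separated words. The helper names (`word_count`, `compression_ratio`, `stream_train_split`) are hypothetical, and the repository id is assumed from the download link above. The dataset's column names are not stated in this listing, so the helpers take plain strings rather than guessing at field names.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated tokens, a crude proxy for word count."""
    return len(text.split())


def compression_ratio(document: str, summary: str) -> float:
    """Return summary length divided by document length, both in words."""
    doc_words = word_count(document)
    if doc_words == 0:
        raise ValueError("document is empty")
    return word_count(summary) / doc_words


def stream_train_split(repo_id: str = "wilstrup/fineweb-summaries"):
    """Stream the train split without downloading it in full.

    Assumes the Hugging Face `datasets` package is installed and that the
    repository id matches the download link; both are assumptions here.
    """
    from datasets import load_dataset  # imported lazily: optional dependency

    return load_dataset(repo_id, split="train", streaming=True)
```

To inspect the actual schema before using the helpers, one could run `next(iter(stream_train_split())).keys()` and then pass the relevant document and summary fields to `compression_ratio`.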
Provider:
wilstrup



