
USS-Inferprise/Slopasaurus-Training-Slop

Source: Hugging Face. Updated: 2026-04-22. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/USS-Inferprise/Slopasaurus-Training-Slop
Overview:
This dataset comprises 3,059 text samples of short stories engineered to be deliberately dense with 2026-style AI slop. The data is rich in quality slop: writing that is grammatically correct and reads like fine prose but lacks any literary merit. This is distinct from broken slop, the kind generated by models that are over-quantized or have fundamentally failed. The dataset is intended for creative-writing LLM benchmarking, and is also relevant to general research into slop, slop detection systems, and deslopping. The methodology has three steps: creating prompts that encourage slop generation, running inference with a custom merge of the Mistral model, and reverse-engineering a prompt for each slop sample with a Phi model. Each JSON object contains the reverse-engineered sensible instruction, the raw quality slop, the original narrative variables and style constraints, and metadata.
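The record layout described above could be read along the following lines. This is only a sketch: the listing does not give the actual key names or file layout, so every field name here (`instruction`, `slop`, `narrative_variables`, `style_constraints`, `metadata`), the one-object-per-line (JSON Lines) assumption, and the sample record are hypothetical and should be checked against the real files.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Iterable


@dataclass
class SlopRecord:
    # All field names are hypothetical stand-ins for the four parts the
    # description lists; the dataset's real keys may differ.
    instruction: str                      # reverse-engineered sensible instruction
    slop: str                             # raw quality slop text
    narrative_variables: dict[str, Any] = field(default_factory=dict)
    style_constraints: list[str] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)


def parse_records(lines: Iterable[str]) -> list[SlopRecord]:
    """Parse JSON Lines input (assumed layout) into SlopRecord objects."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        obj = json.loads(line)
        records.append(
            SlopRecord(
                instruction=obj["instruction"],
                slop=obj["slop"],
                narrative_variables=obj.get("narrative_variables", {}),
                style_constraints=obj.get("style_constraints", []),
                metadata=obj.get("metadata", {}),
            )
        )
    return records


# Invented sample record, for illustration only (not taken from the dataset).
sample = json.dumps(
    {
        "instruction": "Write a short story about a lighthouse keeper.",
        "slop": "The lighthouse stood as a testament to resilience...",
        "narrative_variables": {"setting": "coast"},
        "style_constraints": ["present tense"],
    }
)
records = parse_records([sample, ""])
```

Keeping the instruction and the slop text as separate fields makes it straightforward to use the pairs either as benchmark prompts or as labeled inputs to a slop detector.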
Provided by:
USS-Inferprise