danielrosehill/Prompt-Separation

Source: Hugging Face. Updated 2026-04-27. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/danielrosehill/Prompt-Separation
Description:
The Prompt-Separation dataset is intended primarily for text classification and text generation tasks. It contains voice-typed podcast prompt transcripts decomposed into structured fields: discrete prompts (asks), a list of context chunks, and free-form host notes. The dataset supports training a small model that, given a single voice-typed message, recovers the structured fields an AI host would consume, separating three questions: what is the user actually asking, what is the surrounding context, and how should the response be shaped? The data comes from the My Weird Prompts podcast production pipeline, with each row representing one episode's raw user message. The dataset also distinguishes human-annotated labels from AI-extrapolated ones and documents its schema and labeling methodology.
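
The decomposition described above can be pictured as a simple record type. The following is a minimal sketch only: the class name, the field names (`transcript`, `prompts`, `context`, `host_notes`, `label_source`) and the helper `to_training_pair` are hypothetical illustrations, not the dataset's published schema, which should be checked on the dataset card.

```python
import json
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SeparatedPrompt:
    """Hypothetical layout for one row; field names are assumptions."""

    transcript: str  # raw voice-typed user message for one episode
    prompts: List[str] = field(default_factory=list)  # discrete asks
    context: List[str] = field(default_factory=list)  # context chunks
    host_notes: str = ""  # free-form notes on how to shape the response
    label_source: str = "human"  # assumed values: "human" or "ai"


def to_training_pair(rec: SeparatedPrompt) -> Tuple[str, str]:
    """Turn a record into (input text, JSON target) for a small model."""
    target = json.dumps(
        {
            "prompts": rec.prompts,
            "context": rec.context,
            "host_notes": rec.host_notes,
        },
        ensure_ascii=False,
    )
    return rec.transcript, target
```

Keeping the target as a single JSON string is one possible design choice: it lets a small text generation model emit all three fields in one pass, and `label_source` can be used to filter or weight human-annotated rows separately from AI-extrapolated ones.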
Provider:
danielrosehill