
vanta-research/spontaneous-observations

Source: Hugging Face. Updated 2026-01-18. Indexed 2026-02-07.
Download link:
https://hf-mirror.com/datasets/vanta-research/spontaneous-observations
Description:
The Spontaneous Observations dataset is a curated collection of 1,429 conversational examples that demonstrate natural, organic observations and thoughtful engagement. It is designed for fine-tuning language models to produce genuine, spontaneous responses rather than formulaic or overly accommodating outputs. Key characteristics are a natural conversational tone, genuine engagement, appropriate pushback, thoughtful depth, and mixed domain coverage (technical, philosophical, and everyday topics). The dataset is in JSONL format and contains user-assistant message pairs. It was created in four steps: seed generation (initial examples written by Claude Opus 4.5), dataset expansion (to the final size, by Mistral Large 3), quality filtering (scored assessment by DeepSeek V3.1), and human review (final approval of every example).
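Since the description only states that the data is JSONL with user-assistant message pairs, the following is a minimal sketch of how such a file could be read. The field names `user` and `assistant` are assumptions made for illustration; the listing does not document the actual schema, so check a real record and adjust the keys before relying on this.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_pairs(
    lines: Iterable[str],
    user_key: str = "user",            # hypothetical field name, not confirmed by the listing
    assistant_key: str = "assistant",  # hypothetical field name, not confirmed by the listing
) -> Iterator[Tuple[str, str]]:
    """Yield (user message, assistant response) pairs from JSONL lines."""
    for line in lines:
        line = line.strip()
        if not line:
            # JSONL files often end with a blank line; skip it.
            continue
        record = json.loads(line)  # one JSON object per line
        yield record[user_key], record[assistant_key]


# Made-up sample records standing in for the real file contents.
sample_lines = [
    '{"user": "Are tabs or spaces worth arguing about?", "assistant": "Mostly no. Consistency matters more."}',
    "",
    '{"user": "Why is the sky blue?", "assistant": "Shorter wavelengths scatter more in air."}',
]

pairs = list(iter_pairs(sample_lines))
for user_msg, assistant_msg in pairs:
    print(f"USER: {user_msg}\nASSISTANT: {assistant_msg}\n")
```

To read a downloaded file instead of the in-memory sample, pass an open file handle, for example `iter_pairs(open(path, encoding="utf-8"))`, where `path` is wherever the JSONL file was saved locally.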
Provider:
vanta-research