
OpenMed/synthvision-validated-qwen-by-kimi

Hugging Face · Updated 2026-03-23 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/OpenMed/synthvision-validated-qwen-by-kimi
Resource description:
---
license: apache-2.0
task_categories:
- visual-question-answering
tags:
- medical
- synthvision
- openmed
size_categories:
- 10K<n<100K
---

# synthvision-validated-qwen-by-kimi

![SynthVision](synthvision_featured.png)

Qwen 3.5 annotations validated by Kimi K2.5 (93.1% pass rate)

**Records**: 55,359

## About

Cross-validated subset from the [SynthVision pipeline](https://huggingface.co/blog/OpenMed/synthvision). Kimi K2.5 reviewed all 59,476 Qwen 3.5 annotations and confirmed 55,359 as consistent with the source images (93.1% pass rate).

Validation criteria: `consistent == true` AND `confidence >= 0.7`. Records that failed validation were removed, primarily cases where the annotator hallucinated findings not visible in the image.

## Schema

```
id: str                     # unique record ID
image: str                  # relative image path
conversations: list[dict]   # multi-turn ShareGPT format
report: str                 # clinical narrative
structured_findings: dict   # finding_name → value
validation: dict            # {consistent, confidence, reason}
quality_score: float        # composite quality score
```

## Loading

```python
from datasets import load_dataset

ds = load_dataset("OpenMed/synthvision-validated-qwen-by-kimi")
```

## Links

- [SynthVision blog post](https://huggingface.co/blog/OpenMed/synthvision)
- [Source code](https://github.com/openmed-labs/synthvision)
- [All SynthVision artifacts](https://huggingface.co/collections/OpenMed/synthvision-69baac655b557943aa1babd3)
- [OpenMed on Hugging Face](https://huggingface.co/OpenMed)
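The validation rule stated in the About section (`consistent == true` AND `confidence >= 0.7`) and the reported pass rate can be illustrated with a minimal sketch. It assumes each record's `validation` field is a plain dict with the `consistent`, `confidence`, and `reason` keys listed in the schema; the helper name `passes_validation` and the sample records are hypothetical and are not taken from the SynthVision source code.

```python
from typing import Any

MIN_CONFIDENCE = 0.7  # threshold stated in the dataset card


def passes_validation(record: dict[str, Any]) -> bool:
    """Hypothetical helper: apply the card's stated validation criteria."""
    validation = record.get("validation") or {}
    is_consistent = validation.get("consistent") is True
    confidence = float(validation.get("confidence", 0.0))
    return is_consistent and confidence >= MIN_CONFIDENCE


# Made-up records for illustration only; they are not rows from the dataset.
sample_records = [
    {"id": "a", "validation": {"consistent": True, "confidence": 0.92, "reason": "matches image"}},
    {"id": "b", "validation": {"consistent": True, "confidence": 0.55, "reason": "low confidence"}},
    {"id": "c", "validation": {"consistent": False, "confidence": 0.95, "reason": "finding not visible"}},
]

kept_ids = [r["id"] for r in sample_records if passes_validation(r)]
print(kept_ids)  # ['a']

# Pass rate from the counts given in the card: 55,359 kept out of 59,476 reviewed.
pass_rate = 55_359 / 59_476
print(f"{pass_rate:.1%}")  # 93.1%
```

Because the card describes the published subset as already filtered, running this check over the loaded records should leave them unchanged, provided the field layout matches the assumption above.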

Provider:
OpenMed