OpenViDial

Name: OpenViDial
Creator: Zhejiang University
Published: 2021-05-29 21:15:07
License: not specified

Source: arXiv. Updated 2021-05-29; indexed 2024-06-21.

Download link:

https://github.com/ShannonAI/OpenViDial

Description:

OpenViDial is a large-scale open-domain dialogue dataset created by Zhejiang University and other institutions. It contains 1.1 million dialogue turns, each paired with an image of the visual scene in which it occurs. The content is drawn from movies and TV series, and the dialogue text was extracted automatically from video frames using optical character recognition (OCR). The dataset is intended for developing models that understand and generate dialogue grounded in visual context, with the aim of making dialogue systems more natural and interactive.

Provider:

Zhejiang University

Created:

2020-12-30
