StorySalon

Name: StorySalon
Creator: HaoningWu
Published: 2026-05-24 06:30:55
License: No description provided

OpenDataLab | Updated 2026-05-24 | Indexed 2024-05-09

Download link:

https://opendatalab.org.cn/HaoningWu/StorySalon

Resource description:

To address the shortage of data for open-ended visual story generation, we collected a large number of paired image-text sequences from multiple sources (online videos and 6 open-source e-book libraries), built a comprehensive data processing pipeline, and constructed StorySalon, a large-scale dataset with diverse characters, storylines, and artistic styles.

Diverse data sources: We collected visual stories rich in characters, storylines, and artistic styles from videos (download URLs are provided) and open-source e-books (released under the CC-BY 4.0 license).

Data processing pipeline: The pipeline comprises visual frame extraction, duplicate frame filtering, abnormal frame detection, vision-language alignment, visual caption generation, text detection, and post-processing, and converts the raw metadata into a form suitable for model training. As more metadata is collected, the same pipeline can be applied to it with little effort, allowing the StorySalon dataset to be scaled up further.

Dataset advantages: Previous datasets contain fewer than 10 characters and have limited vocabulary and story length. StorySalon has a much larger vocabulary and includes thousands of characters across hundreds of categories, making it better suited to open-ended tasks.
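The description names duplicate frame filtering as one pipeline step but does not say how it is done. As a rough illustration only, the sketch below drops consecutive near-duplicate frames by comparing a simple average hash. The hashing method, the `max_distance` threshold, and all function names are assumptions made for this example, not the StorySalon authors' implementation. Frames are modeled as small grayscale grids of equal size; real video frames would first need to be decoded and downscaled.

```python
from typing import List, Optional, Sequence

# A frame is a grid of grayscale pixel values (0-255); all frames share one size.
Frame = Sequence[Sequence[int]]


def average_hash(frame: Frame) -> List[int]:
    """Return one bit per pixel: 1 if the pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming(a: List[int], b: List[int]) -> int:
    """Count the positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def drop_duplicate_frames(frames: List[Frame], max_distance: int = 1) -> List[int]:
    """Return indices of frames kept; a frame is dropped when its hash is
    within max_distance bits of the most recently kept frame (hypothetical rule)."""
    kept: List[int] = []
    last_hash: Optional[List[int]] = None
    for i, frame in enumerate(frames):
        h = average_hash(frame)
        if last_hash is None or hamming(h, last_hash) > max_distance:
            kept.append(i)
            last_hash = h
    return kept


# Toy 2x2 frames: the second is a slightly noisy copy of the first.
frame_a = [[10, 200], [10, 200]]
frame_b = [[12, 198], [11, 201]]
frame_c = [[200, 10], [200, 10]]
kept_indices = drop_duplicate_frames([frame_a, frame_b, frame_c])
print(kept_indices)  # [0, 2]
```

Comparing each frame only against the last kept frame suits video, where duplicates tend to be adjacent; deduplicating across a whole e-book or corpus would need a different comparison strategy.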

Provided by:

HaoningWu

Created:

2024-03-11

Collection summary

Dataset introduction