Ghaser/Wikipedia-Knowledge-2M

Name: Ghaser/Wikipedia-Knowledge-2M
Creator: Ghaser
Published: 2024-07-05 09:39:15
License: 暂无描述

Hugging Face2024-07-05 更新2024-07-06 收录

下载链接：

https://hf-mirror.com/datasets/Ghaser/Wikipedia-Knowledge-2M

下载链接

链接失效反馈

官方服务：

资源简介：

该数据集包含2019163个样本，每个样本对应一个图像。数据集的JSON文件中，每个字典包含id、image和conversations三个键。id是数据的唯一标识符，image存储对应图像的名称，conversations是一个包含两个字典的列表，分别存储用户输入和模型输出。平均答案长度为84，最大答案长度为5851。

The Wikipedia-Knowledge-2M dataset contains 2,019,163 samples and images, with an average answer length of 84 and a maximum answer length of 5851. Each entry in the JSON file contains three keys: id, image, and conversations. The id is the unique identifier for the data, image stores the image name, and conversations contains a list of dictionaries with user input and model output. The dataset is designed to support research in multimodal comprehension.

提供机构：

Ghaser

5,000+

优质数据集

54 个

任务类型

进入经典数据集