turing-motors/Wikipedia-Vision-JA
Source: Hugging Face. Updated: 2024-08-21. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/turing-motors/Wikipedia-Vision-JA
Description:
**Wikipedia-Vision-JA** is a vision-language model (VLM) dataset generated from Japanese Wikipedia. It contains 1.6M items, each pairing an image with a caption and a description. The dataset does not include raw image data; instead, each item provides an `image_url`. The data is formatted as JSONL, with the keys `key` (a unique ID), `caption` (the image caption), `description` (the article description), `article_url`, `image_url`, and `image_hash`. The dataset is licensed under CC-BY-SA 4.0, inherited from Wikipedia.
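As a minimal sketch of how records in this format might be read, the following parses JSONL text and checks that each record carries the six documented keys. The function name `iter_records`, the sample record, and all of its values (including the `example.com` URLs and the value types) are hypothetical placeholders for illustration; they are not taken from the dataset.

```python
import io
import json

# Keys listed in the dataset description.
EXPECTED_KEYS = (
    "key", "caption", "description",
    "article_url", "image_url", "image_hash",
)


def iter_records(lines):
    """Yield one dict per non-empty JSONL line, checking the documented keys."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = [k for k in EXPECTED_KEYS if k not in record]
        if missing:
            raise ValueError(f"line {line_no}: missing keys {missing}")
        yield record


# Hypothetical record: every value here is a made-up placeholder.
SAMPLE_JSONL = json.dumps(
    {
        "key": "placeholder-id",
        "caption": "placeholder caption",
        "description": "placeholder description",
        "article_url": "https://example.com/article",
        "image_url": "https://example.com/image.jpg",
        "image_hash": "placeholder-hash",
    },
    ensure_ascii=False,  # keep Japanese text readable if present
) + "\n"

records = list(iter_records(io.StringIO(SAMPLE_JSONL)))
for record in records:
    # The dataset ships no image bytes, so images must be fetched separately
    # from image_url.
    print(record["key"], record["image_url"])
```

Because the images are not bundled, a real pipeline would presumably download each `image_url` itself; how `image_hash` is computed is not stated in the description, so this sketch does not attempt to verify it.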
Provider:
turing-motors



