Hemabhushan/bridge_network_open_clip
Source: Hugging Face. Updated 2024-07-12; indexed 2024-07-13.
Download link:
https://hf-mirror.com/datasets/Hemabhushan/bridge_network_open_clip
Description:
The dataset provides four configurations (first_quarter, second_quarter, third_quarter, fourth_quarter) that share one schema: an identifier, a hash identifier, a URL link, scene start and end times, a text description, a word count, and open_clip_embeddings. For each configuration, the listing below gives the dataset size, the download size, and the number of examples.
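
Since each configuration is close to 80 GB, streaming one configuration at a time is likely more practical than downloading everything. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the repository ID matches the page title; `stream_config` is a hypothetical helper name, not part of the dataset or the library.

```python
# Configuration names as listed in the description above.
CONFIGS = ["first_quarter", "second_quarter", "third_quarter", "fourth_quarter"]

# Assumption: the Hub repository ID is the same as the page title.
REPO_ID = "Hemabhushan/bridge_network_open_clip"


def stream_config(config_name, repo_id=REPO_ID):
    """Return a streaming iterator over the train split of one configuration.

    Hypothetical helper: it validates the name locally, then defers to
    datasets.load_dataset with streaming=True so nothing is downloaded
    up front.
    """
    if config_name not in CONFIGS:
        raise ValueError(
            f"unknown configuration {config_name!r}; expected one of {CONFIGS}"
        )
    # Imported lazily so the name check works even without the library.
    from datasets import load_dataset

    return load_dataset(repo_id, config_name, split="train", streaming=True)
```

Calling `next(iter(stream_config("first_quarter")))` should then yield a single record as a dictionary. The column names in the actual files may differ from the feature names shown in this listing, so it is worth inspecting the keys of the first record before relying on them.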
Provider:
Hemabhushan
Original information summary
Dataset overview
Dataset configurations
First quarter (first_quarter)
- Features:
  - identifier: string
  - hash_identifier: string
  - url_link: string
  - scene_start_time: string
  - scene_end_time: string
  - text_description: string
  - word_count: int64
  - open_clip_embeddings: sequence[sequence[sequence[float32]]]
- Splits:
  - train:
    - Size in bytes: 79266558076
    - Examples: 250748
- Download size: 79349008293 bytes
- Dataset size: 79266558076 bytes
- Data file path (train split): first_quarter/train-*
Fourth quarter (fourth_quarter)
- Features:
  - identifier: string
  - hash_identifier: string
  - url_link: string
  - scene_start_time: string
  - scene_end_time: string
  - text_description: string
  - word_count: int64
  - open_clip_embeddings: sequence[sequence[sequence[float32]]]
- Splits:
  - train:
    - Size in bytes: 79266565676
    - Examples: 250748
- Download size: 79349062111 bytes
- Dataset size: 79266565676 bytes
- Data file path (train split): fourth_quarter/train-*
Second quarter (second_quarter)
- Features:
  - identifier: string
  - hash_identifier: string
  - url_link: string
  - scene_start_time: string
  - scene_end_time: string
  - text_description: string
  - word_count: int64
  - open_clip_embeddings: sequence[sequence[sequence[float32]]]
- Splits:
  - train:
    - Size in bytes: 79266565280
    - Examples: 250748
- Download size: 79348941140 bytes
- Dataset size: 79266565280 bytes
- Data file path (train split): second_quarter/train-*
Third quarter (third_quarter)
- Features:
  - identifier: string
  - hash_identifier: string
  - url_link: string
  - scene_start_time: string
  - scene_end_time: string
  - text_description: string
  - word_count: int64
  - open_clip_embeddings: sequence[sequence[sequence[float32]]]
- Splits:
  - train:
    - Size in bytes: 79266610050
    - Examples: 250748
- Download size: 79349059667 bytes
- Dataset size: 79266610050 bytes
- Data file path (train split): third_quarter/train-*
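
The per-configuration figures above can be added up to estimate what a full download would involve. The sketch below only does arithmetic on the numbers listed on this page; the names `SIZES` and `summarize` are illustrative, and the bytes-per-example value is a plain average, not a statement about how individual records are laid out.

```python
# Byte counts and example counts copied from the listing above.
SIZES = {
    "first_quarter": {"dataset": 79266558076, "download": 79349008293, "examples": 250748},
    "second_quarter": {"dataset": 79266565280, "download": 79348941140, "examples": 250748},
    "third_quarter": {"dataset": 79266610050, "download": 79349059667, "examples": 250748},
    "fourth_quarter": {"dataset": 79266565676, "download": 79349062111, "examples": 250748},
}


def summarize(sizes):
    """Aggregate totals across all configurations."""
    total_examples = sum(c["examples"] for c in sizes.values())
    total_dataset = sum(c["dataset"] for c in sizes.values())
    total_download = sum(c["download"] for c in sizes.values())
    return {
        "examples": total_examples,
        "dataset_bytes": total_dataset,
        "download_bytes": total_download,
        # 1 GiB = 2**30 bytes
        "download_gib": total_download / 2**30,
        # Average on-disk size of one example, in bytes.
        "avg_bytes_per_example": total_dataset / total_examples,
    }


if __name__ == "__main__":
    for key, value in summarize(SIZES).items():
        print(f"{key}: {value}")
```

Across the four configurations this comes to 1,002,992 examples and about 317 GB of download, or roughly 316 KB per example on average, which suggests that the embeddings account for nearly all of the storage.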



