
alexwww94/Rexverse-2M-formatted

Hugging Face | Updated 2026-01-27 | Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/alexwww94/Rexverse-2M-formatted
Description:
---
dataset_info:
- config_name: image_split_0
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1979569894
    num_examples: 10001
  download_size: 1975303797
  dataset_size: 1979569894
- config_name: image_split_1
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1993788719
    num_examples: 10001
  download_size: 1989237236
  dataset_size: 1993788719
- config_name: image_split_10
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1974631142
    num_examples: 10001
  download_size: 1970374393
  dataset_size: 1974631142
- config_name: image_split_11
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1976771404
    num_examples: 10001
  download_size: 1971941693
  dataset_size: 1976771404
- config_name: image_split_12
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1970268882
    num_examples: 10001
  download_size: 1965509693
  dataset_size: 1970268882
- config_name: image_split_13
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1983502098
    num_examples: 10001
  download_size: 1978947850
  dataset_size: 1983502098
- config_name: image_split_14
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 2003794319
    num_examples: 10001
  download_size: 1999684050
  dataset_size: 2003794319
- config_name: image_split_15
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1970740532
    num_examples: 10001
  download_size: 1966562325
  dataset_size: 1970740532
- config_name: image_split_16
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1979319037
    num_examples: 10001
  download_size: 1975235928
  dataset_size: 1979319037
- config_name: image_split_17
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1991160717
    num_examples: 10001
  download_size: 1986933978
  dataset_size: 1991160717
- config_name: image_split_18
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1986730606
    num_examples: 10001
  download_size: 1981932152
  dataset_size: 1986730606
- config_name: image_split_19
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1981060021
    num_examples: 10001
  download_size: 1976158312
  dataset_size: 1981060021
- config_name: image_split_2
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1973496401
    num_examples: 10001
  download_size: 1969052080
  dataset_size: 1973496401
- config_name: image_split_20
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1968380202
    num_examples: 10001
  download_size: 1963845312
  dataset_size: 1968380202
- config_name: image_split_21
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1982622485
    num_examples: 10001
  download_size: 1978105222
  dataset_size: 1982622485
- config_name: image_split_22
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1965172096
    num_examples: 10001
  download_size: 1961201966
  dataset_size: 1965172096
- config_name: image_split_23
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1977753539
    num_examples: 10001
  download_size: 1973384946
  dataset_size: 1977753539
- config_name: image_split_24
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1974723400
    num_examples: 10001
  download_size: 1970725164
  dataset_size: 1974723400
- config_name: image_split_25
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1994873826
    num_examples: 10001
  download_size: 1990734911
  dataset_size: 1994873826
- config_name: image_split_26
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1974924079
    num_examples: 10001
  download_size: 1969986365
  dataset_size: 1974924079
- config_name: image_split_27
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1967636784
    num_examples: 10001
  download_size: 1963076053
  dataset_size: 1967636784
- config_name: image_split_28
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1971907619
    num_examples: 10001
  download_size: 1967393233
  dataset_size: 1971907619
- config_name: image_split_29
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1962532860
    num_examples: 10001
  download_size: 1958110851
  dataset_size: 1962532860
- config_name: image_split_3
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1965301502
    num_examples: 10001
  download_size: 1961093401
  dataset_size: 1965301502
- config_name: image_split_30
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1998164825
    num_examples: 10001
  download_size: 1993577883
  dataset_size: 1998164825
- config_name: image_split_31
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1976594252
    num_examples: 10001
  download_size: 1971986112
  dataset_size: 1976594252
- config_name: image_split_32
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1993241533
    num_examples: 10001
  download_size: 1988710972
  dataset_size: 1993241533
- config_name: image_split_33
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1960060331
    num_examples: 10001
  download_size: 1955582372
  dataset_size: 1960060331
- config_name: image_split_34
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1977944070
    num_examples: 10001
  download_size: 1973511516
  dataset_size: 1977944070
- config_name: image_split_35
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1962362911
    num_examples: 10001
  download_size: 1957993260
  dataset_size: 1962362911
- config_name: image_split_36
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1965893817
    num_examples: 10001
  download_size: 1961240084
  dataset_size: 1965893817
- config_name: image_split_37
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1980558172
    num_examples: 10001
  download_size: 1976420123
  dataset_size: 1980558172
- config_name: image_split_38
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1971028924
    num_examples: 10001
  download_size: 1966754638
  dataset_size: 1971028924
- config_name: image_split_39
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1982763358
    num_examples: 10001
  download_size: 1978483022
  dataset_size: 1982763358
- config_name: image_split_4
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1956752629
    num_examples: 10001
  download_size: 1952314453
  dataset_size: 1956752629
- config_name: image_split_40
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1985607539
    num_examples: 10001
  download_size: 1981298277
  dataset_size: 1985607539
- config_name: image_split_5
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1991601897
    num_examples: 10001
  download_size: 1987102682
  dataset_size: 1991601897
- config_name: image_split_6
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1985082649
    num_examples: 10001
  download_size: 1980876756
  dataset_size: 1985082649
- config_name: image_split_7
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1978823369
    num_examples: 10001
  download_size: 1973858956
  dataset_size: 1978823369
- config_name: image_split_8
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 2003808831
    num_examples: 10001
  download_size: 1998330825
  dataset_size: 2003808831
- config_name: image_split_9
  features:
  - name: img_line_idx
    dtype: int64
  - name: img
    dtype: image
  splits:
  - name: train
    num_bytes: 1956205045
    num_examples: 10001
  download_size: 1951241403
  dataset_size: 1956205045
- config_name: one_sentence_anno
  features:
  - name: img_line_idx
    dtype: int64
  - name: ann
    dtype: string
  splits:
  - name: train
    num_bytes: 1128571168
    num_examples: 415857
  download_size: 539855737
  dataset_size: 1128571168
- config_name: one_sentence_anno_zh
  features:
  - name: img_line_idx
    dtype: int64
  - name: ann
    dtype: string
  splits:
  - name: train
    num_bytes: 1181947463
    num_examples: 415857
  download_size: 607511704
  dataset_size: 1181947463
- config_name: referring_anno
  features:
  - name: img_line_idx
    dtype: int64
  - name: ann
    dtype: string
  splits:
  - name: train
    num_bytes: 1051561262
    num_examples: 415857
  download_size: 529878150
  dataset_size: 1051561262
configs:
- config_name: image_split_0
  data_files:
  - split: train
    path: image_split_0/train-*
- config_name: image_split_1
  data_files:
  - split: train
    path: image_split_1/train-*
- config_name: image_split_10
  data_files:
  - split: train
    path: image_split_10/train-*
- config_name: image_split_11
  data_files:
  - split: train
    path: image_split_11/train-*
- config_name: image_split_12
  data_files:
  - split: train
    path: image_split_12/train-*
- config_name: image_split_13
  data_files:
  - split: train
    path: image_split_13/train-*
- config_name: image_split_14
  data_files:
  - split: train
    path: image_split_14/train-*
- config_name: image_split_15
  data_files:
  - split: train
    path: image_split_15/train-*
- config_name: image_split_16
  data_files:
  - split: train
    path: image_split_16/train-*
- config_name: image_split_17
  data_files:
  - split: train
    path: image_split_17/train-*
- config_name: image_split_18
  data_files:
  - split: train
    path: image_split_18/train-*
- config_name: image_split_19
  data_files:
  - split: train
    path: image_split_19/train-*
- config_name: image_split_2
  data_files:
  - split: train
    path: image_split_2/train-*
- config_name: image_split_20
  data_files:
  - split: train
    path: image_split_20/train-*
- config_name: image_split_21
  data_files:
  - split: train
    path: image_split_21/train-*
- config_name: image_split_22
  data_files:
  - split: train
    path: image_split_22/train-*
- config_name: image_split_23
  data_files:
  - split: train
    path: image_split_23/train-*
- config_name: image_split_24
  data_files:
  - split: train
    path: image_split_24/train-*
- config_name: image_split_25
  data_files:
  - split: train
    path: image_split_25/train-*
- config_name: image_split_26
  data_files:
  - split: train
    path: image_split_26/train-*
- config_name: image_split_27
  data_files:
  - split: train
    path: image_split_27/train-*
- config_name: image_split_28
  data_files:
  - split: train
    path: image_split_28/train-*
- config_name: image_split_29
  data_files:
  - split: train
    path: image_split_29/train-*
- config_name: image_split_3
  data_files:
  - split: train
    path: image_split_3/train-*
- config_name: image_split_30
  data_files:
  - split: train
    path: image_split_30/train-*
- config_name: image_split_31
  data_files:
  - split: train
    path: image_split_31/train-*
- config_name: image_split_32
  data_files:
  - split: train
    path: image_split_32/train-*
- config_name: image_split_33
  data_files:
  - split: train
    path: image_split_33/train-*
- config_name: image_split_34
  data_files:
  - split: train
    path: image_split_34/train-*
- config_name: image_split_35
  data_files:
  - split: train
    path: image_split_35/train-*
- config_name: image_split_36
  data_files:
  - split: train
    path: image_split_36/train-*
- config_name: image_split_37
  data_files:
  - split: train
    path: image_split_37/train-*
- config_name: image_split_38
  data_files:
  - split: train
    path: image_split_38/train-*
- config_name: image_split_39
  data_files:
  - split: train
    path: image_split_39/train-*
- config_name: image_split_4
  data_files:
  - split: train
    path: image_split_4/train-*
- config_name: image_split_40
  data_files:
  - split: train
    path: image_split_40/train-*
- config_name: image_split_5
  data_files:
  - split: train
    path: image_split_5/train-*
- config_name: image_split_6
  data_files:
  - split: train
    path: image_split_6/train-*
- config_name: image_split_7
  data_files:
  - split: train
    path: image_split_7/train-*
- config_name: image_split_8
  data_files:
  - split: train
    path: image_split_8/train-*
- config_name: image_split_9
  data_files:
  - split: train
    path: image_split_9/train-*
- config_name: one_sentence_anno
  data_files:
  - split: train
    path: one_sentence_anno/train-*
- config_name: one_sentence_anno_zh
  data_files:
  - split: train
    path: one_sentence_anno_zh/train-*
- config_name: referring_anno
  data_files:
  - split: train
    path: referring_anno/train-*
---
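
The metadata above declares 41 image configs and three annotation configs, each with a single `train` split. As a minimal sketch, the config names can be enumerated and one config loaded with the Hugging Face `datasets` library. The helper names `image_config_names` and `load_config` are hypothetical, and streaming mode is an assumption made here to avoid downloading roughly 2 GB per image config up front.

```python
# Repository ID as given on this page.
REPO_ID = "alexwww94/Rexverse-2M-formatted"

# The three annotation configs listed in the metadata.
ANNOTATION_CONFIGS = ["one_sentence_anno", "one_sentence_anno_zh", "referring_anno"]


def image_config_names(n_splits: int = 41) -> list[str]:
    """Return image_split_0 .. image_split_{n_splits - 1} in numeric order."""
    return [f"image_split_{i}" for i in range(n_splits)]


def load_config(config_name: str, streaming: bool = True):
    """Load the train split of one config (hypothetical helper).

    The import is done lazily so this module can be imported even when
    the `datasets` package is not installed.
    """
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split="train", streaming=streaming)


if __name__ == "__main__":
    names = image_config_names()
    print(len(names), names[0], names[-1])
    # Example (requires network access and the `datasets` package):
    # first_row = next(iter(load_config("one_sentence_anno")))
    # print(first_row["img_line_idx"], first_row["ann"][:80])
```

Calling `load_config("image_split_0")` should yield rows with the `img_line_idx` and `img` fields declared in the metadata; the call itself is left commented out because it needs network access.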

# Dataset Information

This dataset contains multiple configs, which fall into two groups: image split configs and annotation configs.

## 1. Image Split Configs

There are 41 configs, named `image_split_0` through `image_split_40`. All of them share the same structure and differ only in their size values.

### Common Structure

- **Config name**: `image_split_X` (X is an integer from 0 to 40)
- **Features**:
  - `img_line_idx`: 64-bit integer (int64)
  - `img`: image
- **Splits**: train only; the byte size varies per config, and the number of examples is fixed at 10001
- **Download size**: varies per config (bytes)
- **Dataset size**: varies per config (bytes)

### Per-Config Values

| Config name    | Train split size (bytes) | Download size (bytes) | Dataset size (bytes) |
|----------------|--------------------------|-----------------------|----------------------|
| image_split_0  | 1979569894 | 1975303797 | 1979569894 |
| image_split_1  | 1993788719 | 1989237236 | 1993788719 |
| image_split_10 | 1974631142 | 1970374393 | 1974631142 |
| image_split_11 | 1976771404 | 1971941693 | 1976771404 |
| image_split_12 | 1970268882 | 1965509693 | 1970268882 |
| image_split_13 | 1983502098 | 1978947850 | 1983502098 |
| image_split_14 | 2003794319 | 1999684050 | 2003794319 |
| image_split_15 | 1970740532 | 1966562325 | 1970740532 |
| image_split_16 | 1979319037 | 1975235928 | 1979319037 |
| image_split_17 | 1991160717 | 1986933978 | 1991160717 |
| image_split_18 | 1986730606 | 1981932152 | 1986730606 |
| image_split_19 | 1981060021 | 1976158312 | 1981060021 |
| image_split_2  | 1973496401 | 1969052080 | 1973496401 |
| image_split_20 | 1968380202 | 1963845312 | 1968380202 |
| image_split_21 | 1982622485 | 1978105222 | 1982622485 |
| image_split_22 | 1965172096 | 1961201966 | 1965172096 |
| image_split_23 | 1977753539 | 1973384946 | 1977753539 |
| image_split_24 | 1974723400 | 1970725164 | 1974723400 |
| image_split_25 | 1994873826 | 1990734911 | 1994873826 |
| image_split_26 | 1974924079 | 1969986365 | 1974924079 |
| image_split_27 | 1967636784 | 1963076053 | 1967636784 |
| image_split_28 | 1971907619 | 1967393233 | 1971907619 |
| image_split_29 | 1962532860 | 1958110851 | 1962532860 |
| image_split_3  | 1965301502 | 1961093401 | 1965301502 |
| image_split_30 | 1998164825 | 1993577883 | 1998164825 |
| image_split_31 | 1976594252 | 1971986112 | 1976594252 |
| image_split_32 | 1993241533 | 1988710972 | 1993241533 |
| image_split_33 | 1960060331 | 1955582372 | 1960060331 |
| image_split_34 | 1977944070 | 1973511516 | 1977944070 |
| image_split_35 | 1962362911 | 1957993260 | 1962362911 |
| image_split_36 | 1965893817 | 1961240084 | 1965893817 |
| image_split_37 | 1980558172 | 1976420123 | 1980558172 |
| image_split_38 | 1971028924 | 1966754638 | 1971028924 |
| image_split_39 | 1982763358 | 1978483022 | 1982763358 |
| image_split_4  | 1956752629 | 1952314453 | 1956752629 |
| image_split_40 | 1985607539 | 1981298277 | 1985607539 |
| image_split_5  | 1991601897 | 1987102682 | 1991601897 |
| image_split_6  | 1985082649 | 1980876756 | 1985082649 |
| image_split_7  | 1978823369 | 1973858956 | 1978823369 |
| image_split_8  | 2003808831 | 1998330825 | 2003808831 |
| image_split_9  | 1956205045 | 1951241403 | 1956205045 |

## 2. Annotation Configs

There are 3 annotation configs:

### 2.1 One-Sentence Annotations (one_sentence_anno)

- Features: `img_line_idx` (64-bit integer), `ann` (string)
- Train split: 1128571168 bytes, 415857 examples
- Download size: 539855737 bytes; dataset size: 1128571168 bytes

### 2.2 Chinese One-Sentence Annotations (one_sentence_anno_zh)

- Features: `img_line_idx` (64-bit integer), `ann` (string)
- Train split: 1181947463 bytes, 415857 examples
- Download size: 607511704 bytes; dataset size: 1181947463 bytes

### 2.3 Referring Annotations (referring_anno)

- Features: `img_line_idx` (64-bit integer), `ann` (string)
- Train split: 1051561262 bytes, 415857 examples
- Download size: 529878150 bytes; dataset size: 1051561262 bytes

## 3. Data File Paths

The train split data files of every config follow the pattern `{config_name}/train-*`.
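
Both the image configs and the annotation configs carry an `img_line_idx` field. The card does not say how the two relate, so the sketch below simply assumes that `img_line_idx` is a join key between an image row and its annotation row, with at most one annotation row per index. Note that 41 image configs of 10001 rows each give 410041 image rows, while each annotation config has 415857 rows, so under this assumption some annotations would have no matching image; the code therefore tolerates unmatched rows. The function names are hypothetical and the rows are plain dictionaries, so the example runs without downloading anything.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple


def index_annotations(rows: Iterable[Dict[str, Any]]) -> Dict[int, str]:
    """Map img_line_idx -> ann.

    Assumption: at most one annotation row per index; if an index
    repeats, the last row seen wins.
    """
    return {row["img_line_idx"]: row["ann"] for row in rows}


def pair_images_with_annotations(
    image_rows: Iterable[Dict[str, Any]],
    ann_by_idx: Dict[int, str],
) -> Iterator[Tuple[Any, str]]:
    """Yield (img, ann) for image rows whose index has an annotation."""
    for row in image_rows:
        ann = ann_by_idx.get(row["img_line_idx"])
        if ann is not None:  # skip images without a matching annotation
            yield row["img"], ann


if __name__ == "__main__":
    # Toy rows standing in for real dataset rows (values are made up).
    toy_annotations = [
        {"img_line_idx": 0, "ann": "caption for image 0"},
        {"img_line_idx": 2, "ann": "caption for image 2"},
    ]
    toy_images = [
        {"img_line_idx": 0, "img": "<image 0>"},
        {"img_line_idx": 1, "img": "<image 1>"},
    ]
    ann_by_idx = index_annotations(toy_annotations)
    for img, ann in pair_images_with_annotations(toy_images, ann_by_idx):
        print(img, "->", ann)
```

With real data, the same functions would accept rows iterated from a loaded annotation config and a loaded image config. Building the annotation index in memory is a design choice that suits the roughly 1 GB annotation configs better than the roughly 2 GB image configs, which are streamed through once.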
Provider:
alexwww94