alexwww94/Rexverse-2M-formatted
Source: Hugging Face | Updated: 2026-01-27 | Indexed: 2026-03-29
Download link:
https://hf-mirror.com/datasets/alexwww94/Rexverse-2M-formatted
Description:
---
dataset_info:
- config_name: image_split_0
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1979569894
num_examples: 10001
download_size: 1975303797
dataset_size: 1979569894
- config_name: image_split_1
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1993788719
num_examples: 10001
download_size: 1989237236
dataset_size: 1993788719
- config_name: image_split_10
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1974631142
num_examples: 10001
download_size: 1970374393
dataset_size: 1974631142
- config_name: image_split_11
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1976771404
num_examples: 10001
download_size: 1971941693
dataset_size: 1976771404
- config_name: image_split_12
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1970268882
num_examples: 10001
download_size: 1965509693
dataset_size: 1970268882
- config_name: image_split_13
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1983502098
num_examples: 10001
download_size: 1978947850
dataset_size: 1983502098
- config_name: image_split_14
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 2003794319
num_examples: 10001
download_size: 1999684050
dataset_size: 2003794319
- config_name: image_split_15
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1970740532
num_examples: 10001
download_size: 1966562325
dataset_size: 1970740532
- config_name: image_split_16
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1979319037
num_examples: 10001
download_size: 1975235928
dataset_size: 1979319037
- config_name: image_split_17
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1991160717
num_examples: 10001
download_size: 1986933978
dataset_size: 1991160717
- config_name: image_split_18
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1986730606
num_examples: 10001
download_size: 1981932152
dataset_size: 1986730606
- config_name: image_split_19
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1981060021
num_examples: 10001
download_size: 1976158312
dataset_size: 1981060021
- config_name: image_split_2
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1973496401
num_examples: 10001
download_size: 1969052080
dataset_size: 1973496401
- config_name: image_split_20
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1968380202
num_examples: 10001
download_size: 1963845312
dataset_size: 1968380202
- config_name: image_split_21
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1982622485
num_examples: 10001
download_size: 1978105222
dataset_size: 1982622485
- config_name: image_split_22
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1965172096
num_examples: 10001
download_size: 1961201966
dataset_size: 1965172096
- config_name: image_split_23
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1977753539
num_examples: 10001
download_size: 1973384946
dataset_size: 1977753539
- config_name: image_split_24
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1974723400
num_examples: 10001
download_size: 1970725164
dataset_size: 1974723400
- config_name: image_split_25
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1994873826
num_examples: 10001
download_size: 1990734911
dataset_size: 1994873826
- config_name: image_split_26
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1974924079
num_examples: 10001
download_size: 1969986365
dataset_size: 1974924079
- config_name: image_split_27
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1967636784
num_examples: 10001
download_size: 1963076053
dataset_size: 1967636784
- config_name: image_split_28
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1971907619
num_examples: 10001
download_size: 1967393233
dataset_size: 1971907619
- config_name: image_split_29
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1962532860
num_examples: 10001
download_size: 1958110851
dataset_size: 1962532860
- config_name: image_split_3
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1965301502
num_examples: 10001
download_size: 1961093401
dataset_size: 1965301502
- config_name: image_split_30
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1998164825
num_examples: 10001
download_size: 1993577883
dataset_size: 1998164825
- config_name: image_split_31
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1976594252
num_examples: 10001
download_size: 1971986112
dataset_size: 1976594252
- config_name: image_split_32
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1993241533
num_examples: 10001
download_size: 1988710972
dataset_size: 1993241533
- config_name: image_split_33
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1960060331
num_examples: 10001
download_size: 1955582372
dataset_size: 1960060331
- config_name: image_split_34
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1977944070
num_examples: 10001
download_size: 1973511516
dataset_size: 1977944070
- config_name: image_split_35
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1962362911
num_examples: 10001
download_size: 1957993260
dataset_size: 1962362911
- config_name: image_split_36
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1965893817
num_examples: 10001
download_size: 1961240084
dataset_size: 1965893817
- config_name: image_split_37
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1980558172
num_examples: 10001
download_size: 1976420123
dataset_size: 1980558172
- config_name: image_split_38
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1971028924
num_examples: 10001
download_size: 1966754638
dataset_size: 1971028924
- config_name: image_split_39
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1982763358
num_examples: 10001
download_size: 1978483022
dataset_size: 1982763358
- config_name: image_split_4
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1956752629
num_examples: 10001
download_size: 1952314453
dataset_size: 1956752629
- config_name: image_split_40
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1985607539
num_examples: 10001
download_size: 1981298277
dataset_size: 1985607539
- config_name: image_split_5
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1991601897
num_examples: 10001
download_size: 1987102682
dataset_size: 1991601897
- config_name: image_split_6
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1985082649
num_examples: 10001
download_size: 1980876756
dataset_size: 1985082649
- config_name: image_split_7
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1978823369
num_examples: 10001
download_size: 1973858956
dataset_size: 1978823369
- config_name: image_split_8
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 2003808831
num_examples: 10001
download_size: 1998330825
dataset_size: 2003808831
- config_name: image_split_9
features:
- name: img_line_idx
dtype: int64
- name: img
dtype: image
splits:
- name: train
num_bytes: 1956205045
num_examples: 10001
download_size: 1951241403
dataset_size: 1956205045
- config_name: one_sentence_anno
features:
- name: img_line_idx
dtype: int64
- name: ann
dtype: string
splits:
- name: train
num_bytes: 1128571168
num_examples: 415857
download_size: 539855737
dataset_size: 1128571168
- config_name: one_sentence_anno_zh
features:
- name: img_line_idx
dtype: int64
- name: ann
dtype: string
splits:
- name: train
num_bytes: 1181947463
num_examples: 415857
download_size: 607511704
dataset_size: 1181947463
- config_name: referring_anno
features:
- name: img_line_idx
dtype: int64
- name: ann
dtype: string
splits:
- name: train
num_bytes: 1051561262
num_examples: 415857
download_size: 529878150
dataset_size: 1051561262
configs:
- config_name: image_split_0
data_files:
- split: train
path: image_split_0/train-*
- config_name: image_split_1
data_files:
- split: train
path: image_split_1/train-*
- config_name: image_split_10
data_files:
- split: train
path: image_split_10/train-*
- config_name: image_split_11
data_files:
- split: train
path: image_split_11/train-*
- config_name: image_split_12
data_files:
- split: train
path: image_split_12/train-*
- config_name: image_split_13
data_files:
- split: train
path: image_split_13/train-*
- config_name: image_split_14
data_files:
- split: train
path: image_split_14/train-*
- config_name: image_split_15
data_files:
- split: train
path: image_split_15/train-*
- config_name: image_split_16
data_files:
- split: train
path: image_split_16/train-*
- config_name: image_split_17
data_files:
- split: train
path: image_split_17/train-*
- config_name: image_split_18
data_files:
- split: train
path: image_split_18/train-*
- config_name: image_split_19
data_files:
- split: train
path: image_split_19/train-*
- config_name: image_split_2
data_files:
- split: train
path: image_split_2/train-*
- config_name: image_split_20
data_files:
- split: train
path: image_split_20/train-*
- config_name: image_split_21
data_files:
- split: train
path: image_split_21/train-*
- config_name: image_split_22
data_files:
- split: train
path: image_split_22/train-*
- config_name: image_split_23
data_files:
- split: train
path: image_split_23/train-*
- config_name: image_split_24
data_files:
- split: train
path: image_split_24/train-*
- config_name: image_split_25
data_files:
- split: train
path: image_split_25/train-*
- config_name: image_split_26
data_files:
- split: train
path: image_split_26/train-*
- config_name: image_split_27
data_files:
- split: train
path: image_split_27/train-*
- config_name: image_split_28
data_files:
- split: train
path: image_split_28/train-*
- config_name: image_split_29
data_files:
- split: train
path: image_split_29/train-*
- config_name: image_split_3
data_files:
- split: train
path: image_split_3/train-*
- config_name: image_split_30
data_files:
- split: train
path: image_split_30/train-*
- config_name: image_split_31
data_files:
- split: train
path: image_split_31/train-*
- config_name: image_split_32
data_files:
- split: train
path: image_split_32/train-*
- config_name: image_split_33
data_files:
- split: train
path: image_split_33/train-*
- config_name: image_split_34
data_files:
- split: train
path: image_split_34/train-*
- config_name: image_split_35
data_files:
- split: train
path: image_split_35/train-*
- config_name: image_split_36
data_files:
- split: train
path: image_split_36/train-*
- config_name: image_split_37
data_files:
- split: train
path: image_split_37/train-*
- config_name: image_split_38
data_files:
- split: train
path: image_split_38/train-*
- config_name: image_split_39
data_files:
- split: train
path: image_split_39/train-*
- config_name: image_split_4
data_files:
- split: train
path: image_split_4/train-*
- config_name: image_split_40
data_files:
- split: train
path: image_split_40/train-*
- config_name: image_split_5
data_files:
- split: train
path: image_split_5/train-*
- config_name: image_split_6
data_files:
- split: train
path: image_split_6/train-*
- config_name: image_split_7
data_files:
- split: train
path: image_split_7/train-*
- config_name: image_split_8
data_files:
- split: train
path: image_split_8/train-*
- config_name: image_split_9
data_files:
- split: train
path: image_split_9/train-*
- config_name: one_sentence_anno
data_files:
- split: train
path: one_sentence_anno/train-*
- config_name: one_sentence_anno_zh
data_files:
- split: train
path: one_sentence_anno_zh/train-*
- config_name: referring_anno
data_files:
- split: train
path: referring_anno/train-*
---
# Dataset Information
This dataset contains multiple configs, which fall into two groups: image-shard configs and annotation configs.
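## 1. Image-Shard Configs
There are 41 image-shard configs, named `image_split_0` through `image_split_40`. All of them share the same structure and differ only in their size figures: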
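### Common Structure
- **Config name**: `image_split_X` (X is an integer from 0 to 40)
- **Features**:
  - `img_line_idx`: 64-bit integer (int64)
  - `img`: image
- **Splits**: `train` only; its size in bytes (`num_bytes`) varies per config, and the number of examples is fixed at 10001
- **Download size** (`download_size`): varies per config, in bytes
- **Dataset size** (`dataset_size`): varies per config, in bytes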
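As a minimal sketch of how the naming scheme above could be used, the snippet below enumerates the 41 config names and wraps a single-config load. It assumes the Hugging Face `datasets` library is installed and that the repository is reachable under the ID shown in the title; `image_config_names` and `load_image_split` are hypothetical helper names, not part of the dataset card.

```python
from typing import List

REPO_ID = "alexwww94/Rexverse-2M-formatted"
NUM_IMAGE_SPLITS = 41  # image_split_0 .. image_split_40, per the card


def image_config_names(count: int = NUM_IMAGE_SPLITS) -> List[str]:
    """Return the image config names in numeric order."""
    return [f"image_split_{i}" for i in range(count)]


def load_image_split(index: int):
    """Load the train split of one image config.

    Assumption: requires the `datasets` package and network access.
    The import is done lazily so the naming helper has no dependency.
    """
    from datasets import load_dataset

    return load_dataset(REPO_ID, image_config_names()[index], split="train")


if __name__ == "__main__":
    names = image_config_names()
    print(names[0], names[-1], len(names))
```

Each image config is roughly 2 GB to download according to the card, so loading one config at a time (or passing `streaming=True` to `load_dataset`) is probably preferable to fetching all 41 up front.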
### Per-Config Figures
| Config name | Train split size (bytes) | Download size (bytes) | Dataset size (bytes) |
|----------------|----------------|----------------|----------------|
| image_split_0 | 1979569894 | 1975303797 | 1979569894 |
| image_split_1 | 1993788719 | 1989237236 | 1993788719 |
| image_split_10 | 1974631142 | 1970374393 | 1974631142 |
| image_split_11 | 1976771404 | 1971941693 | 1976771404 |
| image_split_12 | 1970268882 | 1965509693 | 1970268882 |
| image_split_13 | 1983502098 | 1978947850 | 1983502098 |
| image_split_14 | 2003794319 | 1999684050 | 2003794319 |
| image_split_15 | 1970740532 | 1966562325 | 1970740532 |
| image_split_16 | 1979319037 | 1975235928 | 1979319037 |
| image_split_17 | 1991160717 | 1986933978 | 1991160717 |
| image_split_18 | 1986730606 | 1981932152 | 1986730606 |
| image_split_19 | 1981060021 | 1976158312 | 1981060021 |
| image_split_2 | 1973496401 | 1969052080 | 1973496401 |
| image_split_20 | 1968380202 | 1963845312 | 1968380202 |
| image_split_21 | 1982622485 | 1978105222 | 1982622485 |
| image_split_22 | 1965172096 | 1961201966 | 1965172096 |
| image_split_23 | 1977753539 | 1973384946 | 1977753539 |
| image_split_24 | 1974723400 | 1970725164 | 1974723400 |
| image_split_25 | 1994873826 | 1990734911 | 1994873826 |
| image_split_26 | 1974924079 | 1969986365 | 1974924079 |
| image_split_27 | 1967636784 | 1963076053 | 1967636784 |
| image_split_28 | 1971907619 | 1967393233 | 1971907619 |
| image_split_29 | 1962532860 | 1958110851 | 1962532860 |
| image_split_3 | 1965301502 | 1961093401 | 1965301502 |
| image_split_30 | 1998164825 | 1993577883 | 1998164825 |
| image_split_31 | 1976594252 | 1971986112 | 1976594252 |
| image_split_32 | 1993241533 | 1988710972 | 1993241533 |
| image_split_33 | 1960060331 | 1955582372 | 1960060331 |
| image_split_34 | 1977944070 | 1973511516 | 1977944070 |
| image_split_35 | 1962362911 | 1957993260 | 1962362911 |
| image_split_36 | 1965893817 | 1961240084 | 1965893817 |
| image_split_37 | 1980558172 | 1976420123 | 1980558172 |
| image_split_38 | 1971028924 | 1966754638 | 1971028924 |
| image_split_39 | 1982763358 | 1978483022 | 1982763358 |
| image_split_4 | 1956752629 | 1952314453 | 1956752629 |
| image_split_40 | 1985607539 | 1981298277 | 1985607539 |
| image_split_5 | 1991601897 | 1987102682 | 1991601897 |
| image_split_6 | 1985082649 | 1980876756 | 1985082649 |
| image_split_7 | 1978823369 | 1973858956 | 1978823369 |
| image_split_8 | 2003808831 | 1998330825 | 2003808831 |
| image_split_9 | 1956205045 | 1951241403 | 1956205045 |
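The table above lists the configs in lexicographic order (`image_split_1`, `image_split_10`, ...), and every config holds 10001 examples. The sketch below shows one way to restore numeric order and to work out the total image count; `natural_key` is a hypothetical helper name. The card does not explain why the total differs from the annotation row count, so the difference is only reported here, not interpreted.

```python
import re

EXAMPLES_PER_IMAGE_SPLIT = 10001  # num_examples of every image_split_X config
NUM_IMAGE_SPLITS = 41             # image_split_0 .. image_split_40
ANNOTATION_ROWS = 415857          # num_examples of each annotation config


def natural_key(name: str) -> int:
    """Extract the trailing integer from a name such as 'image_split_10'."""
    match = re.search(r"(\d+)$", name)
    if match is None:
        raise ValueError(f"no trailing integer in {name!r}")
    return int(match.group(1))


# A few names in the order the table lists them.
listed = ["image_split_0", "image_split_1", "image_split_10", "image_split_2"]
ordered = sorted(listed, key=natural_key)

total_images = EXAMPLES_PER_IMAGE_SPLIT * NUM_IMAGE_SPLITS
difference = ANNOTATION_ROWS - total_images

if __name__ == "__main__":
    print(ordered)
    print(total_images, difference)
```

With the figures from the card, the image configs hold 410041 rows in total, which is 5816 fewer than the 415857 rows in each annotation config.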
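## 2. Annotation Configs
There are 3 annotation configs:
### 2.1 One-Sentence Annotations (one_sentence_anno)
- Features: `img_line_idx` (64-bit integer), `ann` (string)
- Train split: 1128571168 bytes, 415857 examples
- Download size: 539855737 bytes; dataset size: 1128571168 bytes
### 2.2 Chinese One-Sentence Annotations (one_sentence_anno_zh)
- Features: `img_line_idx` (64-bit integer), `ann` (string)
- Train split: 1181947463 bytes, 415857 examples
- Download size: 607511704 bytes; dataset size: 1181947463 bytes
### 2.3 Referring Annotations (referring_anno)
- Features: `img_line_idx` (64-bit integer), `ann` (string)
- Train split: 1051561262 bytes, 415857 examples
- Download size: 529878150 bytes; dataset size: 1051561262 bytes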
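Both the image configs and the annotation configs carry an `img_line_idx` column. The card does not say so explicitly, but it seems reasonable to assume this column is the key that links an image row to its annotation rows. Under that assumption, a lookup-based join could be sketched as follows; the toy rows and the helper names `index_annotations` and `attach_annotations` are made up for illustration, and the internal format of the `ann` string is not specified by the card.

```python
from typing import Any, Dict, Iterable, List, Mapping, Optional, Tuple


def index_annotations(rows: Iterable[Mapping[str, Any]]) -> Dict[int, str]:
    """Build a dict mapping img_line_idx -> ann for constant-time lookup."""
    return {int(row["img_line_idx"]): str(row["ann"]) for row in rows}


def attach_annotations(
    image_rows: Iterable[Mapping[str, Any]],
    ann_index: Mapping[int, str],
) -> List[Tuple[int, Optional[str]]]:
    """Pair each image row's index with its annotation (None if absent)."""
    return [
        (int(row["img_line_idx"]), ann_index.get(int(row["img_line_idx"])))
        for row in image_rows
    ]


# Toy rows standing in for real dataset records (assumption: same field names).
toy_annotations = [
    {"img_line_idx": 0, "ann": "a dog on grass"},
    {"img_line_idx": 2, "ann": "two cups on a table"},
]
toy_images = [
    {"img_line_idx": 0, "img": None},
    {"img_line_idx": 1, "img": None},
]

pairs = attach_annotations(toy_images, index_annotations(toy_annotations))

if __name__ == "__main__":
    print(pairs)
```

Building the index once from an annotation config and then iterating over one image config at a time keeps memory use bounded by the annotation text rather than by the images.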
## 3. Data File Paths
For every config, the train-split data files follow the path pattern `{config_name}/train-*`.
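To illustrate the path pattern, the sketch below builds the glob for a config and checks file paths against it with the standard-library `fnmatch` module. The concrete file name used in the example is hypothetical; the card only specifies the `train-*` prefix, not the full file names or extension.

```python
from fnmatch import fnmatchcase


def train_pattern(config_name: str) -> str:
    """Return the glob pattern for a config's train files."""
    return f"{config_name}/train-*"


def belongs_to(path: str, config_name: str) -> bool:
    """Check whether a repository-relative path matches the config's pattern."""
    return fnmatchcase(path, train_pattern(config_name))


if __name__ == "__main__":
    # Hypothetical shard name, used only to exercise the pattern.
    example = "image_split_1/train-00000-of-00004.parquet"
    print(train_pattern("image_split_1"))
    print(belongs_to(example, "image_split_1"))
    print(belongs_to(example, "image_split_10"))
```

Because the directory name is matched literally up to the slash, `image_split_1` does not accidentally pick up files from `image_split_10`.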
Provider:
alexwww94



