bdpc/rvl_cdip_n_mp
Source: Hugging Face. Updated 2023-11-24; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/bdpc/rvl_cdip_n_mp
Resource description:
---
license: cc-by-nc-4.0
dataset_info:
  features:
  - name: id
    dtype: string
  - name: file
    dtype: binary
  - name: labels
    dtype:
      class_label:
        names:
          '0': letter
          '1': form
          '2': email
          '3': handwritten
          '4': advertisement
          '5': scientific report
          '6': scientific publication
          '7': specification
          '8': file folder
          '9': news article
          '10': budget
          '11': invoice
          '12': presentation
          '13': questionnaire
          '14': resume
          '15': memo
  splits:
  - name: test
    num_bytes: 1349159996
    num_examples: 991
  download_size: 0
  dataset_size: 1349159996
---
# Dataset Card for RVL-CDIP-N_MultiPage
## Extension
The data loader supports loading RVL-CDIP-N in its extended multi-page format.
Big kudos to the original authors (first in CITATION) for collecting the RVL-CDIP-N dataset.
We stand on the shoulders of giants :)
## Required installation
```bash
pip3 install pypdf2 pdf2image
sudo apt-get install poppler-utils
```
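The packages above suggest that the `file` field holds raw PDF bytes that are rendered to page images. The card does not show the loading code, so the following is only a minimal sketch under that assumption. The helper names (`looks_like_pdf`, `count_pages`, `render_pages`, `load_first_document`) are hypothetical, the repository ID is taken from the page title, and the Hugging Face `datasets` package is an additional dependency not listed above. Depending on the installed `datasets` version, `load_dataset` may need extra arguments.

```python
import io

PDF_MAGIC = b"%PDF-"  # every PDF file starts with this header


def looks_like_pdf(data: bytes) -> bool:
    """Cheap sanity check: do the bytes start with the PDF header?"""
    return data.startswith(PDF_MAGIC)


def count_pages(pdf_bytes: bytes) -> int:
    """Count the pages of an in-memory PDF with PyPDF2."""
    from PyPDF2 import PdfReader  # third-party, imported lazily

    return len(PdfReader(io.BytesIO(pdf_bytes)).pages)


def render_pages(pdf_bytes: bytes, dpi: int = 100):
    """Render each page to a PIL image (pdf2image needs poppler-utils)."""
    from pdf2image import convert_from_bytes  # third-party, imported lazily

    return convert_from_bytes(pdf_bytes, dpi=dpi)


def load_first_document(repo_id: str = "bdpc/rvl_cdip_n_mp"):
    """Load the test split and render the first document's pages.

    Assumption: `file` contains PDF bytes; this is not stated in the card.
    """
    from datasets import load_dataset  # third-party, imported lazily

    dataset = load_dataset(repo_id, split="test")
    example = dataset[0]
    pdf_bytes = example["file"]
    if not looks_like_pdf(pdf_bytes):
        raise ValueError("expected PDF bytes in the 'file' field")
    return example["id"], example["labels"], render_pages(pdf_bytes)
```

Calling `load_first_document()` downloads the test split, so it is left uncalled here. The third-party imports sit inside the functions so that the module can be imported before everything is installed.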
Provider:
bdpc
Summary of original information
Dataset overview
Dataset information
- License: cc-by-nc-4.0
Dataset features
- id: string
- file: binary
- labels: class label, with the following classes:
  - 0: letter
  - 1: form
  - 2: email
  - 3: handwritten
  - 4: advertisement
  - 5: scientific report
  - 6: scientific publication
  - 7: specification
  - 8: file folder
  - 9: news article
  - 10: budget
  - 11: invoice
  - 12: presentation
  - 13: questionnaire
  - 14: resume
  - 15: memo
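The class list above maps integer IDs to document types. As a small illustration, the mapping can be written out in plain Python. The names `LABEL_NAMES`, `id_to_label`, and `label_to_id` are hypothetical helpers, not part of the dataset's loader.

```python
# Class names in ID order, copied from the list above.
LABEL_NAMES = [
    "letter", "form", "email", "handwritten", "advertisement",
    "scientific report", "scientific publication", "specification",
    "file folder", "news article", "budget", "invoice",
    "presentation", "questionnaire", "resume", "memo",
]


def id_to_label(label_id: int) -> str:
    """Return the class name for an integer label ID (0 to 15)."""
    if not 0 <= label_id < len(LABEL_NAMES):
        raise ValueError(f"label ID out of range: {label_id}")
    return LABEL_NAMES[label_id]


def label_to_id(name: str) -> int:
    """Return the integer label ID for a class name."""
    return LABEL_NAMES.index(name)


print(id_to_label(11))      # invoice
print(label_to_id("memo"))  # 15
```

When the dataset is loaded through the `datasets` library, the `labels` feature is declared as a `ClassLabel`, whose `int2str` and `str2int` methods should provide the same mapping without a hand-written list.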
Dataset splits
- test:
  - Bytes: 1349159996
  - Examples: 991
Dataset size
- Download size: 0
- Dataset size: 1349159996
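For a rough sense of scale, the listed figures imply an average document size of a little over 1 MiB. The short calculation below uses only the two numbers above. The variable names are illustrative.

```python
NUM_BYTES = 1_349_159_996  # total size of the test split, as listed
NUM_EXAMPLES = 991         # number of documents in the test split

mean_bytes = NUM_BYTES / NUM_EXAMPLES
mean_mib = mean_bytes / 2**20  # 1 MiB = 1,048,576 bytes

print(f"{mean_bytes:,.0f} bytes per document on average (~{mean_mib:.2f} MiB)")
```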



