
ineoApp/data-2024-06-04

Hugging Face · Updated 2024-06-04 · Indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/ineoApp/data-2024-06-04
Description:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: image
    dtype: image
  - name: bboxes
    sequence:
      sequence: int64
  - name: ner_tags
    sequence:
      class_label:
        names:
          '0': O
          '1': numero facture
          '2': date facture
          '3': date limite
          '4': montant ht
          '5': montant ttc
          '6': tva
          '7': prix tva
          '8': reference
          '9': Devise
          '10': Condition de paiement
          '11': Mode de paiement
          '12': vendeur
          '13': adresse vendeur
          '14': informations vendeur
          '15': ice vendeur
          '16': rc vendeur
          '17': if vendeur
          '18': patente vendeur
          '19': acheteur
          '20': adresse acheteur
          '21': informations acheteur
          '22': ice acheteur
          '23': art1 Article
          '24': art1 designation
          '25': art1 quantite
          '26': art1 unite
          '27': art1 prix unit
          '28': art1 montant ht
          '29': art1 taux de remise
          '30': art1 tva
          '31': art2 Article
          '32': art2 designation
          '33': art2 quantite
          '34': art2 unite
          '35': art2 prix unit
          '36': art2 montant ht
          '37': art2 taux de remise
          '38': art2 tva
          '39': art3 Article
          '40': art3 designation
          '41': art3 quantite
          '42': art3 unite
          '43': art3 prix unit
          '44': art3 montant ht
          '45': art3 taux de remise
          '46': art3 tva
          '47': art4 Article
          '48': art4 designation
          '49': art4 quantite
          '50': art4 unite
          '51': art4 prix unit
          '52': art4 montant ht
          '53': art4 taux de remise
          '54': art4 tva
          '55': art5 Article
          '56': art5 designation
          '57': art5 quantite
          '58': art5 unite
          '59': art5 prix unit
          '60': art5 montant ht
          '61': art5 taux de remise
          '62': art5 tva
          '63': art6 Article
          '64': art6 designation
          '65': art6 quantite
          '66': art6 unite
          '67': art6 prix unit
          '68': art6 montant ht
          '69': art6 taux de remise
          '70': art6 tva
          '71': art7 Article
          '72': art7 designation
          '73': art7 quantite
          '74': art7 unite
          '75': art7 prix unit
          '76': art7 montant ht
          '77': art7 taux de remise
          '78': art7 tva
          '79': art8 Article
          '80': art8 designation
          '81': art8 quantite
          '82': art8 unite
          '83': art8 prix unit
          '84': art8 montant ht
          '85': art8 taux de remise
          '86': art8 tva
          '87': art9 Article
          '88': art9 designation
          '89': art9 quantite
          '90': art9 unite
          '91': art9 prix unit
          '92': art9 montant ht
          '93': art9 taux de remise
          '94': art9 tva
          '95': art10 Article
          '96': art10 designation
          '97': art10 quantite
          '98': art10 unite
          '99': art10 prix unit
          '100': art10 montant ht
          '101': art10 taux de remise
          '102': art10 tva
          '103': art11 Article
          '104': art11 designation
          '105': art11 quantite
          '106': art11 unite
          '107': art11 prix unit
          '108': art11 montant ht
          '109': art11 taux de remise
          '110': art11 tva
          '111': art12 Article
          '112': art12 designation
          '113': art12 quantite
          '114': art12 unite
          '115': art12 prix unit
          '116': art12 montant ht
          '117': art12 taux de remise
          '118': art12 tva
          '119': art13 Article
          '120': art13 designation
          '121': art13 quantite
          '122': art13 unite
          '123': art13 prix unit
          '124': art13 montant ht
          '125': art13 taux de remise
          '126': art13 tva
          '127': art14 Article
          '128': art14 designation
          '129': art14 quantite
          '130': art14 unite
          '131': art14 prix unit
          '132': art14 montant ht
          '133': art14 taux de remise
          '134': art14 tva
  - name: tokens
    sequence: string
  splits:
  - name: train
    num_bytes: 492051241.2589792
    num_examples: 423
  - name: test
    num_bytes: 123303620.7410208
    num_examples: 106
  download_size: 585614365
  dataset_size: 615354862.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
---

This dataset targets information extraction from invoice images. Each example carries an id, the invoice image, a list of bounding boxes, a sequence of named-entity tags, and the corresponding text tokens. The tags cover 135 classes with French label names: invoice-level fields such as invoice number, dates, and amounts; seller and buyer details; and per-line-item fields for up to 14 items. The data is divided into a training set (423 examples) and a test set (106 examples), suitable for training and evaluating models that combine image and text input.
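Provider:
ineoApp
Summary of original information

Dataset overview

Dataset features

  • id: string
  • image: image
  • bboxes: sequence of sequences of integers (int64)
  • ner_tags: sequence of class labels; the label ids and names are: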
    • 0: O
    • 1: numero facture
    • 2: date facture
    • 3: date limite
    • 4: montant ht
    • 5: montant ttc
    • 6: tva
    • 7: prix tva
    • 8: reference
    • 9: Devise
    • 10: Condition de paiement
    • 11: Mode de paiement
    • 12: vendeur
    • 13: adresse vendeur
    • 14: informations vendeur
    • 15: ice vendeur
    • 16: rc vendeur
    • 17: if vendeur
    • 18: patente vendeur
    • 19: acheteur
    • 20: adresse acheteur
    • 21: informations acheteur
    • 22: ice acheteur
    • 23: art1 Article
    • ... (remaining labels omitted)
  • tokens: sequence of strings
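
The label ids follow a regular pattern: ids 0 to 22 are invoice-level fields, and ids 23 to 134 repeat the same eight line-item fields for art1 through art14, as listed in the dataset_info metadata. The sketch below rebuilds that mapping and uses it to group tokens by label. The function names and the sample tokens are illustrative assumptions, not part of the dataset.

```python
# Invoice-level labels, ids 0-22, copied from the feature list.
HEADER_LABELS = [
    "O", "numero facture", "date facture", "date limite", "montant ht",
    "montant ttc", "tva", "prix tva", "reference", "Devise",
    "Condition de paiement", "Mode de paiement", "vendeur", "adresse vendeur",
    "informations vendeur", "ice vendeur", "rc vendeur", "if vendeur",
    "patente vendeur", "acheteur", "adresse acheteur", "informations acheteur",
    "ice acheteur",
]

# Line-item fields, repeated for art1..art14 (ids 23-134).
ITEM_FIELDS = [
    "Article", "designation", "quantite", "unite",
    "prix unit", "montant ht", "taux de remise", "tva",
]


def build_label_names(num_items: int = 14) -> list[str]:
    """Return label names in id order: header labels, then each item's fields."""
    names = list(HEADER_LABELS)
    for item in range(1, num_items + 1):
        names.extend(f"art{item} {field}" for field in ITEM_FIELDS)
    return names


ID2LABEL = dict(enumerate(build_label_names()))
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def group_tokens_by_label(tokens, ner_tags):
    """Join tokens that share a label, skipping the 'O' class (id 0)."""
    fields = {}
    for token, tag in zip(tokens, ner_tags):
        if tag != 0:
            fields.setdefault(ID2LABEL[tag], []).append(token)
    return {name: " ".join(parts) for name, parts in fields.items()}


# Hypothetical tokens and tags, for illustration only.
print(group_tokens_by_label(["Facture", "FA-001", "1200", "MAD"], [0, 1, 5, 9]))
# {'numero facture': 'FA-001', 'montant ttc': '1200', 'Devise': 'MAD'}
```

Mappings like `ID2LABEL` and `LABEL2ID` are what token-classification training code typically expects; when the dataset is loaded through a library that exposes the class names, prefer those over a rebuilt list.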

Dataset splits

  • train: 423 examples, about 492 MB
  • test: 106 examples, about 123 MB
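
As a quick check on the split sizes, the example counts work out to roughly an 80/20 train/test split. A minimal sketch of the arithmetic (the helper name is an illustrative assumption):

```python
# Example counts per split, taken from the listing.
SPLIT_SIZES = {"train": 423, "test": 106}


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


fractions = split_fractions(SPLIT_SIZES)
for name, frac in fractions.items():
    print(f"{name}: {SPLIT_SIZES[name]} examples ({frac:.1%})")
# train: 423 examples (80.0%)
# test: 106 examples (20.0%)
```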

Dataset size

  • Download size: 585 MB
  • Total dataset size: 615 MB
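
The rounded sizes can be cross-checked against the exact byte counts in the dataset_info metadata. The sketch below assumes decimal megabytes (1 MB = 1,000,000 bytes), which appears to be what the listing uses, with values truncated rather than rounded.

```python
import math

# Exact values from the dataset_info metadata.
TRAIN_BYTES = 492051241.2589792
TEST_BYTES = 123303620.7410208
DATASET_SIZE = 615354862.0
DOWNLOAD_SIZE = 585614365


def to_mb(num_bytes):
    """Convert bytes to decimal megabytes (assumed unit of the listing)."""
    return num_bytes / 1_000_000


# The two split sizes add up to the reported dataset size.
splits_total = TRAIN_BYTES + TEST_BYTES
print(math.isclose(splits_total, DATASET_SIZE, rel_tol=1e-9))

for label, value in [
    ("train", TRAIN_BYTES),
    ("test", TEST_BYTES),
    ("dataset", DATASET_SIZE),
    ("download", DOWNLOAD_SIZE),
]:
    print(f"{label}: {to_mb(value):.1f} MB")
```

The download size is smaller than the dataset size, which is consistent with the files being stored in a compressed form.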

Data file configuration

  • Under the default config, the data file paths for the training and test sets are:
    • train: data/train-*
    • test: data/test-*
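
Because the default config maps each split to its own file pattern, the splits can be requested by name. The following is a minimal sketch using the Hugging Face `datasets` library; it assumes the repository id matches the path in the download link and that the dataset is still publicly available. The function name is illustrative.

```python
# Repository id taken from the download link (assumed to resolve on the Hub).
REPO_ID = "ineoApp/data-2024-06-04"
SPLITS = ("train", "test")


def load_invoice_splits(repo_id: str = REPO_ID):
    """Load both splits of the default config; needs `pip install datasets`."""
    # Imported lazily so this module can be imported without the dependency.
    from datasets import load_dataset

    return {split: load_dataset(repo_id, split=split) for split in SPLITS}


# Usage (downloads roughly 585 MB on first call):
#   splits = load_invoice_splits()
#   example = splits["train"][0]
#   example["tokens"], example["bboxes"], example["ner_tags"]
#   label_names = splits["train"].features["ner_tags"].feature.names
```

The library resolves the `data/train-*` and `data/test-*` patterns itself, so the individual file names do not need to be known in advance.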