besiktas/m2w-cands
Source: Hugging Face. Updated 2024-02-10; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/besiktas/m2w-cands
Resource description:
---
dataset_info:
  features:
  - name: actions
    list:
    - name: neg_candidates
      list:
      - name: after
        struct:
        - name: prob
          struct:
          - name: paddle
            sequence: float64
          - name: tesseract
            sequence: float64
        - name: text
          struct:
          - name: paddle
            sequence: string
          - name: tesseract
            sequence: string
      - name: backend_node_id
        dtype: string
      - name: before
        struct:
        - name: prob
          struct:
          - name: paddle
            sequence: float64
          - name: tesseract
            sequence: float64
        - name: text
          struct:
          - name: paddle
            sequence: string
          - name: tesseract
            sequence: string
      - name: bounding_box
        sequence: int64
      - name: cand_idx
        dtype: int64
    - name: pos_candidates
      list:
      - name: after
        struct:
        - name: prob
          struct:
          - name: paddle
            sequence: float64
          - name: tesseract
            sequence: float64
        - name: text
          struct:
          - name: paddle
            sequence: string
          - name: tesseract
            sequence: string
      - name: backend_node_id
        dtype: string
      - name: before
        struct:
        - name: prob
          struct:
          - name: paddle
            sequence: float64
          - name: tesseract
            sequence: float64
        - name: text
          struct:
          - name: paddle
            sequence: string
          - name: tesseract
            sequence: string
      - name: bounding_box
        sequence: int64
      - name: cand_idx
        dtype: int64
  - name: annotation_id
    dtype: string
  splits:
  - name: test
    num_bytes: 18695
    num_examples: 2
  - name: train
    num_bytes: 62501
    num_examples: 2
  download_size: 55576
  dataset_size: 81196
configs:
- config_name: default
  data_files:
  - split: test
    path: data/test-*
  - split: train
    path: data/train-*
---
Provider:
besiktas
Summary of original information
Dataset overview
Features
- actions
  - neg_candidates
    - after
      - prob
        - paddle: sequence of float64
        - tesseract: sequence of float64
      - text
        - paddle: sequence of string
        - tesseract: sequence of string
    - backend_node_id: string
    - before
      - prob
        - paddle: sequence of float64
        - tesseract: sequence of float64
      - text
        - paddle: sequence of string
        - tesseract: sequence of string
    - bounding_box: sequence of int64
    - cand_idx: int64
  - pos_candidates
    - after
      - prob
        - paddle: sequence of float64
        - tesseract: sequence of float64
      - text
        - paddle: sequence of string
        - tesseract: sequence of string
    - backend_node_id: string
    - before
      - prob
        - paddle: sequence of float64
        - tesseract: sequence of float64
      - text
        - paddle: sequence of string
        - tesseract: sequence of string
    - bounding_box: sequence of int64
    - cand_idx: int64
- annotation_id: string
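To make the nesting concrete, the sketch below walks one record shaped like the feature tree above and gathers the OCR strings attached to each candidate. Only the field names come from the dataset card. The record values (IDs, text, probabilities, box numbers) and the helper name `candidate_texts` are invented for illustration, and the sketch assumes records arrive as plain nested lists and dicts.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical record mirroring the declared features; every value is made up.
record: Dict[str, Any] = {
    "annotation_id": "example-annotation",
    "actions": [
        {
            "pos_candidates": [
                {
                    "backend_node_id": "101",
                    # Coordinate convention is not stated in the card.
                    "bounding_box": [10, 20, 110, 60],
                    "cand_idx": 0,
                    "before": {
                        "text": {"paddle": ["Search"], "tesseract": ["Search"]},
                        "prob": {"paddle": [0.98], "tesseract": [0.91]},
                    },
                    "after": {
                        "text": {"paddle": ["Results"], "tesseract": []},
                        "prob": {"paddle": [0.95], "tesseract": []},
                    },
                }
            ],
            "neg_candidates": [],
        }
    ],
}


def candidate_texts(
    rec: Dict[str, Any],
    kind: str = "pos_candidates",
    stage: str = "before",
    engine: str = "paddle",
) -> List[Tuple[str, List[str]]]:
    """Return (backend_node_id, OCR strings) for every candidate of one kind."""
    out = []
    for action in rec["actions"]:
        for cand in action[kind]:
            out.append((cand["backend_node_id"], list(cand[stage]["text"][engine])))
    return out


print(candidate_texts(record))
print(candidate_texts(record, stage="after", engine="tesseract"))
```

The `prob` lists can be read the same way by swapping `"text"` for `"prob"` in the lookup; the card does not say whether the two lists are aligned element by element, so treat that as something to verify on real data.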
Data splits
- test
  - Bytes: 18695
  - Examples: 2
- train
  - Bytes: 62501
  - Examples: 2
Dataset size
- Download size: 55576 bytes
- Dataset size: 81196 bytes
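The reported sizes are consistent with one another: the two split byte counts add up to the dataset size. The download size is smaller, which would be expected if the files are stored in a compressed format (an assumption; the card does not say). A quick check, with the numbers copied from the metadata above:

```python
# Figures copied from the split and size metadata (all in bytes).
split_bytes = {"test": 18695, "train": 62501}
dataset_size = 81196
download_size = 55576

# The per-split byte counts should sum to the reported dataset size.
total = sum(split_bytes.values())

# Fraction of the in-memory size that actually has to be downloaded.
ratio = download_size / dataset_size

print("splits sum to dataset_size:", total == dataset_size)
print(f"download/dataset ratio: {ratio:.2f}")
```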
Configuration
- default
  - Data files
    - test
      - Path: data/test-*
    - train
      - Path: data/train-*
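The `default` config assigns files to splits by glob pattern. The sketch below shows how such patterns partition file names, using Python's standard `fnmatch` module. The shard file names are hypothetical (the card gives only the patterns), and `split_for` is an illustrative helper, not part of any library. The commented lines at the end show how the dataset would typically be loaded with the Hugging Face `datasets` library, which needs the package installed and network access.

```python
from fnmatch import fnmatch
from typing import Optional

# Patterns taken from the default config.
DATA_FILES = {"test": "data/test-*", "train": "data/train-*"}


def split_for(path: str) -> Optional[str]:
    """Return the split whose pattern matches the path, or None if none does."""
    for split, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return split
    return None


# Hypothetical file names, for illustration only.
for name in [
    "data/train-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]:
    print(name, "->", split_for(name))

# Typical loading call (requires `pip install datasets` and network access):
# from datasets import load_dataset
# train = load_dataset("besiktas/m2w-cands", split="train")
```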



