numeric_fused_head

Source: OpenCSG. Updated 2024-07-19. Indexed 2026-01-19.
Download link:
https://opencsg.com/datasets/AIWizards/numeric_fused_head?tab=summary
Description:
The Numeric Fused Heads repository covers the recognition and parsing of numeric fused heads in English text. It is offered at two scales: one with 100,000 to 1,000,000 samples and one with 1,000 to 10,000 samples. The dataset is released under the MIT License. Tokens are produced with the spaCy tokenizer. For the recognition task, each record provides a start index, an end index, and a label. For the parsing task, records also provide line indices, head information, speakers, and anchor indices. The data comes from crowdsourced, expert-generated, and machine-generated sources and is split into training, test, and validation sets.
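The recognition fields described above (tokens, start index, end index, label) can be illustrated with a small sketch. The field names `tokens`, `start_index`, `end_index`, and `label`, the end-exclusive span convention, and the sample sentence are all assumptions made for illustration; they are not taken from the dataset files, so check the actual schema before relying on them.

```python
from typing import List, TypedDict


class RecognitionRecord(TypedDict):
    # Hypothetical field names; the real column names may differ.
    tokens: List[str]   # spaCy-style tokens of one sentence
    start_index: int    # first token of the numeric span
    end_index: int      # assumed to be exclusive
    label: int          # assumed: 1 if the span is a fused head, else 0


def numeric_span(record: RecognitionRecord) -> List[str]:
    """Return the tokens covered by [start_index, end_index)."""
    start, end = record["start_index"], record["end_index"]
    if not 0 <= start < end <= len(record["tokens"]):
        raise ValueError("span indices fall outside the token list")
    return record["tokens"][start:end]


# Made-up example record, not drawn from the dataset.
example: RecognitionRecord = {
    "tokens": ["I", "bought", "two", "apples", "and", "ate", "one", "."],
    "start_index": 6,
    "end_index": 7,
    "label": 1,
}

print(numeric_span(example))
```

If the real data uses inclusive end indices, the slice would need to be `tokens[start:end + 1]` instead.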
Provider:
AIWizards
Created:
2024-07-19