
max-long/textile_glossaries_and_pile_ner

Source: Hugging Face. Updated: 2024-12-19. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/max-long/textile_glossaries_and_pile_ner
Summary:
This dataset was produced for fine-tuning a Named Entity Recognition (NER) model with domain-specific knowledge of the historic textile industry of the United Kingdom around the turn of the twentieth century. It combines 2,504 terms from historic textile glossaries with 4,000 examples from the Pile-NER-type dataset, the latter included to avoid overfitting. The glossary terms are extracted from four specialist books and cover entity types such as textile manufacturing chemicals, textile dyes, and textile machinery. Each entry has tokenized_text and ner fields; some entries also include a negative field.
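The listing names the tokenized_text and ner fields but does not spell out their layout. The sketch below is a minimal illustration under stated assumptions: it assumes each ner entry is a [start, end, label] triple of token indices with an inclusive end, and that negative holds entity-type names. Neither assumption is confirmed by this page. The record and the helper name spans_to_bio are hypothetical, invented for illustration rather than taken from the dataset.

```python
from typing import Dict, List


def spans_to_bio(record: Dict) -> List[str]:
    """Convert token-level spans to BIO tags.

    ASSUMPTION: each item in record["ner"] is [start, end, label],
    where start and end are token indices and end is inclusive.
    """
    tags = ["O"] * len(record["tokenized_text"])
    for start, end, label in record["ner"]:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end + 1):
            tags[i] = f"I-{label}"
    return tags


# Hypothetical record, invented for illustration; not a real dataset entry.
example = {
    "tokenized_text": ["The", "spinning", "mule", "was", "used", "in", "Lancashire", "."],
    "ner": [[1, 2, "textile machinery"]],
    # ASSUMPTION: "negative" lists entity types, when the field is present.
    "negative": ["textile dyes"],
}

bio_tags = spans_to_bio(example)
for token, tag in zip(example["tokenized_text"], bio_tags):
    print(f"{token}\t{tag}")
```

If the actual entries use a different span convention (for example an exclusive end index), the inner loop bounds would need adjusting, so it is worth inspecting a few real records before relying on this conversion.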
Provider:
max-long