max-long/textile_glossaries_and_pile_ner
Source: Hugging Face. Updated: 2024-12-19. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/max-long/textile_glossaries_and_pile_ner
Description:
This dataset was produced to fine-tune a Named Entity Recognition (NER) model with domain-specific knowledge of the historic textile industry of the United Kingdom around the turn of the twentieth century. It combines 2,504 terms from historic textile glossaries with 4,000 examples from the Pile-NER-type dataset, the latter included to avoid overfitting. The glossary terms are extracted from four specialist books and cover entity types such as textile manufacturing chemicals, textile dyes, and textile machinery. Each entry has tokenized_text and ner fields; some entries also include a negative field.
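As a rough illustration of how the tokenized_text, ner, and negative fields might be read, the sketch below pulls labelled spans out of one record. The listing does not describe the internal layout of these fields, so several things here are assumptions: that ner holds [start, end, label] triples with inclusive token indices (a layout commonly used for span-based NER training data), that negative is a list of entity types absent from the text, and the example record itself, which is invented for illustration and not taken from the dataset. The helper name entity_spans is hypothetical.

```python
from typing import Any


def entity_spans(record: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (entity text, label) pairs for one record.

    Assumes each item in record["ner"] is [start, end, label] with
    inclusive token indices into record["tokenized_text"]. This layout
    is an assumption, not something the dataset listing confirms.
    """
    tokens = record["tokenized_text"]
    spans = []
    for start, end, label in record["ner"]:
        # end is treated as inclusive, hence the +1 in the slice
        spans.append((" ".join(tokens[start:end + 1]), label))
    return spans


# Invented record for illustration only; not real dataset content.
example = {
    "tokenized_text": ["The", "spinning", "mule", "was", "used", "in", "mills", "."],
    "ner": [[1, 2, "textile machinery"]],
    "negative": ["textile dyes"],  # assumed: entity types absent from this text
}

print(entity_spans(example))
# Only some entries carry a negative field, so read it with a default.
print(example.get("negative", []))
```

If the real records use exclusive end indices or a different triple order, only the slice and the unpacking line in entity_spans would need to change.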
Provider:
max-long



