albinandersson/Nordisk-Familjebok-Headword-Extraction-Dataset
Source: Hugging Face. Updated 2024-12-10. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/albinandersson/Nordisk-Familjebok-Headword-Extraction-Dataset
Description:
This dataset contains pairs of extracted text and their corresponding headwords from all editions of Nordisk Familjebok. Each entry has a `text` field and a `headword` field. If a text has a corresponding headword, the `headword` field is populated; otherwise it contains an empty string. The dataset is designed for training and evaluating headword extraction models. It is divided into a training set and a test set: the training set contains unverified text-headword pairs, while the test set contains manually verified entries to ensure higher accuracy and quality.
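As a rough illustration of the record layout described above, the sketch below separates entries that carry a headword from those with an empty `headword` string, and scores a list of predictions by exact match. The field names `text` and `headword` come from the description; the helper names, the exact-match metric, and the sample records are assumptions made up for this example and are not taken from the dataset itself.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

# One record, following the two fields named in the description.
Record = Dict[str, str]


def split_by_headword(records: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Partition records into (has headword, empty headword)."""
    with_headword: List[Record] = []
    without_headword: List[Record] = []
    for record in records:
        # An empty string marks a text that has no corresponding headword.
        if record["headword"]:
            with_headword.append(record)
        else:
            without_headword.append(record)
    return with_headword, without_headword


def exact_match_accuracy(records: Sequence[Record], predictions: Sequence[str]) -> float:
    """Share of records whose predicted headword equals the reference exactly.

    Exact match is an assumed metric chosen for this sketch; an empty
    prediction counts as correct when the reference headword is empty.
    """
    if len(records) != len(predictions):
        raise ValueError("records and predictions must have the same length")
    if not records:
        return 0.0
    correct = sum(
        1 for record, predicted in zip(records, predictions)
        if record["headword"] == predicted
    )
    return correct / len(records)


# Hypothetical records, invented for illustration only.
SAMPLE: List[Record] = [
    {"text": "Abborre, zool., en sötvattensfisk.", "headword": "Abborre"},
    {"text": "och förekommer allmänt i sjöar.", "headword": ""},
    {"text": "Lund, stad i Skåne.", "headword": "Lund"},
]

if __name__ == "__main__":
    with_hw, without_hw = split_by_headword(SAMPLE)
    print(len(with_hw), "records with a headword,", len(without_hw), "without")
    print("accuracy:", exact_match_accuracy(SAMPLE, ["Abborre", "", "Lunds"]))
```

The same functions should work on any iterable of dictionaries with these two keys, for example rows loaded from the download link above, provided the rows expose `text` and `headword` as plain strings.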
Provider:
albinandersson



