albinandersson/Nordisk-Familjebok-Category-Classification-Dataset
Source: Hugging Face. Updated: 2024-12-10. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/albinandersson/Nordisk-Familjebok-Category-Classification-Dataset
Description:
This dataset contains 6,000 manually annotated text entries from Nordisk Familjebok, each classified into one of three categories: locations, persons, and other miscellaneous entries. It is split into a training set of 5,100 entries and a test set of 900 entries, distributed as two JSONL files, train_set.jsonl and test_set.jsonl. Each entry has two fields: definition and type, where type is the category label (0: Other, 1: Location, 2: Person). The category distribution for both sets is also provided with the dataset. The dataset is licensed under CC BY-NC-SA 4.0, which allows non-commercial use, modification, and sharing with appropriate credit.
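As a minimal sketch of how the files described above might be read, the following Python parses JSONL lines with the definition and type fields and maps the numeric label to its category name. The helper names (parse_jsonl, label_counts) and the sample rows are hypothetical and exist only for illustration; the sample rows are not taken from the dataset.

```python
import json
from collections import Counter

# Label mapping as stated in the dataset description.
LABELS = {0: "Other", 1: "Location", 2: "Person"}


def parse_jsonl(lines):
    """Parse an iterable of JSONL lines into (definition, label_name) pairs."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        records.append((row["definition"], LABELS[int(row["type"])]))
    return records


def label_counts(records):
    """Count how many records fall into each category."""
    return Counter(label for _, label in records)


# Made-up sample rows in the assumed format (NOT real dataset entries).
sample = [
    '{"definition": "Stockholm, Sveriges hufvudstad.", "type": 1}',
    '{"definition": "Linné, Carl von, svensk naturforskare.", "type": 2}',
    '{"definition": "Abakus, räknebräde.", "type": 0}',
]

records = parse_jsonl(sample)
counts = label_counts(records)

# To read a downloaded file instead, assuming it sits in the working directory:
# with open("train_set.jsonl", encoding="utf-8") as f:
#     records = parse_jsonl(f)
```

Running label_counts on the parsed training and test files would give a quick check against the category distribution provided with the dataset.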
Provider:
albinandersson



