FrancophonIA/POPCORN
Hugging Face · Updated 2025-03-30 · Indexed 2025-04-12
Download link:
https://hf-mirror.com/datasets/FrancophonIA/POPCORN
Description:
POPCORN is a French information extraction dataset containing 400 manually annotated validation texts and 400 training texts. The texts are short and factual, written in the style of an informational report. Annotation follows a dedicated ontology, so the dataset can be used to train and evaluate information extraction models for named entity recognition, coreference resolution, and relation extraction. The annotated texts are stored in the corpus folder of the repository and split into two files, train.json and test.json, each containing 400 texts. Each text is stored as a three-key dictionary holding the text, its entities, and its relations.
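The file layout described above can be checked with a short script. The following is a minimal sketch, not the dataset authors' loader: the key names "text", "entities", and "relations" are assumed from the description, the top-level container of each file is not specified (the sketch accepts a list or a dict of records), and the helper names load_records and check_records are hypothetical.

```python
import json
from pathlib import Path

# Assumed key names, inferred from the description; verify against the real files.
EXPECTED_KEYS = {"text", "entities", "relations"}


def load_records(path):
    """Load one corpus file and return its records as a list."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    # The description does not say whether the file is a list of records
    # or a dict mapping identifiers to records, so both are accepted.
    if isinstance(data, dict):
        return list(data.values())
    return list(data)


def check_records(records):
    """Return the indices of records that are not three-key dictionaries
    with exactly the assumed keys."""
    return [
        i
        for i, rec in enumerate(records)
        if not isinstance(rec, dict) or set(rec) != EXPECTED_KEYS
    ]
```

With a local copy of the repository, load_records("corpus/train.json") would be expected to return 400 records if the description holds, and check_records would return an empty list if every record uses exactly the assumed keys.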
Provider:
FrancophonIA



