CATIE-AQ/termith-eval_fr_prompt_keywords_extraction
Source: Hugging Face. Updated 2025-02-10; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/CATIE-AQ/termith-eval_fr_prompt_keywords_extraction
Overview:
The termith-eval_fr_prompt_keywords_extraction dataset is a subset of the Dataset of French Prompts (DFP). It contains 8,295 rows and targets the keyword extraction task. The source data comes from the termith-eval dataset; 21 prompts were applied to it to build the input and target columns in the format of the xP3 dataset. Each prompt asks for the important keywords of a text, and the prompts are written in three forms: the indicative, the informal second person singular, and the formal second person plural. The dataset has a training split only, with no validation or test split.
Provided by:
CATIE-AQ
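Summary of original information
termith-eval_fr_prompt_keywords_extraction dataset overview
Basic information
- Language: French
- License: CC-BY-4.0
- Size category: 10K<n<100K
- Task category: text generation
- Tags: keyword extraction, DFP, French prompts
- Annotation creators: found
- Language creators: found
- Multilinguality: monolingual
- Source dataset: taln-ls2n/termith-eval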
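Dataset details
- Name: termith-eval_fr_prompt_keywords_extraction
- Origin: subset of the Dataset of French Prompts (DFP)
- Rows: 8,295
- Task: keyword extraction
- Original data: from the termith-eval dataset
- Prompt list: 21 prompts, written in the indicative, second person singular, and second person plural forms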
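The prompt-based construction described above can be sketched as follows. This is a minimal illustration, not the code used to build DFP: the three templates are hypothetical stand-ins for the 21 real prompts, the column names `inputs` and `targets` are assumed from the xP3 convention, and joining keywords with a comma is also an assumption.

```python
from typing import Dict, List

# Hypothetical templates, for illustration only. The 21 prompts actually
# used in DFP are not reproduced here.
TEMPLATES = [
    "Extraire les mots-clés importants du texte suivant : {text}",  # impersonal form
    "Extrais les mots-clés importants du texte suivant : {text}",   # second person singular (tu)
    "Extrayez les mots-clés importants du texte suivant : {text}",  # second person plural (vous)
]


def build_rows(text: str, keywords: List[str], templates: List[str]) -> List[Dict[str, str]]:
    """Return one prompted row per template for a single source document."""
    # Assumption: the target is the reference keywords joined by ", ".
    target = ", ".join(keywords)
    return [{"inputs": t.format(text=text), "targets": target} for t in templates]


# Made-up source document and keywords, used only to exercise the function.
rows = build_rows(
    "Nous présentons une archive numérique d'articles de recherche.",
    ["archive numérique", "TALN"],
    TEMPLATES,
)
for row in rows:
    print(row["inputs"][:40], "->", row["targets"])
```

Each source document yields one row per prompt. The reported row count is divisible by the prompt count (8,295 = 21 × 395), which is consistent with every prompt being applied to 395 source documents, although the card does not state that figure.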
Data splits
- train: 8,295 samples
- validation: none
- test: none
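Since only a training split is provided, users who need held-out data have to carve it out themselves. The sketch below shows one way to do that in plain Python, assuming each row is a dict; the function name `split_train_validation`, the 10% fraction, and the placeholder rows are all illustrative choices rather than anything prescribed by the dataset.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_validation(
    rows: Sequence[T], validation_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Randomly split rows into (train, validation) with a fixed seed."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # deterministic for a given seed
    n_validation = int(len(rows) * validation_fraction)
    validation_indices = set(indices[:n_validation])
    # Train rows keep their original order; validation rows follow the shuffle.
    train = [rows[i] for i in range(len(rows)) if i not in validation_indices]
    validation = [rows[i] for i in indices[:n_validation]]
    return train, validation


# Placeholder rows standing in for the 8,295 real samples.
rows = [{"inputs": f"prompt {i}", "targets": f"keywords {i}"} for i in range(8295)]
train_rows, validation_rows = split_train_validation(rows, 0.1, seed=42)
print(len(train_rows), len(validation_rows))
```

With the `datasets` library, `Dataset.train_test_split` offers a similar row-level split. Because the same source text appears under several prompts, a row-level split can place one document on both sides; grouping rows by source text before splitting avoids that overlap.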
Usage

    from datasets import load_dataset

    dataset = load_dataset("CATIE-AQ/termith-eval_fr_prompt_keywords_extraction")
Citation
Original data
- (Boudin, 2013) Florian Boudin. 2013. TALN Archives : a digital archive of French research articles in Natural Language Processing (TALN Archives : une archive numérique francophone des articles de recherche en Traitement Automatique de la Langue) [in French]. In Proceedings of TALN 2013 (Volume 2: Short Papers), pages 507–514, Les Sables d’Olonne, France. ATALA.
- (Boudin and Gallina, 2021) Florian Boudin and Ygor Gallina. 2021. Redefining Absent Keyphrases and their Effect on Retrieval Effectiveness. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4185–4193, Online. Association for Computational Linguistics.
This dataset
@misc {centre_aquitain_des_technologies_de_linformation_et_electroniques_2023,
author = { {Centre Aquitain des Technologies de l'Information et Electroniques} },
title = { DFP (Revision 1d24c09) },
year = 2023,
url = { https://huggingface.co/datasets/CATIE-AQ/DFP },
doi = { 10.57967/hf/1200 },
publisher = { Hugging Face }
}
License
CC-BY-4.0



