PINEAPPLE (Personifying INanimate Entities by Acquiring Parallel Personification Data for Learning Enhanced Generation)

OpenDataLab | Updated 2026-05-31 | Indexed 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/PINEAPPLE
Resource description:
Dataset and code for PINEAPPLE: Personifying INanimate Entities by Acquiring Parallel Personification Data for Learning Enhanced Generation (COLING 22). A personification is a figure of speech that endows inanimate entities with properties and actions typically seen as requiring animacy. In this paper, we explore the task of personification generation. To this end, we propose PINEAPPLE. We curate a corpus of personifications called PersonifCorp, together with automatically generated de-personified literalizations of these personifications. We demonstrate the usefulness of this parallel corpus by training a seq2seq model to personify a given literal input. Both automatic and human evaluations show that fine-tuning with PersonifCorp leads to significant gains in personification-related qualities such as animacy and interestingness. A detailed qualitative analysis also highlights key strengths and imperfections of PINEAPPLE relative to baselines, demonstrating a strong ability to generate diverse and creative personifications that enhance the overall appeal of a sentence.
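The description above says the parallel corpus pairs each personification with a literal counterpart and that a seq2seq model is trained to map the literal input to the personified output. A minimal sketch of how such pairs could be arranged as source/target records is shown here. The `ParallelPair` class, the `to_seq2seq_examples` helper, the `source`/`target` field names, and the two example sentences are all hypothetical illustrations; they do not reflect the actual PersonifCorp file format or contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelPair:
    """One parallel example: a literal sentence and its personified counterpart."""
    literal: str
    personified: str


def to_seq2seq_examples(pairs):
    """Map each pair to a source/target record, with the literal text as input."""
    return [{"source": p.literal, "target": p.personified} for p in pairs]


# Invented sentences for illustration only; they are NOT taken from PersonifCorp.
pairs = [
    ParallelPair(
        literal="The wind blew through the trees.",
        personified="The wind whispered through the trees.",
    ),
    ParallelPair(
        literal="The old car would not start.",
        personified="The old car stubbornly refused to wake up.",
    ),
]

examples = to_seq2seq_examples(pairs)
print(examples[0]["source"])
print(examples[0]["target"])
```

Records shaped like this could then be tokenized and passed to whichever seq2seq training setup is in use; the direction (literal as source, personified as target) follows the description of personifying a given literal input.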
Provider:
OpenDataLab
Created:
2022-10-17
Dataset introduction
Background overview
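PINEAPPLE is a text dataset for personification generation. It builds PersonifCorp, a parallel corpus that pairs personifications with de-personified versions, and uses it to train seq2seq models, with the aim of making generated text more animate and interesting. The work was published at COLING 2022.
The above summary was collected and generated by 遇见数据集.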