WikiIns
Source: arXiv | Updated: 2023-10-08 | Indexed: 2024-06-21
Download link:
https://github.com/CasparSwift/WikiIns
Description:
WikiIns is a high-quality text editing dataset created by the Wangxuan Institute of Computer Technology at Peking University. It focuses on controlled text editing guided by natural language instructions. The dataset contains 6,060 samples drawn from the Wikipedia edit history database and built through preprocessing and manual annotation. By providing more informative natural language instructions, WikiIns aims to address the limited informativeness of existing text editing datasets. Its applications include text simplification, sentence fusion, and grammatical error correction, where precise instructions are intended to improve the performance and applicability of text editing models.
Provided by:
Wangxuan Institute of Computer Technology, Peking University
Created:
2023-10-08



