machinesonpaper/MoPE
Source: Hugging Face. Updated 2024-01-30; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/machinesonpaper/MoPE
Resource description:
---
license: apache-2.0
---
MoPE contains over 100 examples of copy-editing errors from English-language news publications. Most examples come from publications owned by The New York Times Company (NYT, Wirecutter and The Athletic), partly because the Internet has made something of a sport of playing grammar police for them, and partly because The Times has high editing standards, so the errors that sneak through are often grammatically interesting.
GPT-4 has a 54% error rate on the task of identifying the word with an error or the word closest to the error.
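
For readers who want to reproduce a figure of this kind, the sketch below shows one plausible way to compute an error rate for the word-identification task. The card does not say how GPT-4's answers were scored, so the function name `error_rate`, the case-insensitive exact-match rule, and the toy words are all assumptions made for illustration, not the maintainers' method.

```python
from typing import Sequence


def error_rate(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Return the fraction of examples where the predicted word misses.

    Assumption: each example has one gold word (the word with the error,
    or the word closest to it) and a prediction counts as correct only if
    it matches that word after trimming whitespace and lowercasing.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("at least one example is required")
    misses = sum(
        p.strip().lower() != g.strip().lower()
        for p, g in zip(predicted, gold)
    )
    return misses / len(gold)


# Toy, made-up words (not taken from MoPE): one hit and one miss.
print(error_rate(["their", "Its"], ["their", "it's"]))  # 0.5
```

Under this scoring rule, a 54% error rate would mean the model named the wrong word in slightly more than half of the examples.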
The dataset is maintained by the editors of [Machines on Paper](https://www.machinesonpaper.com/). We would love to add copy errors from other publications. You can send suggestions to hello@machinesonpaper.com.
Provided by:
machinesonpaper
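Summary of original information
MoPE dataset overview
Dataset content
- Data source: The MoPE dataset contains over 100 examples of copy-editing errors from English-language news publications. Most come from publications owned by The New York Times Company (such as NYT, Wirecutter and The Athletic).
- Data characteristics: Because The New York Times has high editing standards, the errors that slip through are often grammatically interesting.
Dataset performance
- GPT-4 error rate: On the task of identifying the word with an error or the word closest to the error, GPT-4 has an error rate of 54%.
Dataset maintenance
- Maintainer: The dataset is maintained by the editors of Machines on Paper.
- Contributions: Copy errors from other publications are welcome; suggestions can be sent by email to hello@machinesonpaper.com.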



