
DynaSent

Source: arXiv. Updated: 2020-12-31. Indexed: 2024-06-21.
Download link:
https://github.com/cgpotts/dynasent
Overview:
DynaSent is a dynamic benchmark dataset for English sentiment analysis created at Stanford University. It contains 121,634 sentences, each validated by five crowdworkers, combining naturally occurring sentences with sentences written on the Dynabench platform. The benchmark is meant to grow over successive rounds as new models, modeling objectives, and adversarial attacks are introduced. Its development and test splits are designed so that even the best models available when the dataset was built perform at chance, which leaves room for stronger approaches. DynaSent pays particular attention to the semantic coherence of the Neutral category, and its authors advocate training models from scratch in each round rather than fine-tuning successively.
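
The description says each sentence was checked by five crowdworkers. One common way to reduce five responses to a single label is a majority vote, sketched below. This is an illustrative assumption, not DynaSent's documented procedure: the function name `majority_label`, the three-vote threshold, and the label strings are all hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(responses: Sequence[str], min_votes: int = 3) -> Optional[str]:
    """Return the label chosen by at least `min_votes` validators, else None.

    Hypothetical helper: the threshold and label names are assumptions
    made for illustration only.
    """
    if not responses:
        return None
    # most_common(1) yields the single most frequent label and its count.
    label, count = Counter(responses).most_common(1)[0]
    return label if count >= min_votes else None


# Three of five validators agree, so a majority label exists.
agreed = majority_label(["positive", "positive", "neutral", "positive", "negative"])

# No label reaches three votes, so the example gets no majority label.
split = majority_label(["positive", "positive", "negative", "negative", "neutral"])

print(agreed, split)
```

With five responses and a threshold of three, at most one label can reach the threshold, so checking only the most frequent label is sufficient.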
Provided by:
Stanford University
Created:
2020-12-31