M-ABSA
M-ABSA Dataset Overview
Basic Information
- Name: M-ABSA (Multilingual Dataset for Aspect-Based Sentiment Analysis)
- Task type: multilingual aspect-based sentiment analysis (Multilingual ABSA) and triplet extraction
- Paper: arXiv:2502.11824
Data Contents
- Domain coverage: 7 domains

      domains = ["coursera", "hotel", "laptop", "restaurant", "phone", "sight", "food"]

- Language coverage: 21 languages

      langs = ["ar", "da", "de", "en", "es", "fr", "hi", "hr", "id", "ja", "ko", "nl", "pt", "ru", "sk", "sv", "sw", "th", "tr", "vi", "zh"]

- Annotation format: triplets of the form

      [aspect term, aspect category, sentiment polarity]

- Data splits: training, validation, and test sets
- Data example:

      This coffee brews up a nice medium roast with exotic floral and berry notes .####[[coffee, food quality, positive]]
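The data example above pairs a sentence with its label list, separated by `####`. A minimal parsing sketch follows, assuming every line has this layout and that no field contains a comma or square bracket. The function name `parse_line` is hypothetical, and the quote stripping is a precaution in case the distributed files quote their fields, which the example does not show.

```python
import re

def parse_line(line):
    """Split 'sentence####[[a, c, p], ...]' into (sentence, list of triplets)."""
    sentence, _, label_part = line.strip().partition("####")
    triplets = []
    # Each innermost [...] group is treated as one triplet (assumption:
    # fields are comma-separated and contain no commas or brackets themselves).
    for group in re.findall(r"\[([^\[\]]*)\]", label_part):
        fields = [f.strip().strip("'\"") for f in group.split(",")]
        if len(fields) == 3:
            triplets.append(tuple(fields))
    return sentence.strip(), triplets

example = ("This coffee brews up a nice medium roast with exotic floral "
           "and berry notes .####[[coffee, food quality, positive]]")
sentence, triplets = parse_line(example)
print(sentence)
print(triplets)
```

Groups that do not split into exactly three fields are skipped rather than raising an error, so malformed lines are easy to miss; a stricter loader might raise instead.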
Experimental Setup
Baseline Model
- Recommended environment:
  - transformers==4.0.0
  - sentencepiece==0.1.91
  - pytorch_lightning==0.8.1
- Model requirement: download the mT5-base model (https://huggingface.co/google/mt5-base)
- Task parameters:
  - tasd: triplet extraction
  - uabsa: (aspect term, sentiment polarity) pair extraction
- Run example:

      python main.py --task tasd \
          --dataset hotel \
          --model_name_or_path mt5-base \
          --paradigm extraction \
          --n_gpu 0 \
          --do_train \
          --do_direct_eval \
          --train_batch_size 16 \
          --gradient_accumulation_steps 2 \
          --eval_batch_size 16 \
          --learning_rate 3e-4 \
          --num_train_epochs 5
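To repeat the baseline run for every domain, the command can be generated programmatically. The sketch below only builds and prints the command lines. It assumes `--dataset` accepts each of the seven domain names (the source shows only `hotel`), and the helper name `baseline_command` is hypothetical.

```python
import shlex

DOMAINS = ["coursera", "hotel", "laptop", "restaurant", "phone", "sight", "food"]

def baseline_command(domain, task="tasd"):
    """Return the argument list for one baseline run, mirroring the run example."""
    return [
        "python", "main.py",
        "--task", task,
        "--dataset", domain,  # assumption: every domain name is a valid value
        "--model_name_or_path", "mt5-base",
        "--paradigm", "extraction",
        "--n_gpu", "0",
        "--do_train", "--do_direct_eval",
        "--train_batch_size", "16",
        "--gradient_accumulation_steps", "2",
        "--eval_batch_size", "16",
        "--learning_rate", "3e-4",
        "--num_train_epochs", "5",
    ]

commands = [baseline_command(d) for d in DOMAINS]
for cmd in commands:
    # Printing only; pass cmd to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```

With a train batch size of 16 and 2 gradient accumulation steps, each optimizer step would cover 16 × 2 = 32 examples per device.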
LLM Evaluation
- Supported models: gemma, llama, mistral, qwen
- Run example:

      python {model}_{task}.py --test_lang "en" --type "food"
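The source does not state how model predictions are scored. A common convention for triplet extraction is exact-match micro-F1, where a predicted triplet counts as correct only if all three fields match a gold triplet for the same sentence. The sketch below assumes that convention; `triplet_f1` is a hypothetical helper, not a function from the M-ABSA code.

```python
def triplet_f1(gold, pred):
    """Exact-match micro precision, recall and F1 over per-sentence triplet lists."""
    n_gold = n_pred = n_hit = 0
    for g, p in zip(gold, pred):
        g_set, p_set = set(g), set(p)  # duplicates within a sentence count once
        n_gold += len(g_set)
        n_pred += len(p_set)
        n_hit += len(g_set & p_set)
    precision = n_hit / n_pred if n_pred else 0.0
    recall = n_hit / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# One sentence: the gold triplet is recovered, plus one spurious prediction.
gold = [[("coffee", "food quality", "positive")]]
pred = [[("coffee", "food quality", "positive"),
         ("roast", "food quality", "positive")]]
precision, recall, f1 = triplet_f1(gold, pred)
print(precision, recall, round(f1, 4))
```

For the `uabsa` task the same function could be applied to (aspect term, sentiment polarity) pairs, since it only compares tuples for equality.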
Citation

    @misc{wu2025mabsa,
      title={M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis},
      author={Chengyan Wu and Bolei Ma and Yihong Liu and Zheyu Zhang and Ningyuan Deng and Yanshu Li and Baolan Chen and Yi Zhang and Barbara Plank and Yun Xue},
      year={2025},
      eprint={2502.11824},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.11824},
    }




