dvquys/restaurant-reviews-public-sources
Source: Hugging Face. Updated 2024-07-10. Indexed 2024-07-06.
Download link:
https://hf-mirror.com/datasets/dvquys/restaurant-reviews-public-sources
Description:
This dataset supports the task of identifying which aspects of a restaurant are mentioned in a review, where an aspect carries both an entity category (FOOD, AMBIENCE, ...) and the sentiment attached to it. It provides train, validation, and test splits with 1590, 398, and 10 samples respectively. Each sample has the features id, text, Comments, tokens, and ner_tags, where ner_tags labels the entity categories and sentiments. The input texts come from the SemEval dataset; the labels for the train and val splits were generated by prompting Llama3, while the test split was curated manually.
Provider:
dvquys
Summary of the original dataset card
Restaurant Reviews Parsing NER Aspects
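Dataset overview
The dataset is used to identify the aspects mentioned in restaurant reviews, including entities (such as food and ambience) and their associated sentiments.
Dataset features
- id: string
- text: large string
- Comments: sequence of strings
- tokens: sequence of strings
- ner_tags: sequence of named-entity-recognition class labels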
  - Label classes:
    - 0: O
    - 1: B-AMBIENCE
    - 2: I-AMBIENCE
    - 3: B-BEVERAGE
    - 4: I-BEVERAGE
    - 5: B-FOOD
    - 6: I-FOOD
    - 7: B-LOCATION
    - 8: I-LOCATION
    - 9: B-OVERALL
    - 10: I-OVERALL
    - 11: B-PRICE
    - 12: I-PRICE
    - 13: B-SERVICE
    - 14: I-SERVICE
    - 15: B-STAFF
    - 16: I-STAFF
    - 17: B-VALUE
    - 18: I-VALUE
    - 19: B-VIEW
    - 20: I-VIEW
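The integer ids in ner_tags follow the usual BIO scheme: a B- tag opens a span, I- tags of the same category continue it, and O marks tokens outside any span. A minimal sketch of decoding one sample into (category, phrase) pairs is given below. The function name `decode_spans` and the example sentence are illustrative assumptions and do not come from the dataset; only the label order is taken from the list above.

```python
from typing import List, Tuple

# Label names in id order, copied from the label list above.
LABEL_NAMES = [
    "O",
    "B-AMBIENCE", "I-AMBIENCE",
    "B-BEVERAGE", "I-BEVERAGE",
    "B-FOOD", "I-FOOD",
    "B-LOCATION", "I-LOCATION",
    "B-OVERALL", "I-OVERALL",
    "B-PRICE", "I-PRICE",
    "B-SERVICE", "I-SERVICE",
    "B-STAFF", "I-STAFF",
    "B-VALUE", "I-VALUE",
    "B-VIEW", "I-VIEW",
]


def decode_spans(tokens: List[str], tag_ids: List[int]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (category, phrase) pairs (hypothetical helper)."""
    spans: List[Tuple[str, str]] = []
    current_cat = None
    current_tokens: List[str] = []
    for token, tag_id in zip(tokens, tag_ids):
        prefix, _, cat = LABEL_NAMES[tag_id].partition("-")
        if prefix == "I" and cat == current_cat:
            # Continuation of the span that is currently open.
            current_tokens.append(token)
            continue
        if current_cat is not None:
            # Any other tag closes the open span.
            spans.append((current_cat, " ".join(current_tokens)))
        if prefix in ("B", "I"):
            # A stray I- tag without a preceding B- is treated as a span start.
            current_cat, current_tokens = cat, [token]
        else:
            current_cat, current_tokens = None, []
    if current_cat is not None:
        spans.append((current_cat, " ".join(current_tokens)))
    return spans


# Made-up example sentence, not a row from the dataset.
example_tokens = ["The", "pasta", "dish", "was", "great", "but", "waiters", "were", "slow"]
example_tags = [0, 5, 6, 0, 0, 0, 15, 0, 0]
example_spans = decode_spans(example_tokens, example_tags)
```

Treating a stray I- tag as a span start is a design choice that may suit the train and val splits, whose labels were generated by prompting Llama3 and so might not always be well-formed BIO sequences.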
Dataset splits
- train:
  - Number of samples: 1590
  - Data size: 675122 bytes
- val:
  - Number of samples: 398
  - Data size: 163216 bytes
- test:
  - Number of samples: 10
  - Data size: 4680 bytes
Dataset configuration
- default:
  - train: data/train-*
  - val: data/val-*
  - test: data/test-*
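With a single default configuration and three named splits, the dataset can presumably be loaded through the Hugging Face `datasets` library. The sketch below assumes that the library is installed, that network access is available, and that the repository is still published under the id shown in the title. The helper names `load_splits` and `row_count_mismatches` are hypothetical.

```python
DATASET_ID = "dvquys/restaurant-reviews-public-sources"

# Sample counts per split as reported in this listing.
EXPECTED_ROWS = {"train": 1590, "val": 398, "test": 10}


def load_splits(dataset_id: str = DATASET_ID):
    """Download every split; needs `pip install datasets` and network access."""
    # Imported lazily so the rest of this sketch works without the library.
    from datasets import load_dataset

    return load_dataset(dataset_id)


def row_count_mismatches(row_counts: dict) -> dict:
    """Return {split: (expected, actual)} for splits that differ from the listing."""
    return {
        split: (expected, row_counts.get(split))
        for split, expected in EXPECTED_ROWS.items()
        if row_counts.get(split) != expected
    }
```

A possible usage is `splits = load_splits()` followed by `row_count_mismatches({name: ds.num_rows for name, ds in splits.items()})`, which returns an empty dict when the downloaded splits match the counts listed here.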
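Task category
- Named entity recognition (token-classification)
Label generation
- train and val splits: generated by prompting Llama3
- test split: curated manually
Other information
- Download size: 318714 bytes
- Total dataset size: 843018 bytes
- Dataset name: Restaurant Reviews Parsing NER Aspects
- Dataset scale: 1K < n < 10K
- Language: English
- Tags: travel, restaurant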
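The size figures in this listing can be cross-checked against each other with simple arithmetic: the per-split byte counts add up to the reported total, and the combined sample count falls inside the stated 1K < n < 10K range. A short sketch using only numbers quoted on this page (the variable names are illustrative):

```python
# Figures copied from the split and size entries in this listing.
SPLIT_BYTES = {"train": 675122, "val": 163216, "test": 4680}
SPLIT_ROWS = {"train": 1590, "val": 398, "test": 10}
REPORTED_TOTAL_BYTES = 843018

total_bytes = sum(SPLIT_BYTES.values())
total_rows = sum(SPLIT_ROWS.values())

# Fraction of all samples that belongs to each split.
row_share = {split: rows / total_rows for split, rows in SPLIT_ROWS.items()}

sizes_consistent = total_bytes == REPORTED_TOTAL_BYTES
scale_consistent = 1_000 < total_rows < 10_000
```

Both checks pass for the figures above, with 1998 samples in total. The test split holds only 10 of them, so metrics computed on it alone will be coarse.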



