eduagarcia/tweetsentbr_fewshot
Dataset Overview
Basic Information
- Language: Portuguese (pt)
- Size category: 1K<n<10K
- Task category: text classification
Dataset Features
- id: integer (int64)
- sentence: string
- label: string
Dataset Splits
- Train: 75 examples, 6830 bytes
- Test: 2010 examples, 178392 bytes
Download and Dataset Size
- Download size: 117996 bytes
- Dataset size: 185222 bytes
Configuration
- Default configuration:
  - Training data path: data/train-*
  - Test data path: data/test-*
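As a rough sketch of how the configuration above could be used, the snippet below loads the default configuration with the Hugging Face `datasets` library and checks a record against the feature list. It assumes that `datasets` is installed and that the repository is reachable on the Hugging Face Hub under the ID given at the top of this card. The helper names `matches_schema` and `load_splits` are illustrative and not part of any library.

```python
# Expected columns and Python types, taken from the feature list in this card.
EXPECTED_FEATURES = {"id": int, "sentence": str, "label": str}


def matches_schema(record: dict) -> bool:
    """Return True if the record has exactly the expected columns and types."""
    if set(record) != set(EXPECTED_FEATURES):
        return False
    return all(
        isinstance(record[name], expected_type)
        for name, expected_type in EXPECTED_FEATURES.items()
    )


def load_splits(repo_id: str = "eduagarcia/tweetsentbr_fewshot"):
    """Load the default configuration (needs network access and `datasets`)."""
    # Imported lazily so that matches_schema works without the library installed.
    from datasets import load_dataset

    return load_dataset(repo_id)


# Usage (requires network access):
#   splits = load_splits()
#   print(splits["train"].num_rows, splits["test"].num_rows)
#   print(matches_schema(splits["train"][0]))
```

If the card is accurate, the two row counts printed by the usage lines should agree with the split sizes listed above (75 and 2010).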
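Dataset Description
- Content: TweetSentBR is a corpus of tweets in Brazilian Portuguese for sentiment analysis. Each tweet is annotated with one of the following three classes:
  - Positive: tweets in which the user reacts to or evaluates the subject of the post positively
  - Negative: tweets in which the user reacts to or evaluates the subject of the post negatively
  - Neutral: tweets that belong to neither of the other two classes, usually because they express no opinion or are irrelevant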
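Because `label` is stored as a string, most classifiers need an integer encoding first. The sketch below shows one possible way to do that. It assumes the label strings are spelled exactly `Positive`, `Negative` and `Neutral` as in the class list above. The integer order and the names `LABEL_TO_ID`, `encode_labels` and `label_counts` are arbitrary choices for illustration, and the toy sentences are made up rather than taken from the corpus.

```python
from collections import Counter

# Assumed label spellings, copied from the class list in this card.
# The integer ids are an arbitrary choice, not defined by the dataset.
LABEL_TO_ID = {"Negative": 0, "Neutral": 1, "Positive": 2}


def encode_labels(records):
    """Map each record's string label to an integer id; fail loudly on unknown labels."""
    encoded = []
    for record in records:
        label = record["label"]
        if label not in LABEL_TO_ID:
            raise ValueError(f"unexpected label: {label!r}")
        encoded.append(LABEL_TO_ID[label])
    return encoded


def label_counts(records):
    """Count how many records carry each label."""
    return Counter(record["label"] for record in records)


# Toy records with made-up sentences, only to show the call pattern.
toy = [
    {"id": 0, "sentence": "adorei o programa", "label": "Positive"},
    {"id": 1, "sentence": "que episódio ruim", "label": "Negative"},
    {"id": 2, "sentence": "começa às 20h", "label": "Neutral"},
]
print(encode_labels(toy))  # [2, 0, 1]
print(label_counts(toy))
```

Raising on an unknown label, rather than silently skipping it, makes a spelling mismatch between this assumption and the real data visible on the first call.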
Citation Information
@InProceedings{BRUM18.389,
  author = {Henrico Brum and Maria das Gra\c{c}as Volpe Nunes},
  title = "{Building a Sentiment Corpus of Tweets in Brazilian Portuguese}",
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {May 7-12, 2018},
  address = {Miyazaki, Japan},
  editor = {Nicoletta Calzolari (Conference chair) and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and H{\'e}l{\`e}ne Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis and Takenobu Tokunaga},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {979-10-95546-00-9},
  language = {english}
}



