GIL-UNAM/negation_twitter_mexican_spanish

Source: Hugging Face. Updated 2023-05-17; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/GIL-UNAM/negation_twitter_mexican_spanish
Resource overview:
# :notebook: Negation and Sentiment Detection on Mexican Spanish Tweets: The T-MexNeg Corpus

In Spanish, there are three basic levels of negation: lexical, morphological, and syntactic. This corpus addresses only syntactic negation. Negative sentences express false states or the nonexistence of the action described in the sentence, and they may also change the sentiment within lexical alignments in a text.

Syntactic negation is expressed by an operator word that affects the whole sentence or a section of it. This operator is called the negation cue. Negation cues can be adverbs, prepositions, indefinite pronouns, or conjunctions. In Spanish, negation cues usually precede the verb, but they can also appear after it. The section affected by the negation cue is called the scope. The words that the cue specifically reaches, which can be verbs, nouns, or phrases, are referred to as the event. The basic requirements for a negative sentence are therefore the negation cue, the scope, and the event.

## :page_facing_up: Corpus Description

T-MexNeg is a corpus of tweets written in Mexican Spanish. It consists of 13,704 tweets, of which 4,895 contain negation structures. The corpus is the result of an analysis of sentiment and negation statements embedded in the language used on social media. This repository includes the annotation guidelines along with the corpus, which is manually annotated with labels for sentiment, negation cue, scope, and event. Twitter was the initial source of the corpus; the tweets are a random subset of a set collected from Mexican users from September 2017 to April 2019.

## :paperclip: Tags Description

Each entry in the corpus consists of a tweet with two components: the content and the sentiment tag. Within the content, the annotation identifies three main negation components: Negation Cue, Event, and Scope. It also differentiates among three types of negation cues: Simple Negation (**neg_exp**), Related Negation (**neg_rel**), and False Negation (**no_neg**).

- **neg_exp**: Negation cues that are not linked to other negation cues. The Scope and the Event are therefore directly related to this negation only.
- **neg_rel**: Negation cues that are linked to other negation cues in the sentence and depend on them. A related negation has no event or scope of its own; it is part of the scope of the main negation.
- **no_neg**: Negation cues that do not negate anything at the semantic level, as well as some abbreviations, idiomatic phrases, and discourse markers.
- **event**: The word or words that are specifically negated.
- **scope**: All words that are affected by the negation.

The general structure of an entry in the corpus presents the tags as follows:

```
<tweet>
  <polarity> 'NEGATIVE/POSITIVE/NEUTRAL' </polarity>
  <content>
    <neg_structure>
      <scope>
        <negexp class='simple/related/no_neg'> </negexp>
        <event> </event>
      </scope>
    </neg_structure>
  </content>
</tweet>
```

## :pencil: Citing

If you use the corpus, please cite it with the following BibTeX entry:

```
@Article{app11093880,
  AUTHOR = {Bel-Enguix, Gemma and Gómez-Adorno, Helena and Pimentel, Alejandro and Ojeda-Trueba, Sergio-Luis and Aguilar-Vizuet, Brian},
  TITLE = {Negation Detection on Mexican Spanish Tweets: The T-MexNeg Corpus},
  JOURNAL = {Applied Sciences},
  VOLUME = {11},
  YEAR = {2021},
  NUMBER = {9},
  ARTICLE-NUMBER = {3880},
  URL = {https://www.mdpi.com/2076-3417/11/9/3880},
  ISSN = {2076-3417},
  DOI = {10.3390/app11093880}
}
```

## Acknowledgments

This resource was funded by CONACyT project CB A1-S-27780 and by DGAPA-UNAM PAPIIT grants TA400121 and TA100520.
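Provider:
GIL-UNAM
Summary of original information

Dataset overview

Dataset name

  • T-MexNeg Corpus

Dataset description

  • Language: Mexican Spanish
  • Source: Twitter
  • Time span: September 2017 to April 2019
  • Size: 13,704 tweets in total, of which 4,895 contain negation structures
  • Content: an analysis of sentiment and negation statements, manually annotated with sentiment labels, negation cues, scopes, and events
  • Structure: each tweet consists of its content and a sentiment tag; negation cues, events, and scopes are annotated within the content

Annotation details

  • Negation cue types:
    • Simple Negation (neg_exp): a standalone negation cue, directly tied to a scope and an event
    • Related Negation (neg_rel): a negation cue that depends on another negation cue and has no scope or event of its own
    • False Negation (no_neg): a negation cue that negates nothing at the semantic level, including abbreviations, idioms, and discourse markers
  • Event: the word or words that are specifically negated
  • Scope: all words affected by the negation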
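
The annotation layout described above can be read with a standard XML parser. The following is a minimal sketch, assuming each entry is well-formed XML that follows the `<tweet>` / `<polarity>` / `<content>` / `<neg_structure>` / `<scope>` / `<negexp>` / `<event>` template. The sample tweet text, the `parse_tweet` function, and the `_flat_text` helper are hypothetical and introduced only for illustration; the released files may differ in details such as quoting or nesting.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry: the tweet text is invented for illustration; only the
# tag layout follows the structure described in the corpus documentation.
SAMPLE = """
<tweet>
  <polarity>NEGATIVE</polarity>
  <content>Hoy <neg_structure><scope><negexp class="simple">no</negexp> <event>quiero</event> salir</scope></neg_structure></content>
</tweet>
"""


def _flat_text(element):
    """Join all text inside an element and collapse runs of whitespace."""
    return " ".join("".join(element.itertext()).split())


def parse_tweet(xml_string):
    """Return polarity, plain text, and negation annotations for one entry."""
    root = ET.fromstring(xml_string.strip())
    polarity = root.find("polarity")
    content = root.find("content")
    negations = []
    for scope in root.iter("scope"):
        negations.append({
            "scope": _flat_text(scope),
            # Every cue inside the scope, with its class attribute
            # (assumed values: simple, related, no_neg).
            "cues": [(cue.get("class"), _flat_text(cue)) for cue in scope.iter("negexp")],
            "events": [_flat_text(event) for event in scope.iter("event")],
        })
    return {
        # The template shows the polarity value in single quotes; strip them if present.
        "polarity": _flat_text(polarity).strip("'") if polarity is not None else None,
        "text": _flat_text(content) if content is not None else "",
        "negations": negations,
    }


entry = parse_tweet(SAMPLE)
print(entry["polarity"], "|", entry["text"])
for negation in entry["negations"]:
    print(negation)
```

Collecting cues with `scope.iter("negexp")` keeps related negations together with the main cue, since the documentation states that a related negation sits inside the scope of the main negation rather than having a scope of its own.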

Citation

  • Authors: Bel-Enguix, Gemma et al.
  • Title: Negation Detection on Mexican Spanish Tweets: The T-MexNeg Corpus
  • Journal: Applied Sciences
  • Year: 2021
  • Volume/Issue: 11/9
  • Article number: 3880
  • DOI: 10.3390/app11093880