
CEIMaT2021 Dataset

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7149187
Description:
This package contains the dataset generated in the research published in the paper: Almudena Sánchez Ruíz, Daniel Galan, Ángel García-Beltrán and Javier Rodríguez-Vidal, "Detecting topics and polarity from Twitter a university faculty case". The dataset is available for research purposes.

The CEIMaT2021 Dataset contains data from Twitter (http://www.twitter.com/), where anyone can express their opinion in short, publicly shared messages. Specifically, it includes tweets related to the Escuela Técnica Superior de Ingenieros Industriales of the Universidad Politécnica de Madrid (ETSII-UPM).

To build the dataset, we automatically extracted all existing tweets written in Spanish that refer to ETSII-UPM. To do this, we considered four groups:
1. Tweets published by the user @industrialesupm, the official Twitter account of ETSII-UPM.
2. Tweets mentioning @industrialesupm, regardless of who published them.
3. Tweets including the hashtag #etsii.
4. Tweets including the hashtag #industrialesupm.

In total, we extracted 18971 tweets, but not all of them were strictly related to ETSII-UPM or written in Spanish. We therefore filtered the collection and deleted the unrelated tweets. After this process, we obtained a final set of 11014 tweets, which make up the CEIMaT2021 Dataset.

Three experts related to ETSII-UPM annotated each tweet by polarity and topic:
Polarity: POSITIVE, NEUTRAL, NEGATIVE.
Topic: EVENTS, EXAMS, COMPUTING, TEACHING AND RESEARCH, INSTITUTION, SERVICES, INFRASTRUCTURE, OTHER.
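
The four collection groups described above can be illustrated with a small Python sketch. This is not the authors' extraction code, and the description does not say how the tweets were retrieved or how the files are laid out. The function name `collection_groups` and its `author` and `text` parameters are hypothetical, chosen only to show how a single tweet could be checked against the four criteria.

```python
import re

# Official account named in the dataset description.
OFFICIAL_ACCOUNT = "industrialesupm"

# Assumption: mentions and hashtags are matched case-insensitively and as
# whole tokens, so "#etsiiupm" does not count as "#etsii".
MENTION_RE = re.compile(r"(?<!\w)@industrialesupm\b", re.IGNORECASE)
HASHTAG_RE = re.compile(r"(?<!\w)#(etsii|industrialesupm)\b", re.IGNORECASE)


def collection_groups(author: str, text: str) -> set:
    """Return the collection groups (1 to 4) that a tweet falls into.

    `author` and `text` are hypothetical field names; the real dataset
    may store this information differently.
    """
    groups = set()
    # Group 1: published by the official account.
    if author.lstrip("@").lower() == OFFICIAL_ACCOUNT:
        groups.add(1)
    # Group 2: mentions the official account, whoever wrote the tweet.
    if MENTION_RE.search(text):
        groups.add(2)
    # Groups 3 and 4: contains #etsii or #industrialesupm.
    for match in HASHTAG_RE.finditer(text):
        groups.add(3 if match.group(1).lower() == "etsii" else 4)
    return groups
```

A tweet is a collection candidate when the returned set is non-empty. The later relevance and language filtering that reduced 18971 tweets to 11014 is not covered by this sketch.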

Creation date:
2023-10-17