
ctoraman/BilTweetNews-sentiment-analysis

Source: Hugging Face. Updated 2023-11-29; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/ctoraman/BilTweetNews-sentiment-analysis
Resource description:
---
license: cc-by-nc-sa-4.0
language:
- tr
task_categories:
- text-classification
tags:
- sentiment analysis
- text classification
- tweets
- social media
- turkish
- sarcasm
- sarcastic
---

# Turkish Sentiment Analysis Tweet Dataset: BilTweetNews

The dataset contains tweets related to six major events from Turkish news sources between May 4, 2015 and Jan 8, 2017.

The dataset covers 6 major events:

- May 25, 2015: One of the popular football clubs in Turkey, Galatasaray, wins the 2015 Turkish Super League.
- Sep 6, 2015: A terrorist group, called PKK, attacked soldiers in Dağlıca, a village in southeastern Turkey.
- Oct 7, 2015: A Turkish scientist, Aziz Sancar, won the 2015 Nobel Prize in Chemistry for his studies on DNA repair.
- May 27, 2016: A local football club of Alanya was promoted to the Turkish Super League for the first time in its history.
- Jun 17, 2016: A traditional anthem that is mostly played by secularists in Turkey, called the 10th Year Anthem, was forbidden in schools by the director of national education in the Black Sea province of Bolu.
- Oct 17, 2016: A magazine programmer mistakenly believed that Madonna in a Fur Coat, a book written in 1943 by a celebrated Turkish writer, Sabahattin Ali, was about the life of the pop star Madonna. The book tells a story between a Turkish student and a German singer after World War I.
- Not related to any news topic

For each event, 100 related-candidate and 60 unrelated-candidate tweets are selected. Lastly, we randomly select 40 tweets that are potentially not related to any event; 5 of them were later removed as near-duplicates. The dataset has 995 tweets in total.

There are 4 sentiment classes:

- Positive
- Negative
- Neutral
- Sarcastic

All tweets are labeled by 17 annotators. We provide the normalized distribution of annotations across the 4 sentiment classes. We also provide the majority sentiment class in the last column. If multiple classes share the highest score, we set "Multi" as the majority.

Github Repo: https://github.com/BilkentInformationRetrievalGroup/BilTweetNews2017

# If you would like to use any material in this repository, please cite the following papers:

- Toraman, C. Early Prediction of Public Reactions to News Events Using Microblogs. Seventh BCS-IRSG Symposium on Future Directions in Information Access (FDIA 2017), Barcelona, Spain, 5 September 2017.
- Toraman, C. Event-related microblog retrieval in Turkish. Turkish Journal of Electrical Engineering and Computer Sciences. 2021. DOI: 10.3906/elk-2108-167
Provider:
ctoraman
Summary of original information

Turkish Sentiment Analysis Tweet Dataset: BilTweetNews

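Dataset overview

  • Language: Turkish
  • Task category: text classification
  • Tags: sentiment analysis, text classification, tweets, social media, Turkish, sarcasm, sarcastic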

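Dataset contents

  • Time range: May 4, 2015 to Jan 8, 2017
  • Number of events: 6 major events
  • Event details:
    • May 25, 2015: Galatasaray, a well-known Turkish football club, wins the 2015 Turkish Super League.
    • Sep 6, 2015: The terrorist group PKK attacks soldiers in Dağlıca, a village in southeastern Turkey.
    • Oct 7, 2015: The Turkish scientist Aziz Sancar wins the 2015 Nobel Prize in Chemistry for his research on DNA repair.
    • May 27, 2016: A local football club of Alanya is promoted to the Turkish Super League for the first time.
    • Jun 17, 2016: The director of national education in the Black Sea province of Bolu bans the traditional 10th Year Anthem from being played in schools.
    • Oct 17, 2016: A magazine program mistakes Madonna in a Fur Coat, a book written in 1943 by the celebrated Turkish writer Sabahattin Ali, for an account of the pop star Madonna's life.
    • Not related to any news topic
  • Number of tweets: 995 in total
    • For each event, 100 related-candidate and 60 unrelated-candidate tweets are selected
    • 40 tweets that are potentially not related at all are selected at random; 5 of them were removed after near-duplicates were detected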
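
The selection counts above can be checked against the stated total of 995 tweets. The following is a minimal sketch of that arithmetic; the constant and function names are illustrative and do not come from the dataset or its repository.

```python
# Selection counts as stated in the dataset description.
NUM_EVENTS = 6
RELATED_PER_EVENT = 100       # related-candidate tweets per event
UNRELATED_PER_EVENT = 60      # unrelated-candidate tweets per event
RANDOM_UNRELATED = 40         # tweets potentially not related to any event
REMOVED_NEAR_DUPLICATES = 5   # dropped from the 40 random tweets


def expected_total() -> int:
    """Return the tweet count implied by the selection procedure."""
    per_event = RELATED_PER_EVENT + UNRELATED_PER_EVENT
    return NUM_EVENTS * per_event + RANDOM_UNRELATED - REMOVED_NEAR_DUPLICATES


if __name__ == "__main__":
    print(expected_total())
```

The result matches the total reported in the description: 6 × 160 + 40 − 5 = 995.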

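Sentiment classes

  • Classes: Positive, Negative, Neutral, Sarcastic
  • Annotation: all tweets are labeled by 17 annotators
  • Annotation distribution: the normalized distribution over the 4 sentiment classes is provided
  • Majority sentiment class: the last column gives the majority sentiment class; if multiple classes share the highest score, it is set to "Multi"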
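
The labeling rule described above (a normalized distribution over the 4 classes, with "Multi" when the top score is tied) can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function names and the representation of annotations as a list of class-name strings are hypothetical, and the actual column layout of the released file is not specified here.

```python
from collections import Counter

# The four sentiment classes named in the dataset description.
CLASSES = ("Positive", "Negative", "Neutral", "Sarcastic")


def normalized_distribution(votes):
    """Turn a list of annotator votes into per-class fractions.

    `votes` is an assumed input format: one class name per annotator.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    unknown = set(votes) - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown classes: {sorted(unknown)}")
    counts = Counter(votes)
    return {c: counts.get(c, 0) / len(votes) for c in CLASSES}


def majority_label(distribution):
    """Return the top-scoring class, or "Multi" if the top score is tied."""
    top = max(distribution.values())
    winners = [c for c, score in distribution.items() if score == top]
    return winners[0] if len(winners) == 1 else "Multi"


if __name__ == "__main__":
    # 17 hypothetical votes for one tweet: 8 Positive, 5 Negative, 4 Neutral.
    votes = ["Positive"] * 8 + ["Negative"] * 5 + ["Neutral"] * 4
    dist = normalized_distribution(votes)
    print(dist, majority_label(dist))
```

With 17 annotators, two classes can tie for the top score (for example 6, 6 and 5 votes), which is the case the "Multi" label covers.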