
AlexSham/imdb_filtered

Source: Hugging Face. Updated 2024-03-24. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/AlexSham/imdb_filtered
Description:
https://archive.ics.uci.edu/dataset/331/sentiment+labelled+sentences

This dataset was created for the paper 'From Group to Individual Labels using Deep Features', Kotzias et al., KDD 2015. Please cite the paper if you want to use it :)

It contains sentences labelled with positive or negative sentiment, extracted from reviews of products, movies, and restaurants.

=======
Format:
=======
sentence \t score \n

=======
Details:
=======
Score is either 1 (for positive) or 0 (for negative).

The sentences come from three different websites/fields:

imdb.com
amazon.com
yelp.com

For each website, there exist 500 positive and 500 negative sentences. Those were selected randomly from larger datasets of reviews. We attempted to select sentences that have a clearly positive or negative connotation; the goal was for no neutral sentences to be selected.

For the full datasets look:

imdb: Maas et al., 2011, 'Learning word vectors for sentiment analysis'
amazon: McAuley et al., 2013, 'Hidden factors and hidden topics: Understanding rating dimensions with review text'
yelp: Yelp dataset challenge, http://www.yelp.com/dataset_challenge
Provider:
AlexSham
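Summary of original information

Dataset overview

Data sources

  • The dataset contains sentences extracted from the following three websites:
    • imdb.com
    • amazon.com
    • yelp.com

Data content

  • Each website contributes 500 positive and 500 negative sentences.
  • The sentences were selected at random from larger review datasets.
  • Sentences were chosen to have a clearly positive or negative connotation; the aim was to exclude neutral sentences.

Labels

  • A label of 1 marks a positive sentence; a label of 0 marks a negative one.

Data format

  • Each line has the form: sentence \t score \n (the sentence and its score are separated by a tab).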
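
The line format above can be read with a few lines of Python. The following is a minimal sketch, assuming the raw text is already in memory as a string. The function name `parse_labelled_sentences` and the two sample sentences are hypothetical illustrations and are not taken from the dataset.

```python
from collections import Counter


def parse_labelled_sentences(text):
    """Parse lines of the form 'sentence<TAB>score' into (sentence, label) pairs."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the last tab so a tab inside the sentence does not break parsing.
        sentence, score = line.rsplit("\t", 1)
        label = int(score)
        if label not in (0, 1):
            raise ValueError(f"unexpected label: {score!r}")
        pairs.append((sentence.strip(), label))
    return pairs


# Hypothetical sample in the documented format (not real dataset rows).
sample = "A wonderful little film.\t1\nThe plot made no sense at all.\t0\n"
pairs = parse_labelled_sentences(sample)

# Count labels per class; per the description, a full per-website file
# should be balanced between the two classes.
counts = Counter(label for _, label in pairs)
print(pairs)
print(counts)
```

Splitting on the last tab rather than the first is a defensive choice: the score is always the final field, so any stray tab inside a sentence stays part of the sentence.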