Siki-77/yelp_3classes

Hugging Face, updated 2024-07-11, indexed 2024-07-13
Download link:
https://hf-mirror.com/datasets/Siki-77/yelp_3classes
Description:

This dataset contains review data from Yelp, with each review labeled for sentiment (negative, neutral, positive). The dataset is divided into a training set with 650,000 reviews and a test set with 50,000 reviews. Label 0 indicates negative (from 0 and 1-star ratings), 1 indicates neutral (from 2-star ratings), and 2 indicates positive (from 3 and 4-star ratings).
Provider:
Siki-77
Summary of original information

Dataset overview

Dataset configuration

  • config_name: default

    • features:
      • label: int64
      • t_label: string
      • text: string
    • splits:
      • train:
        • num_bytes: 491481554
        • num_examples: 650000
      • test:
        • num_bytes: 37861188
        • num_examples: 50000
    • download_size: 323230667
    • dataset_size: 529342742
  • config_name: test

    • features:
      • label: int64
      • t_label: string
      • text: string
    • splits:
      • train:
        • num_bytes: 37861188
        • num_examples: 50000
    • download_size: 23535516
    • dataset_size: 37861188
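
The split metadata above can be cross-checked with a few lines of Python. The sketch below copies the byte and example counts from the `default` configuration and confirms that the per-split sizes add up to the reported `dataset_size`. The loader functions assume that the Hugging Face `datasets` library is installed and that the repository id matches the page title (`Siki-77/yelp_3classes`); they are illustrative and are not called here.

```python
# Split metadata copied from the `default` configuration listing.
DEFAULT_SPLITS = {
    "train": {"num_bytes": 491_481_554, "num_examples": 650_000},
    "test": {"num_bytes": 37_861_188, "num_examples": 50_000},
}
DEFAULT_DATASET_SIZE = 529_342_742


def total_bytes(splits: dict) -> int:
    """Sum the uncompressed byte counts of all splits."""
    return sum(split["num_bytes"] for split in splits.values())


def total_examples(splits: dict) -> int:
    """Sum the number of examples across all splits."""
    return sum(split["num_examples"] for split in splits.values())


def load_default():
    """Load the `default` config (assumes `datasets` is installed and the hub is reachable)."""
    from datasets import load_dataset  # imported lazily so the checks above run offline

    return load_dataset("Siki-77/yelp_3classes")


def load_test_config():
    """Load the separate `test` config, which exposes its 50,000 rows as a `train` split."""
    from datasets import load_dataset

    return load_dataset("Siki-77/yelp_3classes", "test")


if __name__ == "__main__":
    print(total_bytes(DEFAULT_SPLITS) == DEFAULT_DATASET_SIZE)
    print(total_examples(DEFAULT_SPLITS))
```

The `test` configuration reports the same byte and example counts as the `test` split of `default`, which suggests it is the same data under a different configuration name; the listing does not state this explicitly.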

Data file paths

  • config_name: default
    • data_files:
      • split: train
        • path: data/train-*
      • split: test
        • path: data/test-*
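
The `data_files` entries are glob patterns, so each split is made up of every file whose path matches its pattern. As a rough illustration, the sketch below uses Python's `fnmatch` to assign a file path to a split. The shard file names are hypothetical (the listing does not give actual file names), and the hub resolves patterns with its own logic, so this only demonstrates the idea.

```python
from fnmatch import fnmatch
from typing import Optional

# Patterns copied from the `data_files` listing of the `default` config.
DATA_FILES = {
    "train": "data/train-*",
    "test": "data/test-*",
}


def split_for(path: str) -> Optional[str]:
    """Return the split whose glob pattern matches `path`, or None if none does."""
    for split, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return split
    return None


if __name__ == "__main__":
    # Hypothetical shard names, used only for illustration.
    for name in ("data/train-00000-of-00002.parquet", "data/test-00000-of-00001.parquet"):
        print(name, "->", split_for(name))
```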

Label definitions

  • 0: Negative (from star ratings 0 and 1)
  • 1: Neutral (from star rating 2)
  • 2: Positive (from star ratings 3 and 4)
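
The label definitions above amount to a small lookup from the original rating to one of three classes. A minimal sketch of that mapping follows. It assumes the original ratings are encoded as integers 0 through 4, as the listing states; the function name `to_three_classes` is hypothetical and not part of the dataset.

```python
# Class names as given in the label definitions.
LABEL_NAMES = {0: "Negative", 1: "Neutral", 2: "Positive"}


def to_three_classes(star: int) -> int:
    """Map an original rating in 0..4 to the three-class label."""
    if star in (0, 1):
        return 0  # Negative
    if star == 2:
        return 1  # Neutral
    if star in (3, 4):
        return 2  # Positive
    raise ValueError(f"rating must be in 0..4, got {star!r}")


if __name__ == "__main__":
    for star in range(5):
        label = to_three_classes(star)
        print(star, "->", label, LABEL_NAMES[label])
```

Because two of the five ratings fall into each of the Negative and Positive classes and only one into Neutral, the resulting classes are unlikely to be balanced, which may matter when choosing evaluation metrics.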

Example

{
  "label": 0,
  "t_label": "Negative",
  "text": "dr. goldberg offers everything i look for in a general practitioner. hes nice and easy to talk to without being patronizing; hes always on time in seeing his patients; hes affiliated with a top-notch hospital (nyu) which my parents have explained to me is very important in case something happens and you need surgery; and you can get referrals to see specialists without having to see him first. really, what more do you need? im sitting here trying to think of any complaints i have about him, but im really drawing a blank."
}
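
A record of this shape can be checked against the schema listed in the configuration (`label` as an integer, `t_label` and `text` as strings). The sketch below parses a JSON record and verifies that `t_label` agrees with `label`. The function name `validate_record` and the short placeholder record are hypothetical and used only for illustration.

```python
import json

# Class names as given in the label definitions.
LABEL_NAMES = {0: "Negative", 1: "Neutral", 2: "Positive"}


def validate_record(raw: str) -> dict:
    """Parse one JSON record and check it against the listed features."""
    record = json.loads(raw)
    label = record["label"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(label, bool) or not isinstance(label, int):
        raise TypeError("label must be an integer")
    if label not in LABEL_NAMES:
        raise ValueError(f"unknown label: {label}")
    if record["t_label"] != LABEL_NAMES[label]:
        raise ValueError("t_label does not match label")
    if not isinstance(record["text"], str):
        raise TypeError("text must be a string")
    return record


if __name__ == "__main__":
    # Placeholder record, not taken from the dataset.
    sample = '{"label": 2, "t_label": "Positive", "text": "placeholder review text"}'
    print(validate_record(sample)["t_label"])
```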

Data source

  • Source: Yelp
  • Citation: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).