ManuSharma/tokenized-sentiment-analysis

Name: ManuSharma/tokenized-sentiment-analysis
Creator: ManuSharma
Published: 2024-07-07 13:24:34
License: No description available

Source: Hugging Face
Updated: 2024-07-07
Indexed: 2024-07-22

Download link:

https://hf-mirror.com/datasets/ManuSharma/tokenized-sentiment-analysis

Resource overview:

The dataset has five features: stmt (the statement text), sentiment (the sentiment label), input_ids (a sequence of input token IDs), attention_mask (an attention mask sequence), and labels (a sequence of labels). It is split into a training set of 4361 samples and a test set of 485 samples. The download size is 35919 bytes and the total dataset size is 3149900 bytes. The data files are located at data/train-* and data/test-*.
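
To work with the two splits described above, a minimal sketch follows. It assumes the Hugging Face `datasets` package is installed and that the repository is reachable under the id shown in this listing. `load_splits` and `check_split_sizes` are hypothetical helper names introduced here for illustration, not part of any library.

```python
REPO_ID = "ManuSharma/tokenized-sentiment-analysis"

# Sample counts as stated in this listing.
EXPECTED_ROWS = {"train": 4361, "test": 485}


def load_splits(repo_id=REPO_ID):
    """Download the dataset and return its splits (requires network access)."""
    # Imported lazily so the rest of this sketch works without `datasets`.
    from datasets import load_dataset

    return load_dataset(repo_id)


def check_split_sizes(num_rows, expected=EXPECTED_ROWS):
    """Return {split: (found, expected)} for every split whose count differs."""
    return {
        name: (num_rows.get(name), want)
        for name, want in expected.items()
        if num_rows.get(name) != want
    }


# Typical use, left commented out because it needs network access:
# splits = load_splits()
# found = {name: split.num_rows for name, split in splits.items()}
# print(check_split_sizes(found))
```

An empty result from `check_split_sizes` means the downloaded splits match the counts given here. A non-empty result may simply mean the dataset was updated after this listing was indexed.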

Provider:

ManuSharma

Summary of original information

Dataset overview

Features

stmt: string
sentiment: string
input_ids: sequence of integers (int32)
attention_mask: sequence of integers (int8)
labels: sequence of integers (int64)
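
The feature list can be read as a record schema. The sketch below checks one record against it, using plain Python types as stand-ins for the string, int32, int8, and int64 columns. `validate_record` is a hypothetical helper, the example record is made up, and the rule that `input_ids` and `attention_mask` have equal length is an assumption based on how tokenizer output is usually laid out, not something this listing states.

```python
from typing import Any, Dict, List

STRING_FIELDS = ("stmt", "sentiment")
SEQUENCE_FIELDS = ("input_ids", "attention_mask", "labels")


def validate_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name in STRING_FIELDS:
        if not isinstance(record.get(name), str):
            problems.append(f"{name} should be a string")
    for name in SEQUENCE_FIELDS:
        value = record.get(name)
        if not isinstance(value, list) or not all(isinstance(v, int) for v in value):
            problems.append(f"{name} should be a list of integers")
    # Assumption: the mask is aligned with input_ids, one 0/1 flag per token.
    ids, mask = record.get("input_ids"), record.get("attention_mask")
    if isinstance(ids, list) and isinstance(mask, list):
        if len(ids) != len(mask):
            problems.append("input_ids and attention_mask differ in length")
        if any(m not in (0, 1) for m in mask):
            problems.append("attention_mask should contain only 0 and 1")
    return problems


# Made-up record: the text, token IDs, and label value are illustrative only.
example = {
    "stmt": "The service was great.",
    "sentiment": "positive",
    "input_ids": [101, 1996, 2326, 2001, 2307, 102, 0, 0],
    "attention_mask": [1, 1, 1, 1, 1, 1, 0, 0],
    "labels": [1],
}
print(validate_record(example))  # prints []
```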

Data splits

train:
- Number of samples: 4361
- Size: 2834650 bytes
test:
- Number of samples: 485
- Size: 315250 bytes
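
As a quick worked check of the split figures, the sketch below computes each split's share of the samples and its average size per sample. The numbers are taken from this listing and the variable names are illustrative. Both splits come out at 650 bytes per sample on average, which would be consistent with fixed-length padded sequences, although the listing does not confirm that.

```python
# Sample counts and sizes (in bytes) as stated in the listing.
SPLITS = {
    "train": {"samples": 4361, "bytes": 2834650},
    "test": {"samples": 485, "bytes": 315250},
}

total_samples = sum(split["samples"] for split in SPLITS.values())

for name, split in SPLITS.items():
    share = split["samples"] / total_samples
    bytes_per_sample = split["bytes"] / split["samples"]
    print(f"{name}: {share:.1%} of samples, {bytes_per_sample:.1f} bytes per sample")
```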

Dataset size

Download size: 35919 bytes
Total size: 3149900 bytes
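
The size figures can be cross-checked with a few lines of arithmetic. The sketch below, using only numbers from this listing, confirms that the two split sizes add up to the total and computes the ratio of download size to total size, which is a little over 1%. The constant names are illustrative.

```python
# Sizes in bytes as stated in the listing.
SPLIT_BYTES = {"train": 2834650, "test": 315250}
DOWNLOAD_BYTES = 35919
TOTAL_BYTES = 3149900

split_sum = sum(SPLIT_BYTES.values())
ratio = DOWNLOAD_BYTES / TOTAL_BYTES

print(f"split sizes add up to total: {split_sum == TOTAL_BYTES}")
print(f"download size / total size: {ratio:.2%}")
```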

Configuration

config_name: default
- data_files:
  - train: data/train-*
  - test: data/test-*
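
The `data_files` entries are glob patterns that map file paths to splits. The sketch below illustrates that mapping with the standard-library `fnmatch` module. The file names are hypothetical, `assign_splits` is a helper invented for this example, and the pattern resolution actually performed by the hosting service may differ in detail.

```python
from fnmatch import fnmatchcase

# Split patterns as given in the default configuration.
DATA_FILES = {"train": "data/train-*", "test": "data/test-*"}


def assign_splits(paths, data_files=DATA_FILES):
    """Group file paths by the first split pattern they match."""
    grouped = {split: [] for split in data_files}
    for path in paths:
        for split, pattern in data_files.items():
            if fnmatchcase(path, pattern):
                grouped[split].append(path)
                break  # a path belongs to at most one split
    return grouped


# Hypothetical file names, for illustration only.
files = [
    "data/train-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]
print(assign_splits(files))
```

Paths that match no pattern, such as `README.md` here, are left out of every split.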
