atmallen/amazon_polarity_embeddings_random2
Source: Hugging Face | Updated: 2023-09-26 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/atmallen/amazon_polarity_embeddings_random2
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
dataset_info:
  features:
  - name: content
    dtype: string
  - name: label
    dtype:
      class_label:
        names:
          '0': neg
          '1': pos
  - name: embedding
    sequence: float32
  - name: title
    dtype: string
  splits:
  - name: train
    num_bytes: 7148364432
    num_examples: 3600000
  - name: test
    num_bytes: 19940712
    num_examples: 10000
  download_size: 3900873029
  dataset_size: 7168305144
---
# Dataset Card for "amazon_polarity_embeddings_random2"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
The amazon_polarity_embeddings_random2 dataset has a single default configuration with a train split of 3,600,000 examples and a test split of 10,000 examples. Each example has four features: content (string), label (class label, either neg or pos), embedding (sequence of float32 values), and title (string). The download size is 3,900,873,029 bytes, and the total dataset size is 7,168,305,144 bytes.
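As a rough illustration of how the fields above might be read, the following sketch streams the test split with the Hugging Face `datasets` library and summarizes one record. The helper names `describe` and `stream_test_split` are hypothetical and not part of the dataset card. The card does not state the embedding length, so the helper reports it rather than assuming a value.

```python
from typing import Any, Dict, Iterator

REPO_ID = "atmallen/amazon_polarity_embeddings_random2"

# Integer-to-name mapping taken from the class_label names in the card.
LABEL_NAMES = {0: "neg", 1: "pos"}


def describe(example: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize one record using the four fields listed in the card."""
    return {
        "title": example["title"],
        "label": LABEL_NAMES[example["label"]],
        "embedding_dim": len(example["embedding"]),
        "content_chars": len(example["content"]),
    }


def stream_test_split() -> Iterator[Dict[str, Any]]:
    """Stream the test split so the full download is not needed up front."""
    # Imported lazily so describe() works even without the library installed.
    from datasets import load_dataset

    return iter(load_dataset(REPO_ID, split="test", streaming=True))
```

Calling `describe(next(stream_test_split()))` would fetch a single record over the network and return its title, label name, embedding length, and content length. Streaming is used here only to avoid pulling the full multi-gigabyte download for a quick look.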
Provider:
atmallen
Summary of original information
Dataset overview
Configuration
- Default configuration (default)
  - Data file paths:
    - Training set (train): data/train-*
    - Test set (test): data/test-*
Dataset information
- Features:
  - content: string
  - label: class label
    - Class names: 0: negative (neg), 1: positive (pos)
  - embedding: sequence of floats (float32)
  - title: string
- Splits:
  - Training set (train)
    - Bytes: 7148364432
    - Examples: 3600000
  - Test set (test)
    - Bytes: 19940712
    - Examples: 10000
- Dataset size:
  - Download size: 3900873029 bytes
  - Total dataset size: 7168305144 bytes
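The size figures above can be cross-checked with a little arithmetic. The sketch below copies the numbers from the card metadata and derives the combined split size, the average bytes per example, and the ratio of download size to dataset size. The function names are hypothetical helpers introduced only for this illustration.

```python
# Figures copied from the dataset card metadata.
SPLITS = {
    "train": {"num_bytes": 7_148_364_432, "num_examples": 3_600_000},
    "test": {"num_bytes": 19_940_712, "num_examples": 10_000},
}
DOWNLOAD_SIZE = 3_900_873_029
DATASET_SIZE = 7_168_305_144


def total_bytes(splits):
    """Sum the byte counts of all splits."""
    return sum(s["num_bytes"] for s in splits.values())


def avg_bytes_per_example(split):
    """Average stored size of one example in a split."""
    return split["num_bytes"] / split["num_examples"]


def download_ratio(download_size, dataset_size):
    """Fraction of the dataset size that must be downloaded."""
    return download_size / dataset_size


print(total_bytes(SPLITS) == DATASET_SIZE)
print(round(avg_bytes_per_example(SPLITS["train"]), 2))
print(round(avg_bytes_per_example(SPLITS["test"]), 2))
print(round(download_ratio(DOWNLOAD_SIZE, DATASET_SIZE), 3))
```

The two split byte counts add up exactly to the reported dataset size, each example averages just under 2,000 bytes, and the download is a little over half the dataset size.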



