goosmanlei/amazon_reviews_multi
Source: Hugging Face. Updated 2026-03-18; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/goosmanlei/amazon_reviews_multi
Description:
---
pretty_name: Amazon Reviews Multi
language:
- de
- en
- es
- fr
- ja
- zh
license: other
task_categories:
- text-classification
size_categories:
- 1M<n<10M
configs:
- config_name: all_languages
  data_files:
  - split: train
    path:
    - de/train.jsonl.gz
    - en/train.jsonl.gz
    - es/train.jsonl.gz
    - fr/train.jsonl.gz
    - ja/train.jsonl.gz
    - zh/train.jsonl.gz
  - split: validation
    path:
    - de/validation.jsonl.gz
    - en/validation.jsonl.gz
    - es/validation.jsonl.gz
    - fr/validation.jsonl.gz
    - ja/validation.jsonl.gz
    - zh/validation.jsonl.gz
  - split: test
    path:
    - de/test.jsonl.gz
    - en/test.jsonl.gz
    - es/test.jsonl.gz
    - fr/test.jsonl.gz
    - ja/test.jsonl.gz
    - zh/test.jsonl.gz
- config_name: de
  data_files:
  - split: train
    path: de/train.jsonl.gz
  - split: validation
    path: de/validation.jsonl.gz
  - split: test
    path: de/test.jsonl.gz
- config_name: en
  data_files:
  - split: train
    path: en/train.jsonl.gz
  - split: validation
    path: en/validation.jsonl.gz
  - split: test
    path: en/test.jsonl.gz
- config_name: es
  data_files:
  - split: train
    path: es/train.jsonl.gz
  - split: validation
    path: es/validation.jsonl.gz
  - split: test
    path: es/test.jsonl.gz
- config_name: fr
  data_files:
  - split: train
    path: fr/train.jsonl.gz
  - split: validation
    path: fr/validation.jsonl.gz
  - split: test
    path: fr/test.jsonl.gz
- config_name: ja
  data_files:
  - split: train
    path: ja/train.jsonl.gz
  - split: validation
    path: ja/validation.jsonl.gz
  - split: test
    path: ja/test.jsonl.gz
- config_name: zh
  data_files:
  - split: train
    path: zh/train.jsonl.gz
  - split: validation
    path: zh/validation.jsonl.gz
  - split: test
    path: zh/test.jsonl.gz
---
# Amazon Reviews Multi (Data Files Version)
This dataset hosts the multilingual Amazon Reviews corpus as raw `jsonl.gz` data files, so it can be loaded directly with the `datasets` library without a dataset loading script.
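
A minimal loading sketch, assuming the `datasets` library is installed and the Hub (or a mirror) is reachable. The helper names `data_files_for` and `load_reviews` are hypothetical and not part of this repository; the config names and file paths are taken from the `configs` section of this card.

```python
LANGUAGES = ("de", "en", "es", "fr", "ja", "zh")
SPLITS = ("train", "validation", "test")
REPO_ID = "goosmanlei/amazon_reviews_multi"


def data_files_for(config_name):
    """Return the split -> path mapping that this card declares for a config.

    Hypothetical helper: it only mirrors the YAML `configs` section so you
    can see which files a config name resolves to.
    """
    if config_name == "all_languages":
        # The combined config lists one file per language for every split.
        return {s: [f"{lang}/{s}.jsonl.gz" for lang in LANGUAGES] for s in SPLITS}
    if config_name not in LANGUAGES:
        raise ValueError(f"unknown config: {config_name!r}")
    return {s: f"{config_name}/{s}.jsonl.gz" for s in SPLITS}


def load_reviews(config_name="en", split="train"):
    """Load one config and split from the Hub (needs network access)."""
    data_files_for(config_name)  # raises early on an unknown config name
    from datasets import load_dataset  # third-party; imported lazily

    return load_dataset(REPO_ID, config_name, split=split)


if __name__ == "__main__":
    print(data_files_for("de"))
```

Calling `load_reviews("all_languages", "validation")` would request the combined validation split; pass a single language code to load only that language.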
## Source
- Original dataset name: `amazon_reviews_multi`
- Original dataset card: https://huggingface.co/datasets/amazon_reviews_multi
- Mirror used for raw file retrieval: https://huggingface.co/datasets/buruzaemon/amazon_reviews_multi
## Features
Each record contains:
- `review_id`
- `product_id`
- `reviewer_id`
- `stars`
- `review_body`
- `review_title`
- `language`
- `product_category`
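
If you download a file directly instead of going through `datasets`, the records can be read with the standard library. This is a sketch that assumes each line of a `jsonl.gz` file is one UTF-8 JSON object; the function names are hypothetical, and the check covers only the presence of the fields listed above, since this card does not state their types.

```python
import gzip
import json

EXPECTED_FIELDS = (
    "review_id",
    "product_id",
    "reviewer_id",
    "stars",
    "review_body",
    "review_title",
    "language",
    "product_category",
)


def parse_jsonl(lines):
    """Yield one dict per non-empty line of JSON Lines text."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def iter_reviews(path):
    """Stream records from a local gzip-compressed JSON Lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from parse_jsonl(handle)


def missing_fields(record):
    """Return the expected field names that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]
```

For example, `next(iter_reviews("en/test.jsonl.gz"))` would return the first record of a locally downloaded English test file, and `missing_fields(record)` should return an empty list for a well-formed record.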
## Splits
Number of examples per split, for each language (`de`, `en`, `es`, `fr`, `ja`, `zh`):
- `train`: 200,000
- `validation`: 5,000
- `test`: 5,000
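
The per-language counts above imply the sizes of the `all_languages` config. The short calculation below works them out and confirms that the total falls inside the `1M<n<10M` size category declared in the card metadata.

```python
LANGUAGES = ("de", "en", "es", "fr", "ja", "zh")
SPLIT_SIZES = {"train": 200_000, "validation": 5_000, "test": 5_000}

# Examples per language across all three splits.
per_language = sum(SPLIT_SIZES.values())

# Each split of `all_languages` concatenates the six per-language files.
all_languages = {split: n * len(LANGUAGES) for split, n in SPLIT_SIZES.items()}
total = sum(all_languages.values())

print(per_language)   # 210000
print(all_languages)  # {'train': 1200000, 'validation': 30000, 'test': 30000}
print(total)          # 1260000
```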
## License
Please refer to the original Amazon Reviews Multi dataset terms and conditions.
Provider:
goosmanlei



