frankier/multiscale_rt_critics_subsets

Name: frankier/multiscale_rt_critics_subsets
Creator: frankier
Published: 2023-10-04 06:16:28
License: No description available

Hugging Face · Updated 2023-10-04 · Indexed 2024-03-04

Download link:

https://hf-mirror.com/datasets/frankier/multiscale_rt_critics_subsets

Resource description:

---
dataset_info:
- config_name: multiscale_rt_critics
  features:
  - name: movie_title
    dtype: string
  - name: publisher_name
    dtype: string
  - name: critic_name
    dtype: string
  - name: text
    dtype: string
  - name: review_score
    dtype: string
  - name: grade_type
    dtype: string
  - name: orig_num
    dtype: float32
  - name: orig_denom
    dtype: float32
  - name: includes_zero
    dtype: bool
  - name: label
    dtype: uint8
  - name: scale_points
    dtype: uint8
  - name: multiplier
    dtype: uint8
  - name: task_ids
    dtype: uint32
  splits:
  - name: train
    num_bytes: 4951005
    num_examples: 23182
  - name: test
    num_bytes: 1644530
    num_examples: 7745
  - name: validation
    num_bytes: 1646302
    num_examples: 7731
  download_size: 0
  dataset_size: 8241837
- config_name: rt_critics_big_irregular_5
  features:
  - name: movie_title
    dtype: string
  - name: publisher_name
    dtype: string
  - name: critic_name
    dtype: string
  - name: text
    dtype: string
  - name: review_score
    dtype: string
  - name: grade_type
    dtype: string
  - name: orig_num
    dtype: float32
  - name: orig_denom
    dtype: float32
  - name: includes_zero
    dtype: bool
  - name: label
    dtype: uint8
  - name: scale_points
    dtype: uint8
  - name: multiplier
    dtype: uint8
  - name: task_ids
    dtype: uint32
  - name: orig_group_id
    dtype: uint32
  splits:
  - name: train
    num_bytes: 2336759
    num_examples: 10312
  - name: test
    num_bytes: 781228
    num_examples: 3441
  - name: validation
    num_bytes: 779150
    num_examples: 3438
  download_size: 1927630
  dataset_size: 3897137
- config_name: rt_critics_by_critic_1000pl
  features:
  - name: movie_title
    dtype: string
  - name: publisher_name
    dtype: string
  - name: critic_name
    dtype: string
  - name: text
    dtype: string
  - name: review_score
    dtype: string
  - name: grade_type
    dtype: string
  - name: orig_num
    dtype: float32
  - name: orig_denom
    dtype: float32
  - name: includes_zero
    dtype: bool
  - name: label
    dtype: uint8
  - name: scale_points
    dtype: uint8
  - name: multiplier
    dtype: uint8
  - name: task_ids
    dtype: uint32
  - name: orig_group_id
    dtype: uint32
  splits:
  - name: train
    num_bytes: 27083039
    num_examples: 124055
  - name: test
    num_bytes: 9049344
    num_examples: 41406
  - name: validation
    num_bytes: 9026209
    num_examples: 41368
  download_size: 22594175
  dataset_size: 45158592
- config_name: rt_critics_by_critic_500pl
  features:
  - name: movie_title
    dtype: string
  - name: publisher_name
    dtype: string
  - name: critic_name
    dtype: string
  - name: text
    dtype: string
  - name: review_score
    dtype: string
  - name: grade_type
    dtype: string
  - name: orig_num
    dtype: float32
  - name: orig_denom
    dtype: float32
  - name: includes_zero
    dtype: bool
  - name: label
    dtype: uint8
  - name: scale_points
    dtype: uint8
  - name: multiplier
    dtype: uint8
  - name: task_ids
    dtype: uint32
  - name: orig_group_id
    dtype: uint32
  splits:
  - name: train
    num_bytes: 41656780
    num_examples: 189382
  - name: test
    num_bytes: 13929707
    num_examples: 63263
  - name: validation
    num_bytes: 13917936
    num_examples: 63157
  download_size: 35087274
  dataset_size: 69504423
- config_name: rt_critics_one
  features:
  - name: movie_title
    dtype: string
  - name: publisher_name
    dtype: string
  - name: critic_name
    dtype: string
  - name: text
    dtype: string
  - name: review_score
    dtype: string
  - name: grade_type
    dtype: string
  - name: orig_num
    dtype: float32
  - name: orig_denom
    dtype: float32
  - name: includes_zero
    dtype: bool
  - name: label
    dtype: uint8
  - name: scale_points
    dtype: uint8
  - name: multiplier
    dtype: uint8
  splits:
  - name: train
    num_bytes: 988767
    num_examples: 4606
  - name: test
    num_bytes: 327725
    num_examples: 1536
  - name: validation
    num_bytes: 327038
    num_examples: 1536
  download_size: 951057
  dataset_size: 1643530
configs:
- config_name: multiscale_rt_critics
  data_files:
  - split: train
    path: multiscale_rt_critics/train-*
  - split: test
    path: multiscale_rt_critics/test-*
  - split: validation
    path: multiscale_rt_critics/validation-*
- config_name: rt_critics_big_irregular_5
  data_files:
  - split: train
    path: rt_critics_big_irregular_5/train-*
  - split: test
    path: rt_critics_big_irregular_5/test-*
  - split: validation
    path: rt_critics_big_irregular_5/validation-*
- config_name: rt_critics_by_critic_1000pl
  data_files:
  - split: train
    path: rt_critics_by_critic_1000pl/train-*
  - split: test
    path: rt_critics_by_critic_1000pl/test-*
  - split: validation
    path: rt_critics_by_critic_1000pl/validation-*
- config_name: rt_critics_by_critic_500pl
  data_files:
  - split: train
    path: rt_critics_by_critic_500pl/train-*
  - split: test
    path: rt_critics_by_critic_500pl/test-*
  - split: validation
    path: rt_critics_by_critic_500pl/validation-*
- config_name: rt_critics_one
  data_files:
  - split: train
    path: rt_critics_one/train-*
  - split: test
    path: rt_critics_one/test-*
  - split: validation
    path: rt_critics_one/validation-*
---

# Dataset Card for "multiscale_rt_critics_subsets"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
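The card gives no usage instructions. As a minimal sketch, one configuration could be loaded with the Hugging Face `datasets` library, assuming that library is installed and the repository is still reachable. The helper name `load_config` is hypothetical; the repository ID, config names, and split names are taken from the card metadata.

```python
REPO_ID = "frankier/multiscale_rt_critics_subsets"

# Config and split names as listed in the dataset card metadata.
CONFIG_NAMES = (
    "multiscale_rt_critics",
    "rt_critics_big_irregular_5",
    "rt_critics_by_critic_1000pl",
    "rt_critics_by_critic_500pl",
    "rt_critics_one",
)
SPLIT_NAMES = ("train", "test", "validation")


def load_config(config_name: str, split: str = "train"):
    """Load one split of one configuration.

    Hypothetical helper for illustration; it needs network access
    and the `datasets` package.
    """
    if config_name not in CONFIG_NAMES:
        raise ValueError(f"unknown config: {config_name!r}")
    if split not in SPLIT_NAMES:
        raise ValueError(f"unknown split: {split!r}")
    # Imported lazily so the constants above work without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split=split)
```

For example, `load_config("rt_critics_one", "validation")` should return 1536 rows if the hosted files still match the card.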

Provided by:

frankier

Original information summary

Dataset overview

Dataset configurations

1. multiscale_rt_critics

Features:
- movie_title: string
- publisher_name: string
- critic_name: string
- text: string
- review_score: string
- grade_type: string
- orig_num: float (float32)
- orig_denom: float (float32)
- includes_zero: boolean
- label: unsigned integer (uint8)
- scale_points: unsigned integer (uint8)
- multiplier: unsigned integer (uint8)
- task_ids: unsigned integer (uint32)
Splits:
- train: 4951005 bytes, 23182 examples
- test: 1644530 bytes, 7745 examples
- validation: 1646302 bytes, 7731 examples
Data size:
- Download size: 0 bytes
- Dataset size: 8241837 bytes
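The split figures for this configuration can be sanity-checked with a little arithmetic. The sketch below copies the numbers listed for `multiscale_rt_critics`, sums the per-split byte counts, and derives each split's share of the examples. The names `SPLITS`, `DATASET_SIZE`, and `split_fractions` are illustrative, not part of the dataset.

```python
# Figures copied from the multiscale_rt_critics entry.
SPLITS = {
    "train": {"num_bytes": 4951005, "num_examples": 23182},
    "test": {"num_bytes": 1644530, "num_examples": 7745},
    "validation": {"num_bytes": 1646302, "num_examples": 7731},
}
DATASET_SIZE = 8241837  # reported dataset size in bytes


def split_fractions(splits):
    """Return each split's share of the total number of examples."""
    total = sum(s["num_examples"] for s in splits.values())
    return {name: s["num_examples"] / total for name, s in splits.items()}


total_bytes = sum(s["num_bytes"] for s in SPLITS.values())
total_examples = sum(s["num_examples"] for s in SPLITS.values())
fractions = split_fractions(SPLITS)

# The reported dataset size equals the sum of the split sizes.
print(total_bytes == DATASET_SIZE)
for name, share in fractions.items():
    print(f"{name}: {share:.1%}")
```

The byte counts add up to the reported dataset size, and the 38658 examples are divided roughly 60/20/20 between train, test, and validation.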

2. rt_critics_big_irregular_5

Features:
- movie_title: string
- publisher_name: string
- critic_name: string
- text: string
- review_score: string
- grade_type: string
- orig_num: float (float32)
- orig_denom: float (float32)
- includes_zero: boolean
- label: unsigned integer (uint8)
- scale_points: unsigned integer (uint8)
- multiplier: unsigned integer (uint8)
- task_ids: unsigned integer (uint32)
- orig_group_id: unsigned integer (uint32)
Splits:
- train: 2336759 bytes, 10312 examples
- test: 781228 bytes, 3441 examples
- validation: 779150 bytes, 3438 examples
Data size:
- Download size: 1927630 bytes
- Dataset size: 3897137 bytes

3. rt_critics_by_critic_1000pl

Features:
- movie_title: string
- publisher_name: string
- critic_name: string
- text: string
- review_score: string
- grade_type: string
- orig_num: float (float32)
- orig_denom: float (float32)
- includes_zero: boolean
- label: unsigned integer (uint8)
- scale_points: unsigned integer (uint8)
- multiplier: unsigned integer (uint8)
- task_ids: unsigned integer (uint32)
- orig_group_id: unsigned integer (uint32)
Splits:
- train: 27083039 bytes, 124055 examples
- test: 9049344 bytes, 41406 examples
- validation: 9026209 bytes, 41368 examples
Data size:
- Download size: 22594175 bytes
- Dataset size: 45158592 bytes

4. rt_critics_by_critic_500pl

Features:
- movie_title: string
- publisher_name: string
- critic_name: string
- text: string
- review_score: string
- grade_type: string
- orig_num: float (float32)
- orig_denom: float (float32)
- includes_zero: boolean
- label: unsigned integer (uint8)
- scale_points: unsigned integer (uint8)
- multiplier: unsigned integer (uint8)
- task_ids: unsigned integer (uint32)
- orig_group_id: unsigned integer (uint32)
Splits:
- train: 41656780 bytes, 189382 examples
- test: 13929707 bytes, 63263 examples
- validation: 13917936 bytes, 63157 examples
Data size:
- Download size: 35087274 bytes
- Dataset size: 69504423 bytes

5. rt_critics_one

Features:
- movie_title: string
- publisher_name: string
- critic_name: string
- text: string
- review_score: string
- grade_type: string
- orig_num: float (float32)
- orig_denom: float (float32)
- includes_zero: boolean
- label: unsigned integer (uint8)
- scale_points: unsigned integer (uint8)
- multiplier: unsigned integer (uint8)
Splits:
- train: 988767 bytes, 4606 examples
- test: 327725 bytes, 1536 examples
- validation: 327038 bytes, 1536 examples
Data size:
- Download size: 951057 bytes
- Dataset size: 1643530 bytes
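The five configurations share most of their columns and differ only in two ID fields. The sketch below encodes the feature lists from the entries above and reports which columns each configuration adds to the shared set. The names `BASE_COLUMNS`, `FEATURES`, and `extra_columns` are illustrative, and the code makes no assumption about what the columns mean.

```python
# The 12 columns that appear in every configuration.
BASE_COLUMNS = (
    "movie_title", "publisher_name", "critic_name", "text",
    "review_score", "grade_type", "orig_num", "orig_denom",
    "includes_zero", "label", "scale_points", "multiplier",
)

# Per-config column lists, transcribed from the feature lists above.
FEATURES = {
    "multiscale_rt_critics": BASE_COLUMNS + ("task_ids",),
    "rt_critics_big_irregular_5": BASE_COLUMNS + ("task_ids", "orig_group_id"),
    "rt_critics_by_critic_1000pl": BASE_COLUMNS + ("task_ids", "orig_group_id"),
    "rt_critics_by_critic_500pl": BASE_COLUMNS + ("task_ids", "orig_group_id"),
    "rt_critics_one": BASE_COLUMNS,
}


def extra_columns(features, base):
    """Map each config name to the columns it has beyond the shared base."""
    base_set = set(base)
    return {
        name: [col for col in cols if col not in base_set]
        for name, cols in features.items()
    }


extras = extra_columns(FEATURES, BASE_COLUMNS)
for name, cols in extras.items():
    print(f"{name}: {cols or 'no extra columns'}")
```

Code that has to work across all five configurations can rely only on the 12 shared columns. `task_ids` is absent from `rt_critics_one`, and `orig_group_id` is absent from both `rt_critics_one` and `multiscale_rt_critics`.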
