
frankier/multiscale_rt_critics_subsets

Source: Hugging Face · Updated 2023-10-04 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/frankier/multiscale_rt_critics_subsets
Resource description:
---
dataset_info:
- config_name: multiscale_rt_critics
  features:
  - name: movie_title
    dtype: string
  - name: publisher_name
    dtype: string
  - name: critic_name
    dtype: string
  - name: text
    dtype: string
  - name: review_score
    dtype: string
  - name: grade_type
    dtype: string
  - name: orig_num
    dtype: float32
  - name: orig_denom
    dtype: float32
  - name: includes_zero
    dtype: bool
  - name: label
    dtype: uint8
  - name: scale_points
    dtype: uint8
  - name: multiplier
    dtype: uint8
  - name: task_ids
    dtype: uint32
  splits:
  - name: train
    num_bytes: 4951005
    num_examples: 23182
  - name: test
    num_bytes: 1644530
    num_examples: 7745
  - name: validation
    num_bytes: 1646302
    num_examples: 7731
  download_size: 0
  dataset_size: 8241837
- config_name: rt_critics_big_irregular_5
  features:
  - name: movie_title
    dtype: string
  - name: publisher_name
    dtype: string
  - name: critic_name
    dtype: string
  - name: text
    dtype: string
  - name: review_score
    dtype: string
  - name: grade_type
    dtype: string
  - name: orig_num
    dtype: float32
  - name: orig_denom
    dtype: float32
  - name: includes_zero
    dtype: bool
  - name: label
    dtype: uint8
  - name: scale_points
    dtype: uint8
  - name: multiplier
    dtype: uint8
  - name: task_ids
    dtype: uint32
  - name: orig_group_id
    dtype: uint32
  splits:
  - name: train
    num_bytes: 2336759
    num_examples: 10312
  - name: test
    num_bytes: 781228
    num_examples: 3441
  - name: validation
    num_bytes: 779150
    num_examples: 3438
  download_size: 1927630
  dataset_size: 3897137
- config_name: rt_critics_by_critic_1000pl
  features:
  - name: movie_title
    dtype: string
  - name: publisher_name
    dtype: string
  - name: critic_name
    dtype: string
  - name: text
    dtype: string
  - name: review_score
    dtype: string
  - name: grade_type
    dtype: string
  - name: orig_num
    dtype: float32
  - name: orig_denom
    dtype: float32
  - name: includes_zero
    dtype: bool
  - name: label
    dtype: uint8
  - name: scale_points
    dtype: uint8
  - name: multiplier
    dtype: uint8
  - name: task_ids
    dtype: uint32
  - name: orig_group_id
    dtype: uint32
  splits:
  - name: train
    num_bytes: 27083039
    num_examples: 124055
  - name: test
    num_bytes: 9049344
    num_examples: 41406
  - name: validation
    num_bytes: 9026209
    num_examples: 41368
  download_size: 22594175
  dataset_size: 45158592
- config_name: rt_critics_by_critic_500pl
  features:
  - name: movie_title
    dtype: string
  - name: publisher_name
    dtype: string
  - name: critic_name
    dtype: string
  - name: text
    dtype: string
  - name: review_score
    dtype: string
  - name: grade_type
    dtype: string
  - name: orig_num
    dtype: float32
  - name: orig_denom
    dtype: float32
  - name: includes_zero
    dtype: bool
  - name: label
    dtype: uint8
  - name: scale_points
    dtype: uint8
  - name: multiplier
    dtype: uint8
  - name: task_ids
    dtype: uint32
  - name: orig_group_id
    dtype: uint32
  splits:
  - name: train
    num_bytes: 41656780
    num_examples: 189382
  - name: test
    num_bytes: 13929707
    num_examples: 63263
  - name: validation
    num_bytes: 13917936
    num_examples: 63157
  download_size: 35087274
  dataset_size: 69504423
- config_name: rt_critics_one
  features:
  - name: movie_title
    dtype: string
  - name: publisher_name
    dtype: string
  - name: critic_name
    dtype: string
  - name: text
    dtype: string
  - name: review_score
    dtype: string
  - name: grade_type
    dtype: string
  - name: orig_num
    dtype: float32
  - name: orig_denom
    dtype: float32
  - name: includes_zero
    dtype: bool
  - name: label
    dtype: uint8
  - name: scale_points
    dtype: uint8
  - name: multiplier
    dtype: uint8
  splits:
  - name: train
    num_bytes: 988767
    num_examples: 4606
  - name: test
    num_bytes: 327725
    num_examples: 1536
  - name: validation
    num_bytes: 327038
    num_examples: 1536
  download_size: 951057
  dataset_size: 1643530
configs:
- config_name: multiscale_rt_critics
  data_files:
  - split: train
    path: multiscale_rt_critics/train-*
  - split: test
    path: multiscale_rt_critics/test-*
  - split: validation
    path: multiscale_rt_critics/validation-*
- config_name: rt_critics_big_irregular_5
  data_files:
  - split: train
    path: rt_critics_big_irregular_5/train-*
  - split: test
    path: rt_critics_big_irregular_5/test-*
  - split: validation
    path: rt_critics_big_irregular_5/validation-*
- config_name: rt_critics_by_critic_1000pl
  data_files:
  - split: train
    path: rt_critics_by_critic_1000pl/train-*
  - split: test
    path: rt_critics_by_critic_1000pl/test-*
  - split: validation
    path: rt_critics_by_critic_1000pl/validation-*
- config_name: rt_critics_by_critic_500pl
  data_files:
  - split: train
    path: rt_critics_by_critic_500pl/train-*
  - split: test
    path: rt_critics_by_critic_500pl/test-*
  - split: validation
    path: rt_critics_by_critic_500pl/validation-*
- config_name: rt_critics_one
  data_files:
  - split: train
    path: rt_critics_one/train-*
  - split: test
    path: rt_critics_one/test-*
  - split: validation
    path: rt_critics_one/validation-*
---

# Dataset Card for "multiscale_rt_critics_subsets"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
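The card lists five configurations, each with `train`, `test`, and `validation` splits. The following is a minimal sketch of loading one of them with the Hugging Face `datasets` library, assuming the repository is still reachable under the ID shown in the page title. The helper name `load_config` is hypothetical and only used for illustration.

```python
# Repository ID, config names and split names copied from the dataset card.
REPO_ID = "frankier/multiscale_rt_critics_subsets"
CONFIG_NAMES = (
    "multiscale_rt_critics",
    "rt_critics_big_irregular_5",
    "rt_critics_by_critic_1000pl",
    "rt_critics_by_critic_500pl",
    "rt_critics_one",
)
SPLITS = ("train", "test", "validation")


def load_config(config_name, split="train"):
    """Load one split of one configuration (hypothetical helper, not part of the dataset)."""
    if config_name not in CONFIG_NAMES:
        raise ValueError(f"unknown config: {config_name!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    # Imported lazily so the constants above can be used without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split=split)
```

Calling `load_config("rt_critics_one", "validation")` would download that split, so it needs network access and the `datasets` package. The name checks run before any download is attempted.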
Provided by:
frankier
Summary of original information

Dataset overview

Dataset configurations

1. multiscale_rt_critics

  • Features:
    • movie_title: string
    • publisher_name: string
    • critic_name: string
    • text: string
    • review_score: string
    • grade_type: string
    • orig_num: floating point (float32)
    • orig_denom: floating point (float32)
    • includes_zero: boolean
    • label: unsigned integer (uint8)
    • scale_points: unsigned integer (uint8)
    • multiplier: unsigned integer (uint8)
    • task_ids: unsigned integer (uint32)
  • Splits:
    • train: 4951005 bytes, 23182 examples
    • test: 1644530 bytes, 7745 examples
    • validation: 1646302 bytes, 7731 examples
  • Data size:
    • Download size: 0 bytes
    • Dataset size: 8241837 bytes
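As a quick consistency check on the figures above, the sketch below sums the per-split byte counts and computes the share of examples in each split. The numbers are copied from the listing for `multiscale_rt_critics`, and the variable names are illustrative only.

```python
# (num_bytes, num_examples) per split, copied from the multiscale_rt_critics listing.
splits = {
    "train": (4951005, 23182),
    "test": (1644530, 7745),
    "validation": (1646302, 7731),
}

total_bytes = sum(num_bytes for num_bytes, _ in splits.values())
total_examples = sum(num_examples for _, num_examples in splits.values())

# Fraction of all examples that falls into each split.
fractions = {name: n / total_examples for name, (_, n) in splits.items()}

print(total_bytes)     # 8241837, which equals the listed dataset size
print(total_examples)  # 38658
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.3f}")
```

The split byte counts add up exactly to the listed dataset size, and the examples are divided roughly 60/20/20 between train, test, and validation.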

2. rt_critics_big_irregular_5

  • Features:
    • movie_title: string
    • publisher_name: string
    • critic_name: string
    • text: string
    • review_score: string
    • grade_type: string
    • orig_num: floating point (float32)
    • orig_denom: floating point (float32)
    • includes_zero: boolean
    • label: unsigned integer (uint8)
    • scale_points: unsigned integer (uint8)
    • multiplier: unsigned integer (uint8)
    • task_ids: unsigned integer (uint32)
    • orig_group_id: unsigned integer (uint32)
  • Splits:
    • train: 2336759 bytes, 10312 examples
    • test: 781228 bytes, 3441 examples
    • validation: 779150 bytes, 3438 examples
  • Data size:
    • Download size: 1927630 bytes
    • Dataset size: 3897137 bytes

3. rt_critics_by_critic_1000pl

  • Features:
    • movie_title: string
    • publisher_name: string
    • critic_name: string
    • text: string
    • review_score: string
    • grade_type: string
    • orig_num: floating point (float32)
    • orig_denom: floating point (float32)
    • includes_zero: boolean
    • label: unsigned integer (uint8)
    • scale_points: unsigned integer (uint8)
    • multiplier: unsigned integer (uint8)
    • task_ids: unsigned integer (uint32)
    • orig_group_id: unsigned integer (uint32)
  • Splits:
    • train: 27083039 bytes, 124055 examples
    • test: 9049344 bytes, 41406 examples
    • validation: 9026209 bytes, 41368 examples
  • Data size:
    • Download size: 22594175 bytes
    • Dataset size: 45158592 bytes

4. rt_critics_by_critic_500pl

  • Features:
    • movie_title: string
    • publisher_name: string
    • critic_name: string
    • text: string
    • review_score: string
    • grade_type: string
    • orig_num: floating point (float32)
    • orig_denom: floating point (float32)
    • includes_zero: boolean
    • label: unsigned integer (uint8)
    • scale_points: unsigned integer (uint8)
    • multiplier: unsigned integer (uint8)
    • task_ids: unsigned integer (uint32)
    • orig_group_id: unsigned integer (uint32)
  • Splits:
    • train: 41656780 bytes, 189382 examples
    • test: 13929707 bytes, 63263 examples
    • validation: 13917936 bytes, 63157 examples
  • Data size:
    • Download size: 35087274 bytes
    • Dataset size: 69504423 bytes

5. rt_critics_one

  • Features:
    • movie_title: string
    • publisher_name: string
    • critic_name: string
    • text: string
    • review_score: string
    • grade_type: string
    • orig_num: floating point (float32)
    • orig_denom: floating point (float32)
    • includes_zero: boolean
    • label: unsigned integer (uint8)
    • scale_points: unsigned integer (uint8)
    • multiplier: unsigned integer (uint8)
  • Splits:
    • train: 988767 bytes, 4606 examples
    • test: 327725 bytes, 1536 examples
    • validation: 327038 bytes, 1536 examples
  • Data size:
    • Download size: 951057 bytes
    • Dataset size: 1643530 bytes
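
The five configurations share most of their columns but differ in the last one or two. The sketch below, with column names copied from the listings above and illustrative variable names, works out which columns are common to all configurations and which are extra in each.

```python
# The twelve columns that appear in every configuration's feature list.
BASE_COLUMNS = [
    "movie_title", "publisher_name", "critic_name", "text",
    "review_score", "grade_type", "orig_num", "orig_denom",
    "includes_zero", "label", "scale_points", "multiplier",
]

# Full column list per configuration, as given in the listings.
COLUMNS = {
    "multiscale_rt_critics": BASE_COLUMNS + ["task_ids"],
    "rt_critics_big_irregular_5": BASE_COLUMNS + ["task_ids", "orig_group_id"],
    "rt_critics_by_critic_1000pl": BASE_COLUMNS + ["task_ids", "orig_group_id"],
    "rt_critics_by_critic_500pl": BASE_COLUMNS + ["task_ids", "orig_group_id"],
    "rt_critics_one": list(BASE_COLUMNS),
}

# Columns present in every configuration.
common = set.intersection(*(set(cols) for cols in COLUMNS.values()))

# Columns each configuration has beyond the common set, in listing order.
extras = {
    name: [col for col in cols if col not in common]
    for name, cols in COLUMNS.items()
}

for name, extra in extras.items():
    print(f"{name}: {extra}")
```

Code that needs to work across all five configurations can rely on the twelve common columns only. `task_ids` is absent from `rt_critics_one`, and `orig_group_id` is absent from both `rt_critics_one` and `multiscale_rt_critics`.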