Racoci/CORAA-top-by-category
Source: Hugging Face | Updated: 2024-06-02 | Indexed: 2024-06-12
Download link:
https://hf-mirror.com/datasets/Racoci/CORAA-top-by-category
Resource description:
---
license: cc-by-nc-4.0
dataset_info:
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: file_path
    dtype: string
  - name: task
    dtype: string
  - name: variety
    dtype: string
  - name: dataset
    dtype: string
  - name: accent
    dtype: string
  - name: speech_genre
    dtype: string
  - name: speech_style
    dtype: string
  - name: up_votes
    dtype: int64
  - name: down_votes
    dtype: int64
  - name: votes_for_hesitation
    dtype: float64
  - name: votes_for_filled_pause
    dtype: float64
  - name: votes_for_noise_or_low_voice
    dtype: float64
  - name: votes_for_second_voice
    dtype: float64
  - name: votes_for_no_identified_problem
    dtype: float64
  - name: text
    dtype: string
  - name: category
    dtype: string
  - name: class_id
    dtype: int64
  - name: text_length
    dtype: int64
  - name: audio
    dtype: audio
  splits:
  - name: top_3_by_category
    num_bytes: 20401557.0
    num_examples: 18
  - name: top_10_by_category
    num_bytes: 65020891.0
    num_examples: 60
  - name: top_100_by_category
    num_bytes: 484045505.0
    num_examples: 600
  download_size: 569287144
  dataset_size: 569467953.0
configs:
- config_name: default
  data_files:
  - split: top_3_by_category
    path: data/top_3_by_category-*
  - split: top_10_by_category
    path: data/top_10_by_category-*
  - split: top_100_by_category
    path: data/top_100_by_category-*
---
Provider:
Racoci
Summary of original information
Dataset overview
License
- License type: cc-by-nc-4.0
Dataset features
- Feature list:
  - Unnamed: 0: int64
  - file_path: string
  - task: string
  - variety: string
  - dataset: string
  - accent: string
  - speech_genre: string
  - speech_style: string
  - up_votes: int64
  - down_votes: int64
  - votes_for_hesitation: float64
  - votes_for_filled_pause: float64
  - votes_for_noise_or_low_voice: float64
  - votes_for_second_voice: float64
  - votes_for_no_identified_problem: float64
  - text: string
  - category: string
  - class_id: int64
  - text_length: int64
  - audio: audio
Dataset splits
- Split name: top_3_by_category
  - Bytes: 20401557.0
  - Examples: 18
- Split name: top_10_by_category
  - Bytes: 65020891.0
  - Examples: 60
- Split name: top_100_by_category
  - Bytes: 484045505.0
  - Examples: 600
Dataset size
- Download size: 569287144 bytes
- Dataset size: 569467953.0 bytes
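The split figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers listed in the metadata. The `top_k` values are an assumption read off the split names (3, 10, 100), as is the idea that every split holds exactly `top_k` examples per category; the card itself does not state the number of categories.

```python
# Values copied from the dataset card metadata.
# "top_k" is an assumption inferred from each split's name.
SPLITS = {
    "top_3_by_category": {"top_k": 3, "num_bytes": 20401557, "num_examples": 18},
    "top_10_by_category": {"top_k": 10, "num_bytes": 65020891, "num_examples": 60},
    "top_100_by_category": {"top_k": 100, "num_bytes": 484045505, "num_examples": 600},
}
DATASET_SIZE = 569467953  # dataset_size from the card, in bytes


def implied_category_count(split):
    """Return num_examples / top_k, assuming top_k examples per category."""
    quotient, remainder = divmod(split["num_examples"], split["top_k"])
    if remainder:
        raise ValueError("num_examples is not a multiple of top_k")
    return quotient


# Number of categories implied by each split under the assumption above.
category_counts = {name: implied_category_count(s) for name, s in SPLITS.items()}

# The per-split byte counts should add up to the reported dataset size.
total_bytes = sum(s["num_bytes"] for s in SPLITS.values())

print(category_counts)
print(total_bytes == DATASET_SIZE)
```

Under that assumption, all three splits imply the same number of categories, and the per-split byte counts sum to the reported dataset size.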
Configuration
- Config name: default
  - Data files:
    - Split: top_3_by_category, path: data/top_3_by_category-*
    - Split: top_10_by_category, path: data/top_10_by_category-*
    - Split: top_100_by_category, path: data/top_100_by_category-*
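Because the default configuration maps each split name to its own data files, a single split can be requested by name. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable. The helper `load_split` and the constants `REPO_ID` and `SPLIT_NAMES` are illustrative names, not part of the dataset card.

```python
# Repository ID and split names as listed in the dataset card.
REPO_ID = "Racoci/CORAA-top-by-category"
SPLIT_NAMES = ("top_3_by_category", "top_10_by_category", "top_100_by_category")


def load_split(split_name):
    """Load one named split of the dataset (hypothetical helper).

    Raises ValueError for a name that is not listed in the card, before
    any network access happens.
    """
    if split_name not in SPLIT_NAMES:
        raise ValueError(
            f"unknown split {split_name!r}; expected one of {SPLIT_NAMES}"
        )
    # Imported lazily so the check above works even without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split_name)


# Example usage (downloads data, so it is left commented out):
# ds = load_split("top_3_by_category")
# print(ds.num_rows)       # the card lists 18 examples for this split
# print(ds.column_names)   # should include "text", "category", "audio"
```

Starting with the smallest split keeps the download to roughly 20 MB according to the card. Accessing the `audio` column may need an additional audio decoding backend, depending on the installed `datasets` version.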



