
SLLMBias/cont_stereoset

Source: Hugging Face. Updated: 2024-06-07. Indexed: 2024-06-12.
Download link:
https://hf-mirror.com/datasets/SLLMBias/cont_stereoset
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: validation
    path: data/validation-*
dataset_info:
  features:
  - name: id
    dtype: string
  - name: target
    dtype: string
  - name: bias_type
    dtype: string
  - name: context
    dtype: string
  - name: sentences
    struct:
    - name: gold_label
      sequence: int64
    - name: id
      sequence: string
    - name: labels
      list:
      - name: human_id
        sequence: string
      - name: label
        sequence: int64
    - name: sentence
      sequence: string
  - name: ASR_LLM_Prompt
    dtype: string
  - name: SLLM_Prompt
    dtype: string
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: tts_provider
    dtype: string
  - name: speaker
    dtype: string
  splits:
  - name: validation
    num_bytes: 140044246.0
    num_examples: 1452
  download_size: 117125075
  dataset_size: 140044246.0
---

# Dataset Card for "cont_stereoset"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

The cont_stereoset dataset has a single configuration, default, with the features id, target, bias_type, context, sentences, ASR_LLM_Prompt, SLLM_Prompt, audio, tts_provider, and speaker. Most features are strings; sentences is a nested struct whose sub-features are sequences of int64 or string values. The dataset has one split, validation, containing 1452 examples. The audio feature has a sampling rate of 16000 Hz. The download size is 117125075 bytes, and the dataset size is 140044246.0 bytes.
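The 16000 Hz sampling rate fixes the relationship between the number of audio samples and clip duration. The following is a minimal sketch of that arithmetic; the helper names and the sample counts are hypothetical and are not taken from the dataset.

```python
SAMPLING_RATE_HZ = 16000  # sampling rate stated on the dataset card


def duration_seconds(num_samples, sampling_rate=SAMPLING_RATE_HZ):
    """Convert a sample count into a duration in seconds."""
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be positive")
    return num_samples / sampling_rate


def samples_for(seconds, sampling_rate=SAMPLING_RATE_HZ):
    """Number of samples needed to hold `seconds` of audio."""
    return round(seconds * sampling_rate)


# Hypothetical clip of 48000 samples (not a value from the dataset).
print(duration_seconds(48000))  # 3.0
print(samples_for(2.5))         # 40000
```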
Provider:
SLLMBias
Summary of original information

Dataset overview

Dataset configuration

  • Config name: default
  • Data files:
    • Split: validation
    • Path: data/validation-*
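The configuration above maps the validation split to every file matching the glob data/validation-*. As a hedged illustration, the sketch below applies that pattern to a list of hypothetical file names (the real shard names are not given on this page) and shows how the split could be loaded with the Hugging Face `datasets` library. The loader function needs `pip install datasets` and network access, so it is defined but not called here.

```python
from fnmatch import fnmatchcase

REPO_ID = "SLLMBias/cont_stereoset"
CONFIG_NAME = "default"
SPLIT = "validation"
PATH_PATTERN = "data/validation-*"


def files_for_split(filenames, pattern=PATH_PATTERN):
    """Return the file names that the split's glob pattern selects."""
    return [name for name in filenames if fnmatchcase(name, pattern)]


def load_validation_split():
    """Load the validation split from the Hub (requires network access)."""
    from datasets import load_dataset  # imported lazily; optional dependency

    return load_dataset(REPO_ID, name=CONFIG_NAME, split=SPLIT)


# Hypothetical file names, used only to illustrate the glob.
example_files = [
    "data/validation-00000-of-00001.parquet",
    "data/train-00000-of-00001.parquet",
    "README.md",
]
print(files_for_split(example_files))
```

Only the first of the three hypothetical names matches the pattern, so it is the only one assigned to the validation split.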

Dataset information

  • Features:
    • id: string
    • target: string
    • bias_type: string
    • context: string
    • sentences: struct with the following sub-features:
      • gold_label: sequence of int64
      • id: sequence of string
      • labels: list of structs, each containing:
        • human_id: sequence of string
        • label: sequence of int64
      • sentence: sequence of string
    • ASR_LLM_Prompt: string
    • SLLM_Prompt: string
    • audio: audio, sampling rate 16000 Hz
    • tts_provider: string
    • speaker: string
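The sentences feature is column-oriented: each sub-feature is a sequence rather than a list of per-sentence records. The sketch below regroups it into one dict per candidate sentence. It assumes the sequences are index-aligned (entry i of each refers to the same sentence), which the schema suggests but does not state. The record and all its values are hypothetical placeholders, and the integer labels are arbitrary since their meaning is not documented on this page.

```python
def iter_candidates(sentences):
    """Regroup the column-oriented `sentences` struct into per-sentence dicts.

    Assumes `id`, `sentence`, `gold_label`, and `labels` are index-aligned.
    """
    rows = zip(
        sentences["id"],
        sentences["sentence"],
        sentences["gold_label"],
        sentences["labels"],
    )
    return [
        {
            "id": sentence_id,
            "sentence": text,
            "gold_label": gold,
            # Pair each annotator id with the label that annotator gave.
            "annotations": list(zip(ann["human_id"], ann["label"])),
        }
        for sentence_id, text, gold, ann in rows
    ]


# Hypothetical record fragment that follows the schema above.
example_sentences = {
    "id": ["s-0", "s-1"],
    "sentence": ["First candidate.", "Second candidate."],
    "gold_label": [0, 1],
    "labels": [
        {"human_id": ["h-a", "h-b"], "label": [0, 0]},
        {"human_id": ["h-a", "h-b"], "label": [1, 0]},
    ],
}

for candidate in iter_candidates(example_sentences):
    print(candidate["id"], candidate["gold_label"], candidate["annotations"])
```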

Dataset splits

  • Split: validation
  • Data size: 140044246.0 bytes
  • Number of examples: 1452

Download and dataset size

  • Download size: 117125075 bytes
  • Dataset size: 140044246.0 bytes
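The byte counts above are easier to read after conversion. The following minimal sketch derives the sizes in MiB, the ratio of download size to dataset size, and the average size per example from the three figures on this page. The helper name `to_mib` is hypothetical; no other data is assumed.

```python
NUM_EXAMPLES = 1452
DATASET_SIZE_BYTES = 140044246
DOWNLOAD_SIZE_BYTES = 117125075


def to_mib(num_bytes):
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


# Fraction of the prepared dataset size that has to be downloaded.
download_ratio = DOWNLOAD_SIZE_BYTES / DATASET_SIZE_BYTES

# Mean size of one example, audio included.
avg_bytes_per_example = DATASET_SIZE_BYTES / NUM_EXAMPLES

print(f"dataset size:  {to_mib(DATASET_SIZE_BYTES):.1f} MiB")
print(f"download size: {to_mib(DOWNLOAD_SIZE_BYTES):.1f} MiB")
print(f"download / dataset ratio: {download_ratio:.3f}")
print(f"average per example: {avg_bytes_per_example / 1024:.1f} KiB")
```

The download is roughly 84% of the dataset size, and each example averages a little under 100 KiB.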