voidful/NMSQA-CODE

Name: voidful/NMSQA-CODE
Creator: voidful
Published: 2023-07-24 18:30:24
License: Not specified

Source: Hugging Face. Updated: 2023-07-24. Indexed: 2024-03-04.

Download link:

https://hf-mirror.com/datasets/voidful/NMSQA-CODE

Resource description:

---
language: en
dataset_info:
  features:
  - name: id
    dtype: string
  - name: title
    dtype: string
  - name: context
    dtype: string
  - name: question
    dtype: string
  - name: answers
    struct:
    - name: answer_start
      sequence: int64
    - name: audio_full_answer_end
      sequence: float64
    - name: audio_full_answer_start
      sequence: float64
    - name: audio_segment_answer_end
      sequence: float64
    - name: audio_segment_answer_start
      sequence: float64
    - name: text
      sequence: string
  - name: content_segment_audio_path
    dtype: string
  - name: content_full_audio_path
    dtype: string
  - name: content_audio_sampling_rate
    dtype: float64
  - name: content_audio_speaker
    dtype: string
  - name: content_segment_text
    dtype: string
  - name: content_segment_normalized_text
    dtype: string
  - name: question_audio_path
    dtype: string
  - name: question_audio_sampling_rate
    dtype: float64
  - name: question_audio_speaker
    dtype: string
  - name: question_normalized_text
    dtype: string
  - name: hubert_100_context_unit
    dtype: string
  - name: hubert_100_question_unit
    dtype: string
  - name: hubert_100_answer_unit
    dtype: string
  - name: mhubert_1000_context_unit
    dtype: string
  - name: mhubert_1000_question_unit
    dtype: string
  - name: mhubert_1000_answer_unit
    dtype: string
  splits:
  - name: train
    num_bytes: 3329037982
    num_examples: 87599
  - name: test
    num_bytes: 1079782
    num_examples: 171
  - name: dev
    num_bytes: 411186265
    num_examples: 10570
  download_size: 507994561
  dataset_size: 3741304029
---
# Dataset Card for "NMSQA-CODE"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
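The card itself gives no usage instructions. As a minimal sketch, assuming the repository can be read with the standard `load_dataset` function of the Hugging Face `datasets` library, one split could be fetched as follows. The helper name `load_split` is hypothetical; the repository ID and split names are taken from the metadata above.

```python
REPO_ID = "voidful/NMSQA-CODE"
SPLITS = ("train", "test", "dev")  # split names listed in the card metadata


def load_split(split: str = "test"):
    """Load one split from the Hugging Face Hub (hypothetical helper)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # Assumption: the repository loads with the default configuration.
    return load_dataset(REPO_ID, split=split)
```

Calling `load_split("test")` would request the smallest split (171 examples), which is a reasonable first check before downloading the roughly 508 MB full archive.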

Provider:

voidful

Summary of original information

Dataset overview

Dataset features

id: string
title: string
context: string
question: string
answers: struct with the following fields:
- answer_start: sequence of integers (int64)
- audio_full_answer_end: sequence of floats (float64)
- audio_full_answer_start: sequence of floats (float64)
- audio_segment_answer_end: sequence of floats (float64)
- audio_segment_answer_start: sequence of floats (float64)
- text: sequence of strings
content_segment_audio_path: string
content_full_audio_path: string
content_audio_sampling_rate: float (float64)
content_audio_speaker: string
content_segment_text: string
content_segment_normalized_text: string
question_audio_path: string
question_audio_sampling_rate: float (float64)
question_audio_speaker: string
question_normalized_text: string
hubert_100_context_unit: string
hubert_100_question_unit: string
hubert_100_answer_unit: string
mhubert_1000_context_unit: string
mhubert_1000_question_unit: string
mhubert_1000_answer_unit: string
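Every field of the `answers` struct is a sequence, so one record can carry several answers. The sketch below pairs each answer text with its start and end position in the full audio. It assumes the sequences are aligned by index, which the card does not state, and it makes no assumption about the time unit. The function name `answer_spans` and the sample values are hypothetical, not taken from the dataset.

```python
from typing import Dict, List, Tuple


def answer_spans(answers: Dict[str, list]) -> List[Tuple[str, float, float]]:
    """Pair each answer text with its (start, end) position in the full audio."""
    texts = answers["text"]
    starts = answers["audio_full_answer_start"]
    ends = answers["audio_full_answer_end"]
    # Assumption: the parallel sequences are index-aligned and equally long.
    if not (len(texts) == len(starts) == len(ends)):
        raise ValueError("answer sequences have different lengths")
    return list(zip(texts, starts, ends))


# Hypothetical record fragment that follows the schema; values are made up.
example_answers = {
    "answer_start": [10],
    "audio_full_answer_start": [1.5],
    "audio_full_answer_end": [2.25],
    "audio_segment_answer_start": [1.5],
    "audio_segment_answer_end": [2.25],
    "text": ["hypothetical answer"],
}

spans = answer_spans(example_answers)
for text, start, end in spans:
    print(f"{text!r}: {start} to {end}")
```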

Dataset splits

train: 87599 examples, 3329037982 bytes
test: 171 examples, 1079782 bytes
dev: 10570 examples, 411186265 bytes

Dataset size

Download size: 507994561 bytes
Total dataset size: 3741304029 bytes
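The raw byte counts are easier to read after a little arithmetic. The sketch below uses only the figures listed above: it checks that the per-split byte counts add up to the reported total dataset size, then prints each split's share of the examples and its size in GiB. The variable and function names are illustrative.

```python
# Figures copied from the split and size listings.
SPLITS = {
    "train": {"num_examples": 87599, "num_bytes": 3329037982},
    "test": {"num_examples": 171, "num_bytes": 1079782},
    "dev": {"num_examples": 10570, "num_bytes": 411186265},
}
DOWNLOAD_SIZE = 507994561
DATASET_SIZE = 3741304029

total_bytes = sum(s["num_bytes"] for s in SPLITS.values())
total_examples = sum(s["num_examples"] for s in SPLITS.values())

# The per-split sizes should add up to the reported dataset size.
sizes_consistent = total_bytes == DATASET_SIZE


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


for name, s in SPLITS.items():
    share = s["num_examples"] / total_examples
    print(f"{name}: {s['num_examples']} examples ({share:.1%}), "
          f"{to_gib(s['num_bytes']):.2f} GiB")

print(f"sizes consistent: {sizes_consistent}")
print(f"download / dataset size ratio: {DOWNLOAD_SIZE / DATASET_SIZE:.2f}")
```

The download is much smaller than the total dataset size, which suggests the hosted files are compressed. The card does not say so explicitly.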
