gigant/tib-wip-filtered
Source: Hugging Face. Updated 2023-03-23; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/gigant/tib-wip-filtered
Resource description:
---
dataset_info:
  features:
  - name: doi
    dtype: string
  - name: title
    dtype: string
  - name: url
    dtype: string
  - name: video_url
    dtype: string
  - name: license
    dtype: string
  - name: subject
    dtype: string
  - name: genre
    dtype: string
  - name: release_year
    dtype: string
  - name: author
    dtype: string
  - name: contributors
    dtype: string
  - name: abstract
    dtype: string
  - name: transcript
    dtype: string
  - name: transcript_segments
    sequence:
    - name: id
      dtype: int32
    - name: seek
      dtype: int32
    - name: start
      dtype: float32
    - name: end
      dtype: float32
    - name: text
      dtype: string
    - name: tokens
      sequence: int32
    - name: temperature
      dtype: float32
    - name: avg_logprob
      dtype: float32
    - name: compression_ratio
      dtype: float32
    - name: no_speech_prob
      dtype: float32
  - name: keyframes
    sequence:
    - name: slide
      dtype: string
    - name: frames
      sequence: int32
    - name: timestamp
      sequence: float32
  - name: language
    dtype: string
  splits:
  - name: train
    num_bytes: 1062896143.5539255
    num_examples: 9294
  download_size: 511200645
  dataset_size: 1062896143.5539255
---
# Dataset Card for "tib-wip-filtered"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
gigant
Summary of original information
Dataset overview
Dataset name
- Name: tib-wip-filtered
Dataset features
- doi: string
- title: string
- url: string
- video_url: string
- license: string
- subject: string
- genre: string
- release_year: string
- author: string
- contributors: string
- abstract: string
- transcript: string
- transcript_segments: sequence with the following sub-features:
  - id: integer (int32)
  - seek: integer (int32)
  - start: float (float32)
  - end: float (float32)
  - text: string
  - tokens: sequence of integers (int32)
  - temperature: float (float32)
  - avg_logprob: float (float32)
  - compression_ratio: float (float32)
  - no_speech_prob: float (float32)
- keyframes: sequence with the following sub-features:
  - slide: string
  - frames: sequence of integers (int32)
  - timestamp: sequence of floats (float32)
- language: string
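The feature list can be made concrete with a minimal sketch of reading `transcript_segments`. In the Hugging Face `datasets` library, a sequence of named sub-features is exposed as a dictionary of equal-length lists, not a list of per-segment records. The helper names `segments_to_rows`, `keep_speech`, and `load_train_split` are hypothetical, the 0.5 threshold is an arbitrary illustration, and the toy record uses made-up values. The filter also assumes that `no_speech_prob` is a probability in [0, 1] that a segment contains no speech, which the field name suggests but the card does not state.

```python
from typing import Any, Dict, List


def segments_to_rows(segments: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Transpose a dict of equal-length lists into a list of per-segment dicts."""
    keys = list(segments)
    lengths = {len(segments[k]) for k in keys}
    if len(lengths) > 1:
        raise ValueError("sub-feature lists differ in length")
    n = lengths.pop() if lengths else 0
    return [{k: segments[k][i] for k in keys} for i in range(n)]


def keep_speech(rows: List[Dict[str, Any]],
                max_no_speech_prob: float = 0.5) -> List[Dict[str, Any]]:
    """Keep segments whose no_speech_prob is at or below the (assumed) threshold."""
    return [r for r in rows if r["no_speech_prob"] <= max_no_speech_prob]


def load_train_split():
    """Hypothetical loader; needs the `datasets` package and network access."""
    from datasets import load_dataset  # imported lazily so the rest runs offline
    return load_dataset("gigant/tib-wip-filtered", split="train")


# Toy record shaped like one `transcript_segments` value (made-up values,
# only a subset of the sub-features listed above).
example = {
    "id": [0, 1],
    "start": [0.0, 4.5],
    "end": [4.5, 9.0],
    "text": [" Welcome to the talk.", " ..."],
    "no_speech_prob": [0.02, 0.91],
}

rows = segments_to_rows(example)
speech = keep_speech(rows)
print([r["id"] for r in speech])
```

With real data, the same helpers would be applied to `load_train_split()[0]["transcript_segments"]`. The transposition is a convenience for per-segment processing. Column-wise access to the original dictionary of lists is usually cheaper when only one sub-feature is needed.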
Dataset splits
- train:
  - num_bytes: 1062896143.5539255 bytes
  - num_examples: 9294 examples
Dataset size
- download_size: 511200645 bytes
- dataset_size: 1062896143.5539255 bytes
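As a quick sanity check on these figures, the short sketch below derives two numbers from them: the average size of a prepared example and the ratio of download size to prepared size. The constants are copied from the card. The variable names are illustrative only.

```python
NUM_BYTES = 1062896143.5539255  # dataset_size (equal to the train split's num_bytes)
DOWNLOAD_SIZE = 511200645       # download_size
NUM_EXAMPLES = 9294             # num_examples in the train split

# Average prepared size of one example, in bytes.
avg_bytes_per_example = NUM_BYTES / NUM_EXAMPLES

# Fraction of the prepared size that has to be downloaded.
download_ratio = DOWNLOAD_SIZE / NUM_BYTES

print(f"average example size: {avg_bytes_per_example / 1024:.1f} KiB")
print(f"download / prepared size: {download_ratio:.2f}")
```

This works out to roughly 112 KiB per example, and the download is a little under half the size of the prepared dataset.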



