SPoRC
Source: huggingface.co (indexed 2025-01-15)
Download link:
https://huggingface.co/datasets/blitt/SPoRC
Description:
SPORC: the Structured Podcast Open Research Corpus (V 1.0)
SPORC is a large multimodal dataset for studying the podcast ecosystem. It includes podcast metadata, transcripts, speaker-turn labels, speaker-role labels, and speaker audio features. For more information on how the data was collected and processed, along with an initial analysis of the podcast ecosystem, please refer to our paper or to our GitHub repositories for analysis and data processing.
Our… See the full description on the dataset page: https://huggingface.co/datasets/blitt/SPoRC.
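As a minimal sketch of how the corpus might be fetched with the Hugging Face `datasets` library: the helper names `dataset_url` and `load_sporc` below are hypothetical, and the listing does not say which splits or configurations exist or whether access requires logging in, so those details are left to the caller and should be checked against the dataset page.

```python
# Hypothetical helpers for working with the SPoRC dataset repository.
DATASET_ID = "blitt/SPoRC"  # repository id taken from the download link


def dataset_url(dataset_id: str = DATASET_ID) -> str:
    """Build the Hugging Face dataset page URL for a repository id."""
    return f"https://huggingface.co/datasets/{dataset_id}"


def load_sporc(split=None, streaming=True, **kwargs):
    """Load SPoRC via `datasets.load_dataset`.

    Assumptions: the `datasets` package is installed and the repository
    can be read with default settings. Split and configuration names are
    not given in this listing, so none are hard-coded here.
    """
    from datasets import load_dataset  # imported lazily; needs `pip install datasets`

    # streaming=True avoids downloading the full corpus up front.
    return load_dataset(DATASET_ID, split=split, streaming=streaming, **kwargs)


if __name__ == "__main__":
    print(dataset_url())
```

Streaming is the default in this sketch because the description calls the corpus large; pass `streaming=False` to download and cache it instead.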
Provider:
huggingface.co



