RavdessZeroshot
RavdessZeroshot Dataset Overview
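Basic Dataset Information
- Dataset name: RavdessZeroshot
- Annotation creators: human-annotated
- Language: English
- License: CC-BY-NC-SA 3.0
- Multilinguality: monolingual
- Task categories: other, text-to-audio
- Task ID: emotion classification
- Tags: mteb, audio, text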
Dataset Source
- Source dataset: narad/ravdess
- Reference link: https://huggingface.co/datasets/narad/ravdess
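Dataset Description
RavdessZeroshot is an emotion classification dataset. RAVDESS contains recordings of 24 professional actors (12 female, 12 male) vocalizing two lexically matched statements in a neutral North American accent. The speech emotions comprise neutral, calm, happy, sad, angry, fearful, surprised, and disgusted expressions. These 8 emotions also serve as the dataset's labels.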
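Dataset Configurations and Structure
Configuration 1: default
- Features:
  - audio: audio
  - labels: int64
- Data splits:
  - train: 1440 samples, 511660544 bytes
- Download size: 312073871 bytes
- Dataset size: 511660544 bytes
- Data file path: data/train-*
Configuration 2: labels
- Features:
  - labels: string
- Data splits:
  - train: 8 samples, 263 bytes
- Download size: 915 bytes
- Dataset size: 263 bytes
- Data file path: labels/train-*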
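The `default` configuration stores each label as an integer, while the `labels` configuration stores eight strings. A minimal sketch of how the two might be joined is shown here. The label order and English spellings in `ASSUMED_LABELS` are assumptions taken from the order in which the description lists the emotions; the authoritative order is whatever the `labels` configuration actually contains. The helper names are hypothetical.

```python
# Hypothetical label list: order and spelling are assumptions based on the
# dataset description, not values read from the `labels` configuration.
ASSUMED_LABELS = [
    "neutral", "calm", "happy", "sad",
    "angry", "fearful", "surprised", "disgust",
]


def label_name(label_id: int, labels: list[str] = ASSUMED_LABELS) -> str:
    """Map an integer class ID from the `default` config to a label string."""
    if not 0 <= label_id < len(labels):
        raise ValueError(f"label id {label_id} is outside 0..{len(labels) - 1}")
    return labels[label_id]


def label_counts(label_ids: list[int], labels: list[str] = ASSUMED_LABELS) -> dict[str, int]:
    """Count how many samples carry each label name."""
    counts = {name: 0 for name in labels}
    for label_id in label_ids:
        counts[label_name(label_id, labels)] += 1
    return counts
```

In practice, the list passed as `labels` would be read from the `labels` configuration rather than hard-coded, so that the integer IDs and strings stay aligned.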
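Evaluation
You can use the MTEB library to evaluate an embedding model's performance on this dataset. Example code:

```python
import mteb

# Look up the task by name and wrap it in an evaluator.
task = mteb.get_task("RavdessZeroshot")
evaluator = mteb.MTEB([task])

# YOUR_MODEL is a placeholder: replace it with the name of the model to evaluate.
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)
```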
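For intuition about what a zero-shot task of this kind measures, the sketch below classifies an audio embedding by choosing the label whose text embedding is most similar under cosine similarity. It illustrates the general technique only and is not MTEB's actual implementation; the function names and the toy 3-dimensional vectors are hypothetical stand-ins for real model outputs.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def zero_shot_predict(audio_embedding: list[float],
                      label_embeddings: dict[str, list[float]]) -> str:
    """Return the label whose text embedding is closest to the audio embedding."""
    return max(
        label_embeddings,
        key=lambda name: cosine_similarity(audio_embedding, label_embeddings[name]),
    )


def accuracy(predictions: list[str], references: list[str]) -> float:
    """Fraction of predictions that match the reference labels."""
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and equal in length")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy stand-ins for model outputs (hypothetical values, not real embeddings).
TOY_LABEL_EMBEDDINGS = {
    "happy": [1.0, 0.0, 0.0],
    "sad": [0.0, 1.0, 0.0],
    "angry": [0.0, 0.0, 1.0],
}
TOY_AUDIO_EMBEDDINGS = [[0.9, 0.1, 0.2], [0.1, 0.8, 0.3]]

toy_predictions = [zero_shot_predict(e, TOY_LABEL_EMBEDDINGS) for e in TOY_AUDIO_EMBEDDINGS]
```

No classifier is trained here, which is the point of the zero-shot setup: the label texts themselves act as the class prototypes.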
Citation Information
Source Dataset Citation

```bibtex
@article{10.1371/journal.pone.0196391,
  author = {Livingstone, Steven R. AND Russo, Frank A.},
  doi = {10.1371/journal.pone.0196391},
  journal = {PLOS ONE},
  month = {05},
  number = {5},
  pages = {1-35},
  publisher = {Public Library of Science},
  title = {The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English},
  url = {https://doi.org/10.1371/journal.pone.0196391},
  volume = {13},
  year = {2018},
}
```
MTEB Citations

```bibtex
@article{enevoldsen2025mmtebmassivemultilingualtext,
  title = {MMTEB: Massive Multilingual Text Embedding Benchmark},
  author = {Kenneth Enevoldsen and Isaac Chung and Imene Kerboua and Márton Kardos and Ashwin Mathur and David Stap and Jay Gala and Wissam Siblini and Dominik Krzemiński and Genta Indra Winata and Saba Sturua and Saiteja Utpala and Mathieu Ciancone and Marion Schaeffer and Gabriel Sequeira and Diganta Misra and Shreeya Dhakal and Jonathan Rystrøm and Roman Solomatin and Ömer Çağatan and Akash Kundu and Martin Bernstorff and Shitao Xiao and Akshita Sukhlecha and Bhavish Pahwa and Rafał Poświata and Kranthi Kiran GV and Shawon Ashraf and Daniel Auras and Björn Plüster and Jan Philipp Harries and Loïc Magne and Isabelle Mohr and Mariya Hendriksen and Dawei Zhu and Hippolyte Gisserot-Boukhlef and Tom Aarsen and Jan Kostkan and Konrad Wojtasik and Taemin Lee and Marek Šuppa and Crystina Zhang and Roberta Rocca and Mohammed Hamdy and Andrianos Michail and John Yang and Manuel Faysse and Aleksei Vatolin and Nandan Thakur and Manan Dey and Dipam Vasani and Pranjal Chitale and Simone Tedeschi and Nguyen Tai and Artem Snegirev and Michael Günther and Mengzhou Xia and Weijia Shi and Xing Han Lù and Jordan Clive and Gayatri Krishnakumar and Anna Maksimova and Silvan Wehrli and Maria Tikhonova and Henil Panchal and Aleksandr Abramov and Malte Ostendorff and Zheng Liu and Simon Clematide and Lester James Miranda and Alena Fenogenova and Guangyu Song and Ruqiya Bin Safi and Wen-Ding Li and Alessia Borghini and Federico Cassano and Hongjin Su and Jimmy Lin and Howard Yen and Lasse Hansen and Sara Hooker and Chenghao Xiao and Vaibhav Adlakha and Orion Weller and Siva Reddy and Niklas Muennighoff},
  publisher = {arXiv},
  journal = {arXiv preprint arXiv:2502.13595},
  year = {2025},
  url = {https://arxiv.org/abs/2502.13595},
  doi = {10.48550/arXiv.2502.13595},
}

@article{muennighoff2022mteb,
  author = {Muennighoff, Niklas and Tazi, Nouamane and Magne, Loïc and Reimers, Nils},
  title = {MTEB: Massive Text Embedding Benchmark},
  publisher = {arXiv},
  journal = {arXiv preprint arXiv:2210.07316},
  year = {2022},
  url = {https://arxiv.org/abs/2210.07316},
  doi = {10.48550/ARXIV.2210.07316},
}
```




