
MikhailT/speaker-embeddings

Hugging Face, updated 2023-09-22, indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/MikhailT/speaker-embeddings
Resource description:
---
configs:
- config_name: speakers
  version: 1.0.0
  data_files: data/speakers.jsonl
- config_name: models
  version: 1.0.0
  data_files: data/models.jsonl
- config_name: datasets
  version: 1.0.0
  data_files: data/datasets.jsonl
- config_name: dataset_utterances
  version: 1.0.0
  data_files:
  - split: aru
    path: data/aru/utterances*.jsonl
  - split: cmu_arctic
    path: data/cmu_arctic/utterances*.jsonl
- config_name: utterance_embeddings
  version: 1.0.0
  data_files:
  - split: aru
    path: data/aru/utterance_embeddings*.jsonl
  - split: cmu_arctic
    path: data/cmu_arctic/utterance_embeddings*.jsonl
- config_name: speaker_embeddings
  version: 1.0.0
  data_files:
  - split: speechbrain_spkrec_xvect_voxceleb
    path: data/*/speaker_embeddings_*001.jsonl
  - split: speechbrain_spkrec_ecapa_voxceleb
    path: data/*/speaker_embeddings_*002.jsonl
  - split: speechbrain_spkrec_xvect_voxceleb_mean
    path: data/*/speaker_embeddings_mean_001.jsonl
  - split: speechbrain_spkrec_ecapa_voxceleb_mean
    path: data/*/speaker_embeddings_mean_002.jsonl
  - split: speechbrain_spkrec_xvect_voxceleb_sets
    path: data/*/speaker_embeddings_sets_001.jsonl
  - split: speechbrain_spkrec_ecapa_voxceleb_sets
    path: data/*/speaker_embeddings_sets_002.jsonl
dataset_info:
- config_name: speakers
  features:
  - name: id
    dtype: string
  - name: name
    dtype: string
  - name: lang
    dtype: string
  - name: sex
    dtype: string
  - name: age
    dtype: int32
  - name: country
    dtype: string
  - name: accent
    dtype: string
- config_name: models
  features:
  - name: id
    dtype: string
  - name: name
    dtype: string
  - name: size
    dtype: int32
  - name: sample_rate
    dtype: int32
- config_name: datasets
  features:
  - name: id
    dtype: string
  - name: name
    dtype: string
  - name: sample_rate
    dtype: int32
- config_name: dataset_utterances
  features:
  - name: id
    dtype: string
  - name: name
    dtype: string
  - name: duration
    dtype: float32
  - name: speaker_id
    dtype: string
  - name: dataset_id
    dtype: string
- config_name: utterance_embeddings
  features:
  - name: speaker_id
    dtype: string
  - name: file_id
    dtype: string
  - name: dataset_id
    dtype: string
  - name: model_id
    dtype: string
  - name: embedding
    sequence: float32
- config_name: speaker_embeddings
  features:
  - name: speaker_id
    dtype: string
  - name: model_id
    dtype: string
  - name: set
    dtype: string
  - name: embedding
    sequence: float32
pretty_name: Speaker Embeddings
---
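
The configuration names and split names in this description are what a loader needs to address a specific table. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository is reachable. `SPLITS` and `load_config` are hypothetical helper names, not part of the dataset, and the "train" split used for the single-file configurations is an assumption about the library's default behaviour rather than something the description states.

```python
from typing import Optional

REPO_ID = "MikhailT/speaker-embeddings"

# Config name -> split names, copied from the dataset description.
# Configs with a single data file declare no split; "train" is the
# assumed default split name in that case.
SPLITS = {
    "speakers": ["train"],
    "models": ["train"],
    "datasets": ["train"],
    "dataset_utterances": ["aru", "cmu_arctic"],
    "utterance_embeddings": ["aru", "cmu_arctic"],
    "speaker_embeddings": [
        "speechbrain_spkrec_xvect_voxceleb",
        "speechbrain_spkrec_ecapa_voxceleb",
        "speechbrain_spkrec_xvect_voxceleb_mean",
        "speechbrain_spkrec_ecapa_voxceleb_mean",
        "speechbrain_spkrec_xvect_voxceleb_sets",
        "speechbrain_spkrec_ecapa_voxceleb_sets",
    ],
}


def load_config(config_name: str, split: Optional[str] = None):
    """Load one configuration (hypothetical helper, validates names first)."""
    if config_name not in SPLITS:
        raise KeyError(f"unknown config: {config_name}")
    if split is not None and split not in SPLITS[config_name]:
        raise ValueError(f"unknown split {split!r} for config {config_name!r}")
    # Imported lazily so the name checks above work without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split=split)
```

For example, `load_config("speaker_embeddings", "speechbrain_spkrec_ecapa_voxceleb_mean")` would request the split backed by `data/*/speaker_embeddings_mean_002.jsonl`.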
Provider:
MikhailT
Summary of original information

Dataset overview

Dataset configurations

  • speakers

    • Version: 1.0.0
    • Data file: data/speakers.jsonl
    • Features:
      • id: string
      • name: string
      • lang: string
      • sex: string
      • age: integer
      • country: string
      • accent: string
  • models

    • Version: 1.0.0
    • Data file: data/models.jsonl
    • Features:
      • id: string
      • name: string
      • size: integer
      • sample_rate: integer
  • datasets

    • Version: 1.0.0
    • Data file: data/datasets.jsonl
    • Features:
      • id: string
      • name: string
      • sample_rate: integer
  • dataset_utterances

    • Version: 1.0.0
    • Data files:
      • aru: data/aru/utterances*.jsonl
      • cmu_arctic: data/cmu_arctic/utterances*.jsonl
    • Features:
      • id: string
      • name: string
      • duration: float
      • speaker_id: string
      • dataset_id: string
  • utterance_embeddings

    • Version: 1.0.0
    • Data files:
      • aru: data/aru/utterance_embeddings*.jsonl
      • cmu_arctic: data/cmu_arctic/utterance_embeddings*.jsonl
    • Features:
      • speaker_id: string
      • file_id: string
      • dataset_id: string
      • model_id: string
      • embedding: sequence of floats
  • speaker_embeddings

    • Version: 1.0.0
    • Data files:
      • speechbrain_spkrec_xvect_voxceleb: data/*/speaker_embeddings_*001.jsonl
      • speechbrain_spkrec_ecapa_voxceleb: data/*/speaker_embeddings_*002.jsonl
      • speechbrain_spkrec_xvect_voxceleb_mean: data/*/speaker_embeddings_mean_001.jsonl
      • speechbrain_spkrec_ecapa_voxceleb_mean: data/*/speaker_embeddings_mean_002.jsonl
      • speechbrain_spkrec_xvect_voxceleb_sets: data/*/speaker_embeddings_sets_001.jsonl
      • speechbrain_spkrec_ecapa_voxceleb_sets: data/*/speaker_embeddings_sets_002.jsonl
    • Features:
      • speaker_id: string
      • model_id: string
      • set: string
      • embedding: sequence of floats
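
The utterance_embeddings and speaker_embeddings configurations share the `speaker_id`, `model_id`, and `embedding` fields, so utterance-level vectors can be pooled into one vector per speaker and compared. The sketch below shows one plausible way to do that with plain Python: averaging per (speaker_id, model_id) and scoring with cosine similarity. Whether the `_mean` splits were actually produced this way is not stated in the source. The function names, the sample IDs, and the two-dimensional vectors are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def mean_speaker_embeddings(utterance_rows):
    """Average utterance embeddings per (speaker_id, model_id) pair."""
    sums = {}
    counts = defaultdict(int)
    for row in utterance_rows:
        key = (row["speaker_id"], row["model_id"])
        vec = row["embedding"]
        if key not in sums:
            sums[key] = [0.0] * len(vec)
        elif len(sums[key]) != len(vec):
            raise ValueError(f"embedding size mismatch for {key}")
        for i, value in enumerate(vec):
            sums[key][i] += value
        counts[key] += 1
    return {key: [s / counts[key] for s in total] for key, total in sums.items()}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical rows shaped like utterance_embeddings records.
SAMPLE_ROWS = [
    {"speaker_id": "spk_a", "file_id": "u1", "dataset_id": "d1",
     "model_id": "001", "embedding": [1.0, 0.0]},
    {"speaker_id": "spk_a", "file_id": "u2", "dataset_id": "d1",
     "model_id": "001", "embedding": [0.0, 1.0]},
    {"speaker_id": "spk_b", "file_id": "u3", "dataset_id": "d1",
     "model_id": "001", "embedding": [2.0, 2.0]},
]

means = mean_speaker_embeddings(SAMPLE_ROWS)
# Both mean vectors point in the same direction, so the score is close to 1.0.
score = cosine_similarity(means[("spk_a", "001")], means[("spk_b", "001")])
```

Keying on `model_id` as well as `speaker_id` keeps vectors from different embedding models apart, since their dimensions and geometry are not comparable.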

Dataset name

  • pretty_name: Speaker Embeddings