
Pranavz/emilia-en-mimi-q8-s4096-smoke-20260326e

Source: Hugging Face. Updated: 2026-03-26. Indexed: 2026-03-29.
Download link:
https://hf-mirror.com/datasets/Pranavz/emilia-en-mimi-q8-s4096-smoke-20260326e
Description:
---
pretty_name: emilia-en-mimi-q8-s4096-smoke-20260326e
language:
- en
task_categories:
- text-to-speech
size_categories:
- n<10K
---

# emilia-en-mimi-q8-s4096-smoke-20260326e

Frozen pretokenized Emilia-English model-ready dataset for TinyAya + Mimi training.

## Layout

- `train/lang=en/*.parquet`
- optional `validation/lang=en/*.parquet`
- optional `test/lang=en/*.parquet`
- `dataset_manifest.json`

## Selection

- source dataset: `amphion/Emilia-Dataset`
- data files: `Emilia/EN/EN-B000000.tar`
- source split: `train`
- quantizers: `8`
- train samples: `8`
- validation samples: `0`
- test samples: `0`
- min seconds: `1.0`
- max seconds: `30.0`

## Audio Codec

- backend: `mimi`
- source: `hf_pretrained`
- model: `kyutai/mimi`
- sample rate: `24000`

## Notes

This repo stores pretokenized training artifacts, not raw audio.
Use `dataset_manifest.json` as the immutable split fingerprint for ablation reproducibility.
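As a rough illustration of how the layout and the manifest note might be used together, the sketch below lists the Parquet shards for each split in a local copy of the repo and hashes `dataset_manifest.json` so a run can record which frozen split it trained on. The helper names `list_split_files` and `manifest_fingerprint` are hypothetical and not part of this dataset. Because the manifest's schema is not documented here, the sketch hashes the file's raw bytes rather than reading any fields.

```python
import hashlib
from pathlib import Path

# Split directories named in the Layout section; validation and test are optional.
SPLITS = ("train", "validation", "test")


def list_split_files(root, lang="en"):
    """Map each split to its sorted Parquet shards under <split>/lang=<lang>/.

    Hypothetical helper: a split whose directory is absent maps to an empty list.
    """
    root = Path(root)
    files = {}
    for split in SPLITS:
        split_dir = root / split / f"lang={lang}"
        files[split] = sorted(split_dir.glob("*.parquet")) if split_dir.is_dir() else []
    return files


def manifest_fingerprint(root):
    """Return the SHA-256 hex digest of dataset_manifest.json's raw bytes.

    Hypothetical helper: the manifest schema is an unknown here, so the file
    is treated as opaque bytes instead of being parsed.
    """
    data = (Path(root) / "dataset_manifest.json").read_bytes()
    return hashlib.sha256(data).hexdigest()
```

Given a local directory `root` holding the repo contents (for example, one fetched with `huggingface_hub.snapshot_download(repo_id=..., repo_type="dataset")`), `list_split_files(root)["train"]` yields the training shards. Logging `manifest_fingerprint(root)` next to each ablation result makes it easy to confirm that two runs used the same split.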
Provider:
Pranavz