Sampling-Multitask-National-Speech-Corpus-v1
Source: ModelScope community. Updated 2025-12-05. Indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/mesolitica/Sampling-Multitask-National-Speech-Corpus-v1
Description:
# Sampling Multitask-National-Speech-Corpus-v1
The original dataset is at https://huggingface.co/datasets/MERaLiON/Multitask-National-Speech-Corpus-v1. This version takes only Part 3 of it and samples from that part.
## How to prepare the dataset
```bash
huggingface-cli download \
mesolitica/Sampling-Multitask-National-Speech-Corpus-v1 \
--include "*.zip" \
--repo-type "dataset" \
--local-dir './'
wget https://gist.githubusercontent.com/huseinzol05/2e26de4f3b29d99e993b349864ab6c10/raw/9b2251f3ff958770215d70c8d82d311f82791b78/unzip.py
python3 unzip.py
```
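
The contents of `unzip.py` are not shown on this page, so the sketch below is not that script. It is a minimal stand-in, assuming the downloaded files are ordinary zip archives that should be extracted into the working directory. The function name `extract_all_zips` and its parameters are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_all_zips(src_dir: str = ".", dest_dir: str = ".") -> list[str]:
    """Extract every *.zip found in src_dir into dest_dir.

    Returns the names of the archives that were processed, in sorted order.
    Assumption: the archives are plain (unencrypted) zip files.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    processed = []
    for archive in sorted(src.glob("*.zip")):
        # Open each archive and unpack all of its members into dest.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
        processed.append(archive.name)
    return processed
```

Calling `extract_all_zips()` with no arguments from the download directory unpacks the archives in place, which matches the `--local-dir './'` layout used in the download command.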
Provider:
maas
Created:
2025-10-04



