LibrispeechMaleFemale in WebDataset Format
Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/14641592
Description:
This dataset is the LibriSpeech dataset, restricted to the splits {train-clean-100, dev-clean, test-clean} and packaged in the WebDataset format. WebDataset shards are plain tar archives in which each example is a pair of files sharing a base name: an audio file (FLAC in the listing below) and a JSON metadata file. The JSON file holds the class label (the speaker's gender) together with the speaker ID and the transcript of that audio sample.
$ tar tvf train-clean-100-000000.tar | head
-r--r--r-- bigdata/bigdata 203670 2025-01-14 04:25 5561-39621-0030.flac
-r--r--r-- bigdata/bigdata    219 2025-01-14 04:25 5561-39621-0030.json
-r--r--r-- bigdata/bigdata 361438 2025-01-14 04:25 5561-39621-0043.flac
-r--r--r-- bigdata/bigdata    311 2025-01-14 04:25 5561-39621-0043.json
-r--r--r-- bigdata/bigdata  57964 2025-01-14 04:25 5561-39621-0000.flac
-r--r--r-- bigdata/bigdata     64 2025-01-14 04:25 5561-39621-0000.json
-r--r--r-- bigdata/bigdata 190735 2025-01-14 04:25 5561-39621-0028.flac
-r--r--r-- bigdata/bigdata    190 2025-01-14 04:25 5561-39621-0028.json
-r--r--r-- bigdata/bigdata  58973 2025-01-14 04:25 5561-39621-0037.flac
-r--r--r-- bigdata/bigdata     86 2025-01-14 04:25 5561-39621-0037.json
$ cat 5561-39621-0030.json | jq .
{
  "speaker": 5561,
  "gender": "F",
  "trans": "THEN THE GENERAL SENT ME BACK THE LETTER BY AN AIDE DE CAMP INFORMING ME THAT IF I WERE FOUND THE NEXT DAY WITHIN THE CIRCUMSCRIPTION OF HIS COMMAND HE WOULD HAVE ME ARRESTED"
}
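Because a shard is an ordinary tar archive, the audio/metadata pairs can be read with nothing beyond the Python standard library. The following is a minimal sketch, not part of the dataset's own tooling: the function name `iter_samples` is hypothetical, and it assumes every sample consists of exactly one `.flac` and one `.json` member sharing a base name, as in the listing above. It returns the raw FLAC bytes undecoded; decoding them would need an audio library of your choice.

```python
import json
import os
import tarfile


def iter_samples(source):
    """Yield (key, flac_bytes, metadata) for each sample in one shard.

    `source` is either a path to a .tar shard or a binary file object.
    Assumption: each sample is a <key>.flac / <key>.json pair.
    """
    if isinstance(source, (str, os.PathLike)):
        kwargs = {"name": source}
    else:
        kwargs = {"fileobj": source}

    pending = {}  # key -> {extension: raw bytes} for samples still incomplete
    with tarfile.open(mode="r", **kwargs) as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Split "5561-39621-0030.flac" into key and extension.
            key, _, ext = member.name.rpartition(".")
            if not key:
                continue  # member without an extension, not part of a sample
            parts = pending.setdefault(key, {})
            parts[ext] = tar.extractfile(member).read()
            if "flac" in parts and "json" in parts:
                del pending[key]
                yield key, parts["flac"], json.loads(parts["json"])


def count_genders(source):
    """Count samples per value of the "gender" field in one shard."""
    counts = {}
    for _key, _audio, meta in iter_samples(source):
        counts[meta["gender"]] = counts.get(meta["gender"], 0) + 1
    return counts
```

For example, `count_genders("train-clean-100-000000.tar")` would tally the gender labels in the shard listed above, assuming that file is in the working directory.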
Created:
2025-01-24



