NandemoGHS/Japanese-Eroge-Voice
Source: Hugging Face · Updated: 2025-08-31 · Indexed: 2026-01-03
Download link:
https://hf-mirror.com/datasets/NandemoGHS/Japanese-Eroge-Voice
Resource description:
---
license: mit
task_categories:
- text-to-speech
- automatic-speech-recognition
language:
- ja
tags:
- speech
- audio
- japanese
- asmr
- anime
- voice
pretty_name: Japanese-Eroge-Voice
size_categories:
- 100K<n<1M
---
# Japanese-Eroge-Voice
## Description
This dataset contains pairs of audio data and corresponding transcriptions extracted from Japanese eroge (adult games) that I have personally purchased. The transcriptions are generated using the **[litagin/anime-whisper](https://huggingface.co/litagin/anime-whisper)** model.
-----
## Preprocessing Steps
The raw audio data has undergone the following preprocessing steps:
1. **Loudness Normalization**:
Audio loudness is normalized using **ffmpeg's 2-pass `loudnorm` filter** to target parameters of **-23.0 LUFS integrated loudness, -1.0 dB true peak, and 11.0 LU loudness range (LRA)**.
3. **Transcription**:
Each clip is transcribed with the **[litagin/anime-whisper](https://huggingface.co/litagin/anime-whisper)** model.
4. **Data Shuffling, Anonymization, and WebDataset Conversion**:
The processed data is shuffled, and unique identifiers (UIDs) are **hashed for anonymization**. The data is then packaged into **[WebDataset](https://github.com/webdataset/webdataset)** format. Because of the shuffling and anonymization, it is difficult to reconstruct the original works in their entirety. **This is intended to limit use of the dataset for enjoying the original copyrighted works, as considered under Japanese copyright law.**
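
The loudness normalization step can be sketched as follows. This is a minimal illustration of ffmpeg's two-pass `loudnorm` workflow with the targets listed above, not the exact command used to build this dataset. The function names and the `src`/`dst` paths are hypothetical, and output codec and sample-rate options are left at ffmpeg's defaults.

```python
import json
import subprocess

# Targets stated in this dataset card.
TARGET_I, TARGET_TP, TARGET_LRA = -23.0, -1.0, 11.0
BASE_FILTER = f"loudnorm=I={TARGET_I}:TP={TARGET_TP}:LRA={TARGET_LRA}"


def parse_loudnorm_stats(stderr_text: str) -> dict:
    """Extract the flat JSON object that loudnorm prints at the end of stderr."""
    start = stderr_text.rindex("{")
    end = stderr_text.rindex("}") + 1
    return json.loads(stderr_text[start:end])


def second_pass_filter(stats: dict) -> str:
    """Build the pass-2 filter string from the pass-1 measurements."""
    return (
        f"{BASE_FILTER}"
        f":measured_I={stats['input_i']}"
        f":measured_TP={stats['input_tp']}"
        f":measured_LRA={stats['input_lra']}"
        f":measured_thresh={stats['input_thresh']}"
        f":offset={stats['target_offset']}"
    )


def normalize(src: str, dst: str) -> None:
    """Run both passes with ffmpeg (must be installed and on PATH)."""
    # Pass 1: measure only; audio output is discarded via the null muxer.
    probe = subprocess.run(
        ["ffmpeg", "-hide_banner", "-nostats", "-i", src,
         "-af", BASE_FILTER + ":print_format=json", "-f", "null", "-"],
        capture_output=True, text=True, check=True,
    )
    stats = parse_loudnorm_stats(probe.stderr)
    # Pass 2: apply normalization using the measured values.
    subprocess.run(
        ["ffmpeg", "-hide_banner", "-y", "-i", src,
         "-af", second_pass_filter(stats), dst],
        check=True,
    )
```

Supplying the measured values in the second pass lets `loudnorm` apply a single linear gain where the targets allow it, instead of relying on its single-pass dynamic mode.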
-----
## Dataset Format
This dataset is provided in **WebDataset** format. Each `.tar` shard typically contains 1024 clips (the final shard may contain fewer). Every clip consists of three files sharing the same base filename:
```
0a1be9a22ae956c9.flac (FLAC audio file)
0a1be9a22ae956c9.json (JSON metadata file)
0a1be9a22ae956c9.txt (Text transcription file)
...
```
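
As a rough sketch of how a shard with this layout could be read, the snippet below groups the files of each clip by base filename using only the Python standard library. It assumes that a clip's three files are stored consecutively in the shard, as is conventional for WebDataset; the function name `iter_clips` is hypothetical. The FLAC payload is returned as raw bytes, and the fields inside the JSON metadata are not assumed.

```python
import io
import json
import tarfile


def iter_clips(fileobj):
    """Yield one dict per clip from an uncompressed .tar shard.

    Each dict has "key" (the shared base filename) plus one entry per
    extension, e.g. "flac" (bytes), "json" (parsed), "txt" (str).
    """
    clip = {}
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            key, _, ext = member.name.rpartition(".")
            # A new base filename means the previous clip is complete.
            if clip and clip["key"] != key:
                yield clip
                clip = {}
            clip["key"] = key
            data = tar.extractfile(member).read()
            if ext == "txt":
                clip[ext] = data.decode("utf-8").strip()
            elif ext == "json":
                clip[ext] = json.loads(data)
            else:
                clip[ext] = data  # raw audio bytes; decode with an audio library
    if clip:
        yield clip
```

To use it, open a downloaded shard in binary mode and pass the file object to `iter_clips`. The `webdataset` library linked above can also read these shards directly.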
-----
## Dataset Statistics
- **Total duration:** 409.33 hours
-----
## Biases and Limitations
* **NSFW Content**: This dataset is derived from adult games and contains a significant amount of Not-Safe-For-Work (NSFW) content.
* **Gender Bias**: Due to the nature of the source material, the dataset is heavily skewed towards female voices.
* **Potential Transcription Errors**: Transcriptions are generated automatically by an AI model and have not been manually verified. They are likely to contain errors and inaccuracies.
-----
## License
This dataset is licensed under the **[MIT License](https://choosealicense.com/licenses/mit/)**.
**Intended use**: This dataset is primarily designed for **educational and academic research. All use is at your own risk, and you must ensure compliance with applicable law.**
**NO WARRANTY**: This dataset is provided “as is” without any express or implied warranty.
## How to Cite
```bibtex
@misc{Japanese-Eroge-Voice,
title = {Japanese-Eroge-Voice},
author = {OmniAICreator},
year = {2025},
howpublished = {\url{https://huggingface.co/datasets/NandemoGHS/Japanese-Eroge-Voice}},
}
```
Provider:
NandemoGHS



