NandemoGHS/Japanese-Eroge-Voice

Name: NandemoGHS/Japanese-Eroge-Voice
Creator: NandemoGHS
Published: 2025-08-31 15:17:43
License: No description provided

Source: Hugging Face. Updated 2025-08-31. Indexed 2026-01-03.

Download link:

https://hf-mirror.com/datasets/NandemoGHS/Japanese-Eroge-Voice

Resource description:

---
license: mit
task_categories:
- text-to-speech
- automatic-speech-recognition
language:
- ja
tags:
- speech
- audio
- japanese
- asmr
- anime
- voice
pretty_name: Japanese-Eroge-Voice
size_categories:
- 100K<n<1M
---

# Japanese-Eroge-Voice

## Description

This dataset contains pairs of audio data and corresponding transcriptions extracted from Japanese eroge (adult games) that I have personally purchased.

The transcriptions are generated using the **[litagin/anime-whisper](https://huggingface.co/litagin/anime-whisper)** model.

-----

## Preprocessing Steps

The raw audio data has undergone the following preprocessing steps:

1. **Loudness Normalization**: Audio loudness is normalized using **ffmpeg's 2-pass `loudnorm` filter** to target parameters of **-23.0 LUFS integrated loudness, -1.0 dB true peak, and 11.0 LU loudness range (LRA)**.
3. **Transcription**: Each clip is transcribed with **[litagin/anime-whisper](https://huggingface.co/litagin/anime-whisper)** model.
4. **Data Shuffling, Anonymization, and WebDataset Conversion**: The processed data is shuffled, and unique identifiers (UIDs) are **hashed for anonymization**. The data is then packaged into **[WebDataset](https://github.com/webdataset/webdataset)** format. Due to the shuffling and anonymization, it is difficult to reconstruct the original works in their entirety, **aiming to limit the enjoyment of the original copyrighted works under Japanese copyright law.**

-----

## Dataset Format

This dataset is provided in **WebDataset** format. Each `.tar` shard typically contains 1024 clips (the final shard may contain fewer). Every clip consists of three files sharing the same base filename:

```
0a1be9a22ae956c9.flac  (FLAC audio file)
0a1be9a22ae956c9.json  (JSON metadata file)
0a1be9a22ae956c9.txt   (Text transcription file)
...
```

-----

## Dataset Statistics

- **Total duration:** 409.33 hours

-----

## Biases and Limitations

* **NSFW Content**: This dataset is derived from adult games and contains a significant amount of Not-Safe-For-Work (NSFW) content.
* **Gender Bias**: Due to the nature of the source material, the dataset is heavily skewed towards female voices.
* **Potential Transcription Errors**: Transcriptions are generated automatically by an AI model and have not been manually verified. They are likely to contain errors and inaccuracies.

-----

## License

This dataset is licensed under the **[MIT License](https://choosealicense.com/licenses/mit/)**.

**Intended use**: This dataset is primarily designed for **educational and academic research. All use is at your own risk, and you must ensure compliance with applicable law.**

**NO WARRANTY**: This dataset is provided “as is” without any express or implied warranty.

## How to Cite

```bibtex
@misc{Japanese-Eroge-Voice,
  title = {Japanese-Eroge-Voice},
  author = {OmniAICreator},
  year = {2025},
  howpublished = {\url{https://huggingface.co/datasets/NandemoGHS/Japanese-Eroge-Voice}},
}
```
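
As a rough illustration of the shard layout described under Dataset Format, the sketch below groups the members of one `.tar` shard into clips using only Python's standard `tarfile` module. The helper names `iter_clips` and `decode_clip` are hypothetical. The code assumes that the three files of a clip are stored next to each other in the archive, and it makes no assumption about which fields the JSON metadata contains.

```python
import io
import json
import tarfile


def iter_clips(shard):
    """Yield (key, files) for each clip in a WebDataset-style .tar shard.

    `shard` is a filesystem path or a seekable binary file object.
    `files` maps a file extension ("flac", "json", "txt") to raw bytes.
    Assumption: the files of one clip are adjacent in the archive.
    """
    kwargs = {"fileobj": shard} if hasattr(shard, "read") else {"name": shard}
    with tarfile.open(mode="r", **kwargs) as tar:
        current_key, files = None, {}
        for member in tar:
            if not member.isfile():
                continue
            # The clip key is everything before the first dot of the base filename.
            directory, _, basename = member.name.rpartition("/")
            stem, _, ext = basename.partition(".")
            key = f"{directory}/{stem}" if directory else stem
            # A new key means the previous clip is complete.
            if key != current_key and files:
                yield current_key, files
                files = {}
            current_key = key
            files[ext] = tar.extractfile(member).read()
        if files:
            yield current_key, files


def decode_clip(files):
    """Decode the text parts of a clip; the FLAC audio stays as raw bytes."""
    return {
        "audio_bytes": files["flac"],
        "metadata": json.loads(files["json"].decode("utf-8")),
        "text": files["txt"].decode("utf-8").strip(),
    }
```

Turning the FLAC bytes into samples needs an audio library, which is left out here. The `webdataset` package linked in the card provides its own loader for this format and may be the more convenient choice for training pipelines.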

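The two-pass loudness normalization listed under Preprocessing Steps could look roughly like the following. This is a sketch under stated assumptions, not the dataset author's actual script: the function names are hypothetical, `linear=true` in the second pass is an assumption, and only the target values (-23.0 LUFS, -1.0 dB true peak, 11.0 LU LRA) come from the card. Calling `normalize` requires `ffmpeg` on `PATH`.

```python
import json
import subprocess

# Targets taken from the dataset card.
TARGET_I, TARGET_TP, TARGET_LRA = -23.0, -1.0, 11.0
_BASE = f"loudnorm=I={TARGET_I}:TP={TARGET_TP}:LRA={TARGET_LRA}"


def first_pass_filter():
    """Filter string for the measurement pass (stats printed as JSON)."""
    return f"{_BASE}:print_format=json"


def parse_loudnorm_json(stderr_text):
    """Extract the flat JSON object that loudnorm prints at the end of stderr."""
    start = stderr_text.rindex("{")
    end = stderr_text.rindex("}") + 1
    return json.loads(stderr_text[start:end])


def second_pass_filter(measured):
    """Filter string for the correction pass, fed with first-pass measurements."""
    return (
        f"{_BASE}"
        f":measured_I={measured['input_i']}"
        f":measured_TP={measured['input_tp']}"
        f":measured_LRA={measured['input_lra']}"
        f":measured_thresh={measured['input_thresh']}"
        f":offset={measured['target_offset']}"
        ":linear=true"  # assumption: the card does not say which mode was used
    )


def normalize(src, dst):
    """Run both passes with ffmpeg; `src` and `dst` are file paths."""
    probe = subprocess.run(
        ["ffmpeg", "-hide_banner", "-nostats", "-i", src,
         "-af", first_pass_filter(), "-f", "null", "-"],
        capture_output=True, text=True, check=True,
    )
    measured = parse_loudnorm_json(probe.stderr)
    subprocess.run(
        ["ffmpeg", "-hide_banner", "-y", "-i", src,
         "-af", second_pass_filter(measured), dst],
        check=True,
    )
```

The first pass only measures the input and discards the audio; the second pass reuses those measurements so the filter can apply a correction based on the whole file.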
Provider:

NandemoGHS
