NandemoGHS/Japanese-Eroge-Voice-V2
Source: Hugging Face. Updated 2026-01-15. Indexed 2026-02-07.
Download link:
https://hf-mirror.com/datasets/NandemoGHS/Japanese-Eroge-Voice-V2
Description:
This is the successor to the [Japanese-Eroge-Voice](NandemoGHS/Japanese-Eroge-Voice) dataset. It is a significantly larger collection of audio-transcription pairs extracted from Japanese eroge (adult games). This version (V2) has no overlap with the previous version: every audio clip and transcription is new. The total audio duration exceeds 2,600 hours, making it a larger resource for speech synthesis and speech recognition research. To comply with Japanese copyright law, the metadata is anonymized (`source_dataset`, `scene_id`, and `char_id` are hashed) and the data is randomly shuffled. Each transcription comes either from the original game script (`original`) or from AI-based speech recognition (`asr`). The dataset contains NSFW content, is skewed toward female voices, and may contain transcription errors. It is released under the MIT license and is intended for educational and academic research use.
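The description does not say how the metadata is hashed or shuffled. As a rough illustration only, the sketch below shows one common way to get the same effect: a salted SHA-256 digest for each identifier field, followed by a seeded shuffle. The salt, the 16-character truncation, the `text` field, the sample records, and the helper names `hash_id` and `anonymize` are all assumptions for this example and are not taken from the dataset.

```python
import hashlib
import random

# Identifier fields named in the dataset description.
ID_FIELDS = ("source_dataset", "scene_id", "char_id")


def hash_id(value: str, salt: str) -> str:
    """Return a short, stable pseudonym for an identifier (hypothetical scheme)."""
    digest = hashlib.sha256(f"{salt}:{value}".encode("utf-8")).hexdigest()
    return digest[:16]  # truncation length is an arbitrary choice for this sketch


def anonymize(records, salt, seed=0):
    """Hash the identifier fields of each record, then shuffle the record order."""
    out = []
    for rec in records:
        anon = dict(rec)  # copy so the input records are left untouched
        for field in ID_FIELDS:
            anon[field] = hash_id(str(rec[field]), salt)
        out.append(anon)
    random.Random(seed).shuffle(out)  # seeded so the result is reproducible
    return out


# Made-up sample records; "text" is a hypothetical field name.
records = [
    {"source_dataset": "game_a", "scene_id": "s1", "char_id": "c1", "text": "line 1"},
    {"source_dataset": "game_a", "scene_id": "s2", "char_id": "c1", "text": "line 2"},
    {"source_dataset": "game_b", "scene_id": "s1", "char_id": "c2", "text": "line 3"},
]

anonymized = anonymize(records, salt="hypothetical-secret")
for rec in anonymized:
    print(rec["char_id"], rec["text"])
```

Because the same input always maps to the same digest, clips from one character still share a `char_id` after hashing, so grouping by speaker remains possible even though the original identifiers are no longer visible.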
Provider:
NandemoGHS



