INESCO Dataset: Indonesian Expressive Speech Corpus for Emotion Speech Synthesis

Mendeley Data, indexed 2026-07-04

Download link:

https://data.mendeley.com/datasets/hzrznx3xs5

Description:

The INESCO dataset is an Indonesian expressive speech corpus. It consists of Indonesian-language emotional speech covering three emotions (happiness, anger, and sadness) and includes an Indonesian speech database together with its transcriptions. The sentences used to build the database were selected according to phonetic balance criteria that conform to the International Phonetic Alphabet (IPA) standard. The sentence database consists of 600 sentences in three lengths: short sentences of 1 or 2 words, medium sentences of 3 to 7 words, and long sentences of 8 to 11 words. All sentences are saved in Excel files, and the recorded audio for each sentence is saved in a .wav file. The corpus was recorded in a studio or soundproof room with separate recording and control rooms to minimize background noise. It was recorded by four professional speakers, two female and two male. The selected voice artists are professional theater artists who are members of the Indonesian Arts Council, namely Teater Api from Surabaya, East Java, and Teater Payung Hitam from Bandung, West Java. The voice artists were selected on two criteria: a regional accent was permitted, and each had more than five years of experience as a theater artist. The INESCO dataset contains 2,400 audio files (600 sentences × 4 speakers), totaling 1.33 GB. The total duration of the corpus is 3 hours, 1 minute, and 49 seconds, which includes approximately 1 hour, 30 minutes, and 2 seconds of male voice and approximately 1 hour, 32 minutes, and 47 seconds of female voice. The database is carefully curated to support research in speech synthesis, speech recognition, and other areas of natural language processing.
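The sentence-length categories and the file count described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_sentence_length`, the whitespace-based word count, and the sample sentences are hypothetical and are not taken from the corpus or its tooling.

```python
# Word-count ranges as stated in the description: short 1-2, medium 3-7, long 8-11.
LENGTH_RANGES = {
    "short": (1, 2),
    "medium": (3, 7),
    "long": (8, 11),
}

NUM_SENTENCES = 600  # sentences in the database
NUM_SPEAKERS = 4     # two female and two male speakers
EXPECTED_AUDIO_FILES = NUM_SENTENCES * NUM_SPEAKERS  # one .wav per sentence per speaker


def classify_sentence_length(sentence: str) -> str:
    """Return 'short', 'medium', or 'long' based on a whitespace word count.

    Assumption: words are separated by whitespace. The corpus authors may
    have counted words differently.
    """
    word_count = len(sentence.split())
    for label, (low, high) in LENGTH_RANGES.items():
        if low <= word_count <= high:
            return label
    raise ValueError(f"word count {word_count} is outside the 1-11 range")


if __name__ == "__main__":
    # Illustrative sentences only; they are not drawn from the corpus.
    for text in [
        "Selamat pagi",
        "Saya pergi ke pasar hari ini",
        "Kami akan berangkat ke Bandung besok pagi bersama keluarga",
    ]:
        print(classify_sentence_length(text), "|", text)
    print("expected audio files:", EXPECTED_AUDIO_FILES)
```

A check like this could be run over the transcription spreadsheet after download to confirm that every sentence falls into one of the three categories and that the number of .wav files matches the expected count.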

Created:

2026-06-11