Bangla Voice Dataset: Simple, Complex, and Compound Structures
Source: DataCite Commons | Updated: 2025-05-01 | Indexed: 2025-05-17
Download link:
https://data.mendeley.com/datasets/2wn7c48dtp
Description:
The dataset is a comprehensive resource designed for linguistic analysis, natural language processing (NLP), and speech recognition tasks specifically tailored for the Bangla language. It comprises the following key features:
Textual Data:
Sentence Types: The corpus includes a balanced collection of simple, complex, and compound sentences, carefully curated to represent diverse syntactic structures and real-world language usage in Bangla.
Diversity: Sentences cover a wide range of topics and contexts, ensuring linguistic richness and variety.
Voice Data:
Audio Recordings: Each sentence is paired with a high-quality voice recording by a native Bangla speaker, capturing accurate pronunciation, intonation, and regional linguistic nuances.
Annotation:
Sentence Labeling: Each sentence is tagged as simple, complex, or compound, aiding in syntactic analysis and supervised learning applications.
Applications:
Speech Recognition and Synthesis: Ideal for training and evaluating speech-to-text and text-to-speech systems for Bangla.
Language Modeling: Supports NLP tasks such as machine translation, sentiment analysis, and syntactic parsing.
Educational Use: Useful for linguistic research, Bangla grammar teaching, and phonetic studies.
Compliance: The dataset adheres to ethical guidelines, ensuring informed consent from all contributors.
This dataset serves as a valuable asset for researchers, developers, and educators seeking to advance technologies and studies involving the Bangla language.
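Because each sentence carries one of three labels and is paired with a recording, a supervised experiment would typically keep the label proportions equal across its train and test sets. The following is a minimal sketch of such a stratified split. It assumes nothing about the actual file layout of the download: the `Utterance` record, its field names, the audio paths, and the `stratified_split` helper are all hypothetical and used only for illustration.

```python
import random
from collections import defaultdict
from dataclasses import dataclass

# The three sentence types named in the dataset description.
LABELS = ("simple", "complex", "compound")


@dataclass(frozen=True)
class Utterance:
    """Hypothetical record for one sentence/recording pair."""
    text: str        # the Bangla sentence
    audio_path: str  # assumed path to the paired recording
    label: str       # one of LABELS


def stratified_split(items, test_fraction=0.2, seed=0):
    """Split items into (train, test), preserving label proportions."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item in items:
        if item.label not in LABELS:
            raise ValueError(f"unexpected label: {item.label!r}")
        by_label[item.label].append(item)

    train, test = [], []
    for label in LABELS:
        group = by_label[label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Placeholder records; real text and paths would come from the downloaded files.
records = [
    Utterance(f"sentence {i}", f"audio/{label}_{i}.wav", label)
    for label in LABELS
    for i in range(10)
]
train, test = stratified_split(records)
```

Splitting within each label group, rather than over the whole corpus at once, keeps a classifier's evaluation from being skewed by an uneven draw of sentence types.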
Provider:
Mendeley Data
Created:
2024-12-09



