typhoon-ai/isan-phonetic-dictionary
Source: Hugging Face · Updated 2025-11-26 · Indexed 2026-02-07
Download link:
https://hf-mirror.com/datasets/typhoon-ai/isan-phonetic-dictionary
Resource description:
---
language:
- th
tags:
- phonetics
- isan
- thai-dialect
- linguistics
- ipa
license: apache-2.0
size_categories:
- 1K<n<10K
pretty_name: Isan Phonetic Dictionary
---
# Isan Phonetic Dictionary Dataset
## Dataset Description
- **Homepage:** opentyphoon.ai
- **Point of Contact:** contact@opentyphoon.ai
### Summary
This dataset is a phonetic dictionary focused on **Isan (Northeastern Thai)** pronunciations. It is structured to handle linguistic complexities such as:
1. **Phonetic Variations (เสียงแปร):** Words that have multiple valid pronunciations without changing the meaning.
2. **Homographs (คำพ้องรูป):** Words that are spelled the same but have different pronunciations and meanings depending on the context.
The data is provided in **TSV (Tab-Separated Values)** format.
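As a minimal sketch of reading this format, the snippet below parses TSV text with Python's standard `csv` module. It assumes the file starts with a header row whose names match the columns in the Data Instances table (`word`, `phonetic_transcription`, `variation`, `homograph_context`); the inline sample and the `read_rows` helper are illustrative, not part of the dataset.

```python
import csv
import io

# Hypothetical two-row excerpt. The header names are an assumption based on
# the columns shown in this card's example table.
SAMPLE_TSV = (
    "word\tphonetic_transcription\tvariation\thomograph_context\n"
    "รัก\th a k ˦\tl a k ˦\t\n"
    "ย่าง\tɲ aː ŋ ˦\t\tย่างเดิน\n"
)

def read_rows(handle):
    """Return one dict per TSV row, keyed by the header names."""
    # QUOTE_NONE leaves any literal quote characters in the data untouched.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)

rows = read_rows(io.StringIO(SAMPLE_TSV))
for row in rows:
    print(row["word"], "->", row["phonetic_transcription"])
```

For a file on disk, the same helper can be given a handle opened with `open(path, encoding="utf-8", newline="")`, where `path` is wherever the TSV was downloaded to.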
### Supported Tasks
- **Grapheme-to-Phoneme (G2P):** Converting Thai script to phonetic transcriptions (IPA).
- **Dialect Analysis:** Studying phonological differences between Standard Thai and Isan.
- **Speech Synthesis (TTS):** Providing accurate pronunciation data for Isan dialect synthesis.
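For G2P or TTS pipelines, a transcription usually needs to be separated into phoneme symbols and tone. The sketch below assumes what the example rows in this card suggest: tokens are separated by spaces, and the tone is written as a token made only of Chao tone letters (˥ ˦ ˧ ˨ ˩). It also assumes a single syllable per entry; the `split_transcription` helper is hypothetical, and multi-syllable entries may need a different treatment.

```python
# Chao tone letters occupy U+02E5..U+02E9.
TONE_LETTERS = set("˥˦˧˨˩")

def split_transcription(transcription):
    """Split a space-separated transcription into (phonemes, tone).

    A token counts as a tone mark when every character in it is a
    Chao tone letter; all other tokens are treated as phonemes.
    """
    tokens = transcription.split()
    phonemes = [t for t in tokens if not set(t) <= TONE_LETTERS]
    tones = [t for t in tokens if set(t) <= TONE_LETTERS]
    return phonemes, "".join(tones)

phonemes, tone = split_transcription("j aː ŋ ˧˩")
print(phonemes, tone)
```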
## Dataset Structure
### Data Instances
Each row represents a word, its primary transcription, an optional variant pronunciation, and, for homographs, a context note that distinguishes the intended meaning.
| word | phonetic_transcription | variation | homograph_context |
| :--- | :--- | :--- | :--- |
| รัก | h a k ˦ | l a k ˦ | |
| ย่าง | j aː ŋ ˧˩ | | วิธีทำอาหารแบบหนึ่ง เช่น ไก่ย่าง |
| ย่าง | ɲ aː ŋ ˦ | | ย่างเดิน |
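One way to work with rows like those above is to group them by word, so that homographs become separate senses and a variation becomes an extra pronunciation of the same sense. This is a sketch only: the `build_lexicon` helper and the output shape are hypothetical, and it assumes the `variation` column holds at most one alternative pronunciation.

```python
from collections import defaultdict

# Rows copied from the example table, as
# (word, phonetic_transcription, variation, homograph_context).
ROWS = [
    ("รัก", "h a k ˦", "l a k ˦", ""),
    ("ย่าง", "j aː ŋ ˧˩", "", "วิธีทำอาหารแบบหนึ่ง เช่น ไก่ย่าง"),
    ("ย่าง", "ɲ aː ŋ ˦", "", "ย่างเดิน"),
]

def build_lexicon(rows):
    """Map each word to a list of senses; homographs yield several senses."""
    lexicon = defaultdict(list)
    for word, transcription, variation, context in rows:
        pronunciations = [transcription]
        if variation:
            # Same meaning, alternative pronunciation.
            pronunciations.append(variation)
        lexicon[word].append(
            {"pronunciations": pronunciations, "context": context}
        )
    return dict(lexicon)

lexicon = build_lexicon(ROWS)
print(len(lexicon["ย่าง"]))  # number of senses for this homograph
```

Keeping the context note next to each sense lets a downstream system choose a pronunciation from the surrounding text instead of always taking the first entry.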
Provided by:
typhoon-ai



