
typhoon-ai/isan-phonetic-dictionary

Source: Hugging Face. Updated 2025-11-26. Indexed 2026-02-07.
Download link:
https://hf-mirror.com/datasets/typhoon-ai/isan-phonetic-dictionary
Resource description:
---
language:
- th
tags:
- phonetics
- isan
- thai-dialect
- linguistics
- ipa
license: apache-2.0
size_categories:
- 1K<n<10K
pretty_name: Isan Phonetic Dictionary
---

# Isan Phonetic Dictionary Dataset

## Dataset Description

- **Homepage:** [opentyphoon.ai]
- **Point of Contact:** [contact@opentyphoon.ai]

### Summary

This dataset is a phonetic dictionary focused on **Isan (Northeastern Thai)** pronunciations. It is structured to handle linguistic complexities such as:

1. **Phonetic Variations (เสียงแปร):** Words that have multiple valid pronunciations without changing the meaning.
2. **Homographs (คำพ้องรูป):** Words that are spelled the same but have different pronunciations and meanings depending on the context.

The data is provided in **TSV (Tab-Separated Values)** format.

### Supported Tasks

- **Grapheme-to-Phoneme (G2P):** Converting Thai script to phonetic transcriptions (IPA).
- **Dialect Analysis:** Studying phonological differences between Standard Thai and Isan.
- **Speech Synthesis (TTS):** Providing accurate pronunciation data for Isan dialect synthesis.

## Dataset Structure

### Data Instances

A typical instance in the dataset represents a word, its primary transcription, potential variations, and context if applicable.

| word | phonetic_transcription | variation | homograph_context |
| :--- | :--- | :--- | :--- |
| รัก | h a k ˦ | l a k ˦ | |
| ย่าง | j aː ŋ ˧˩ | | วิธีทำอาหารแบบหนึ่ง เช่น ไก่ย่าง |
| ย่าง | ɲ aː ŋ ˦ | | ย่างเดิน |
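
As an illustration of how the four columns could be consumed for G2P lookup, here is a minimal Python sketch. It assumes the TSV file carries a header row with exactly the column names shown in the table, and that phonemes within a transcription are separated by spaces; neither is confirmed beyond the sample rows. The `load_lexicon` helper and the `SAMPLE_TSV` string are hypothetical names introduced for this example, and the sample rows are copied from the table rather than read from the actual file.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the data-instance table. The header row and the
# exact column order are assumptions about the real TSV file.
SAMPLE_TSV = (
    "word\tphonetic_transcription\tvariation\thomograph_context\n"
    "รัก\th a k ˦\tl a k ˦\t\n"
    "ย่าง\tj aː ŋ ˧˩\t\tวิธีทำอาหารแบบหนึ่ง เช่น ไก่ย่าง\n"
    "ย่าง\tɲ aː ŋ ˦\t\tย่างเดิน\n"
)


def load_lexicon(tsv_text):
    """Map each word to a list of entries (hypothetical helper).

    A word with several rows is treated as a homograph: each row becomes
    its own entry, distinguished by its context string.
    """
    lexicon = defaultdict(list)
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    for row in reader:
        # The primary transcription comes first; a non-empty variation is
        # an alternative pronunciation with the same meaning.
        pronunciations = [row["phonetic_transcription"].split()]
        if row["variation"]:
            pronunciations.append(row["variation"].split())
        lexicon[row["word"]].append(
            {
                "pronunciations": pronunciations,
                "context": row["homograph_context"] or None,
            }
        )
    return dict(lexicon)


lexicon = load_lexicon(SAMPLE_TSV)

for word, entries in lexicon.items():
    kind = "homograph" if len(entries) > 1 else "single entry"
    print(word, kind, entries)
```

Keeping one entry per row, rather than merging rows by word, preserves the distinction the dataset draws between phonetic variations (same meaning, stored in `variation`) and homographs (different meaning, stored as separate rows with `homograph_context`).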
Provider:
typhoon-ai