eduhk-compling/GroupE_groupproject_DayoWong_StandUp_Comedy
Hugging Face | Updated 2026-04-25 | Indexed 2026-05-10
Download link:
https://hf-mirror.com/datasets/eduhk-compling/GroupE_groupproject_DayoWong_StandUp_Comedy
Description:
This corpus annotates the last Cantonese word of every sentence in Dayo Wong's 1999 stand-up set (“黃子華 Dayo 1999 棟篤笑 拾下拾下”), together with the structure of the jokes. If the last word of a sentence is in English, it is excluded and the relevant tags are set to </na>. Pitch is annotated on a 1-5 scale, following the numeric tone system of Chao, Y.-R. (1947), Cantonese Primer (Cambridge, Mass.: Harvard University Press). The other tag categories are:
tones </rising> </level> </falling>: the movement of the voice pitch.
duration </elongation> </truncation> </nochange>: the duration of the sound in the recording.
structure of joke </setup> </misdirection> </punchline> </tag> </transition>: the part of the joke the sentence belongs to.
pic </name>: the person responsible for annotating the segment.
check1 </initials1>: the first person who checked the segment.
check2 </initials2>: the second person who checked the segment.
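As a rough illustration of the tag inventory described above, the following Python sketch groups the markers found in one annotated string by category. It assumes the markers appear in the data literally as written in the description (for example `</rising>`); the actual file layout is not stated here, so the function name `classify_tags`, the group names, and the example string are all hypothetical.

```python
import re

# Tag inventory copied from the description; the group names are labels
# chosen for this sketch, not field names confirmed by the dataset.
TAG_GROUPS = {
    "tone": ("rising", "level", "falling"),
    "duration": ("elongation", "truncation", "nochange"),
    "structure": ("setup", "misdirection", "punchline", "tag", "transition"),
}

# Matches the closing-style markers used in the description, e.g. </rising>.
TAG_PATTERN = re.compile(r"</([A-Za-z0-9]+)>")


def classify_tags(annotated):
    """Group the markers found in one annotated string by category."""
    grouped = {group: [] for group in TAG_GROUPS}
    grouped["excluded"] = []  # </na>: the last word was English
    grouped["other"] = []     # markers outside the three listed categories
    for name in TAG_PATTERN.findall(annotated):
        if name == "na":
            grouped["excluded"].append(name)
            continue
        for group, members in TAG_GROUPS.items():
            if name in members:
                grouped[group].append(name)
                break
        else:
            # No category matched (e.g. annotator or checker markers).
            grouped["other"].append(name)
    return grouped


# Hypothetical annotated string, not taken from the corpus.
example = "word</falling></elongation></punchline>"
print(classify_tags(example))
```

Markers such as `</name>`, `</initials1>`, and `</initials2>` land in the "other" bucket here, since they record who annotated or checked a segment rather than a property of the word.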
Provider:
eduhk-compling



