Cornell-rich
Source: arXiv. Indexed: 2025-09-30
Download link:
https://convokit.cornell.edu/documentation/movie.html
Description:
Cornell-rich adds annotations to the Cornell Movie-Dialogs Corpus, focusing on the distinguishing characteristics of featured speaking characters, and pairs them with movie metadata. Alongside its rich manual annotations, the dataset includes automatically collected movie metadata such as age ranges, occupations, and character descriptions. It covers 863 speakers and 1.357 million utterances. The intended task is personalized language modeling for on-screen characters.
Provided by:
Authors of the paper



