Chinese Vocabulary Dataset
Source: arXiv. Indexed 2025-09-30.
Download link:
https://pslcdatashop.web.cmu.edu/DatasetInfo?datasetId=213
Description:
This dataset contains 1,105 unique Chinese characters, each paired with its corresponding English and Pinyin responses for representation learning. The characters have also been rendered as 16×16 pixel images for the same purpose. The data cover 94 students and a total of 61,323 student-item interaction records. The task is to learn representations of the Chinese characters.
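As a rough illustration of what a 16×16 image representation looks like as model input, the sketch below builds a crude bitmap and flattens it into a 256-element feature vector. The listing does not state which font, binarization, or file format the dataset uses, so the glyph here (a hand-drawn approximation of 十) and the helper names `make_cross_glyph` and `flatten` are assumptions for illustration only.

```python
# Hypothetical sketch: the dataset's actual rendering pipeline is not
# described in the listing, so this bitmap is hand-built, not loaded.
SIZE = 16  # image side length stated in the description


def make_cross_glyph(size=SIZE):
    """Build a size x size binary bitmap with one horizontal and one
    vertical stroke, a crude stand-in for the character 十."""
    mid = size // 2
    return [
        [1 if (r == mid or c == mid) else 0 for c in range(size)]
        for r in range(size)
    ]


def flatten(bitmap):
    """Flatten a 2D bitmap row by row into a 1D feature vector."""
    return [px for row in bitmap for px in row]


glyph = make_cross_glyph()
features = flatten(glyph)

# Print the bitmap so the strokes are visible in a terminal.
for row in glyph:
    print("".join("#" if px else "." for px in row))
print(len(features))
```

A flat vector like `features` is one simple input form for a representation learner; a convolutional model would instead keep the 16×16 grid as is.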



