CyberCan lexicon
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://doi.org/10.7910/DVN/N6BRRS
Description:
Text mining is a dominant approach to extracting useful information from massive amounts of unstructured online data. However, existing Chinese word segmentation tools are not well suited to processing social media text in Cantonese. This project developed CyberCan, a lexicon of contemporary Cantonese based on more than 100 million pieces of internet text. Details on how the lexicon was created can be found here: https://osf.io/preprints/socarxiv/tyjr7
Created:
2022-09-16



