Token Tagging Dataset
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/mhagglun/google-research-cuBERT
Description:
This dataset is derived from the Python dataset released by Kanade et al. (2020) and targets token-level tagging. It consists of samples filtered from the CodeSearchNet dataset, each annotated using Python's built-in tokenize module.
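To make the annotation step concrete, here is a minimal sketch of how Python's tokenize module can assign a type label to each token in a snippet. The function name `tag_tokens`, the choice to skip layout-only tokens, and the use of coarse token type names as tags are assumptions for illustration; the dataset's actual label scheme and preprocessing are not described here and may differ.

```python
import io
import tokenize
from typing import List, Tuple

# Layout-only token types that carry no meaningful source text.
# Skipping them is an assumption, not a documented property of the dataset.
_LAYOUT_TYPES = {
    tokenize.NEWLINE,
    tokenize.NL,
    tokenize.INDENT,
    tokenize.DEDENT,
    tokenize.ENDMARKER,
}


def tag_tokens(source: str) -> List[Tuple[str, str]]:
    """Return (token_text, token_type_name) pairs for a Python snippet.

    Hypothetical helper: pairs each token with the coarse type name
    reported by the standard tokenize module (NAME, OP, NUMBER, ...).
    """
    tagged = []
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type in _LAYOUT_TYPES:
            continue
        tagged.append((tok.string, tokenize.tok_name[tok.type]))
    return tagged


if __name__ == "__main__":
    for text, tag in tag_tokens("x = 1 + y\n"):
        print(f"{text!r}\t{tag}")
```

Note that tokenize reports keywords such as `def` or `return` as NAME tokens, so a tagging scheme that separates keywords from identifiers would need an extra check, for example with the standard `keyword` module.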



