LCQMC
Source: arXiv. Indexed 2025-09-30.
Download link:
http://icrc.hitsz.edu.cn/info/1037/1146.htm
Resource description:
LCQMC is a large-scale, open-source Chinese question matching corpus constructed from Baidu Zhidao. The data split follows the settings in the original paper, and sentence length is limited to 50 characters. The target task is question matching.
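The 50-character limit can be illustrated with a minimal sketch. It assumes each row is tab-separated as `question_a`, `question_b`, `label`, and that over-long sentences are truncated rather than dropped. Neither detail is stated in this entry, and the names `QuestionPair` and `parse_pairs` are hypothetical.

```python
from typing import Iterable, List, NamedTuple

MAX_CHARS = 50  # length limit stated in the resource description


class QuestionPair(NamedTuple):
    question_a: str
    question_b: str
    label: int  # assumed encoding: 1 = matching questions, 0 = not matching


def parse_pairs(lines: Iterable[str], max_chars: int = MAX_CHARS) -> List[QuestionPair]:
    """Parse tab-separated rows and cap each sentence at max_chars characters.

    The three-column layout and the truncation policy are assumptions made
    for illustration; check them against the files you actually download.
    """
    pairs: List[QuestionPair] = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # skip malformed rows
        question_a, question_b, label = fields
        if label not in ("0", "1"):
            continue  # skip header rows or unexpected labels
        pairs.append(
            QuestionPair(question_a[:max_chars], question_b[:max_chars], int(label))
        )
    return pairs
```

In practice the rows would come from an opened file, for example `parse_pairs(open(path, encoding="utf-8"))`, where `path` points at a local copy of one of the splits.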
Provider:
Baidu Knows



