LCQMC

Name: LCQMC
Creator: maas
Published: 2026-05-20 16:15:33
License: No description available

ModelScope Community: updated 2026-05-20, indexed 2024-05-15

Download link:

https://modelscope.cn/datasets/C-MTEB/LCQMC

Resource description:

# Dataset Card for "LCQMC"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)

---
displayName: LCQMC (Large-scale Chinese Question Matching Corpus)
labelTypes:
  - Chinese Corpus
license:
  - LCQMC Custom
paperUrl: https://aclanthology.org/C18-1166.pdf
publishDate: "2018-06-06"
publishUrl: http://icrc.hitsz.edu.cn/info/1037/1146.htm
publisher:
  - Harbin Institute of Technology
  - Alibaba
tags:
  - Chinese
---

# Dataset Introduction

## Overview

Question matching is a fundamental task in QA, which is generally considered a semantic matching task and sometimes a paraphrase recognition task. The goal of this task is to retrieve questions with similar intent to the input query from existing databases. We introduce a large-scale Chinese question matching corpus named LCQMC. Unlike paraphrase corpora, LCQMC is more general as it focuses on intent matching rather than paraphrase.

The corpus contains 260,068 manually annotated question pairs, which are split into three subsets: a training set with 238,766 question pairs, a development set with 8,802 question pairs, and a test set with 12,500 question pairs. We evaluated several state-of-the-art sentence matching methods on this corpus. The experimental results not only verify the excellent quality of LCQMC but also provide reliable baseline performance for further research on this corpus.

## Download Dataset

:modelscope-code[]{type="git"}
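Once the corpus has been downloaded, the split sizes stated in the overview can be sanity-checked and a trivial matcher scored against labelled pairs. The following is a minimal sketch, not the evaluation used in the paper. The field names `sentence1`, `sentence2`, and `label`, the 1/0 label encoding, and the example pairs are all assumptions made for illustration. The character-overlap matcher is a toy baseline and is not one of the sentence matching methods the authors evaluated.

```python
from dataclasses import dataclass

# Split sizes exactly as stated in the dataset overview.
SPLIT_SIZES = {"train": 238_766, "dev": 8_802, "test": 12_500}
TOTAL_PAIRS = 260_068


@dataclass(frozen=True)
class QuestionPair:
    # Field names are hypothetical; check the downloaded files for the real schema.
    sentence1: str
    sentence2: str
    label: int  # assumed encoding: 1 = same intent, 0 = different intent


def split_fractions(sizes):
    """Return each split's share of the total number of pairs."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def char_jaccard(a, b):
    """Jaccard similarity between the character sets of two strings."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def predict(pair, threshold=0.5):
    """Toy matcher: predict 'same intent' when character overlap is high."""
    return int(char_jaccard(pair.sentence1, pair.sentence2) >= threshold)


def accuracy(pairs, threshold=0.5):
    """Fraction of pairs whose predicted label equals the annotated label."""
    if not pairs:
        raise ValueError("accuracy() needs at least one pair")
    correct = sum(predict(p, threshold) == p.label for p in pairs)
    return correct / len(pairs)


if __name__ == "__main__":
    # The three splits should add up to the stated corpus size.
    assert sum(SPLIT_SIZES.values()) == TOTAL_PAIRS
    for name, share in split_fractions(SPLIT_SIZES).items():
        print(f"{name}: {share:.2%}")

    # Invented examples, not taken from the corpus.
    demo = [
        QuestionPair("今天天气怎么样", "今天天气如何", 1),
        QuestionPair("怎么学习英语", "北京有什么好吃的", 0),
    ]
    print("toy accuracy:", accuracy(demo))
```

Surface overlap of this kind tends to miss pairs that share an intent but few characters, which is the kind of case an intent-matching corpus is meant to cover.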

Provider:

maas

Created:

2024-09-06

Collection summary

Dataset introduction