mteb/CodeSearchNet-ccr
Source: Hugging Face. Updated 2024-08-05; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/mteb/CodeSearchNet-ccr
Description:
This dataset provides corpus, query, and relevance-score data for several programming languages, such as Go, Java, JavaScript, PHP, Python, and Ruby. Each language configuration has three parts: a corpus, queries, and relevance scores. Corpus records include fields such as ID, title, partition, text, language, and meta-information; query records have similar fields; relevance-score records hold a query ID, a corpus ID, and a score. The data is typically split into training, validation, and test sets.
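The three-part layout described above can be illustrated with a small in-memory sketch. The records, the field names (`id`, `query_id`, `corpus_id`, `meta_information`, and so on), and the helper functions are hypothetical and chosen only to mirror the description; the actual column names and configuration names in the dataset may differ.

```python
from collections import defaultdict

# Hypothetical records mirroring the corpus / queries / relevance-score parts.
# Field names are assumptions for illustration, not the dataset's confirmed schema.
corpus = [
    {"id": "c1", "title": "add", "partition": "test",
     "text": "def add(a, b):\n    return a + b",
     "language": "python", "meta_information": {}},
    {"id": "c2", "title": "sub", "partition": "test",
     "text": "def sub(a, b):\n    return a - b",
     "language": "python", "meta_information": {}},
]
queries = [
    {"id": "q1", "title": "", "partition": "test",
     "text": "Return the sum of two numbers.",
     "language": "python", "meta_information": {}},
]
qrels = [
    {"query_id": "q1", "corpus_id": "c1", "score": 1},
]


def build_relevance_index(qrels_rows):
    """Map each query ID to a {corpus ID: score} dictionary."""
    index = defaultdict(dict)
    for row in qrels_rows:
        index[row["query_id"]][row["corpus_id"]] = row["score"]
    return dict(index)


def relevant_documents(query_id, index, corpus_rows):
    """Return corpus records judged relevant (score > 0) for a query."""
    by_id = {doc["id"]: doc for doc in corpus_rows}
    judged = index.get(query_id, {})
    return [by_id[cid] for cid, score in judged.items()
            if score > 0 and cid in by_id]


index = build_relevance_index(qrels)
hits = relevant_documents("q1", index, corpus)
print([doc["id"] for doc in hits])
```

Keeping relevance scores in a separate table keyed by query ID and corpus ID lets one query point to any number of corpus entries without duplicating text, which is why the sketch joins the three parts at lookup time.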
Provider:
mteb



