codesearchnet
Source: ModelScope community. Updated 2025-11-27; indexed 2025-01-11.
Download link:
https://modelscope.cn/datasets/sentence-transformers/codesearchnet
Description:
# Dataset Card for CodeSearchNet
This dataset is a collection of comment-code pairs drawn from various programming languages. See [code_search_net](https://huggingface.co/datasets/code_search_net) for additional information.
This dataset can be used directly with Sentence Transformers to train embedding models.
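As a rough illustration of what "used directly" can look like, the sketch below loads the `pair` subset and fine-tunes a model on it. Several details are assumptions rather than anything this card states: the Hugging Face repository id `sentence-transformers/codesearchnet`, the `train` split name, the base model `sentence-transformers/all-MiniLM-L6-v2`, and the choice of `MultipleNegativesRankingLoss`. The helper names `check_columns` and `train_on_pairs` are hypothetical.

```python
EXPECTED_COLUMNS = ["comment", "code"]


def check_columns(column_names):
    """Fail early if the loaded subset does not expose the columns this card lists."""
    if list(column_names) != EXPECTED_COLUMNS:
        raise ValueError(
            f"expected columns {EXPECTED_COLUMNS}, got {list(column_names)}"
        )


def train_on_pairs(model_name="sentence-transformers/all-MiniLM-L6-v2"):
    """Sketch of a training run; needs `datasets`, `sentence-transformers` (v3+), and network access."""
    # Imported lazily so the helper above stays usable without these packages.
    from datasets import load_dataset
    from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
    from sentence_transformers.losses import MultipleNegativesRankingLoss

    # Assumption: the `pair` subset is published with a "train" split.
    train_dataset = load_dataset(
        "sentence-transformers/codesearchnet", "pair", split="train"
    )
    check_columns(train_dataset.column_names)

    model = SentenceTransformer(model_name)
    # Assumption: an in-batch-negatives loss; the card does not prescribe a loss.
    loss = MultipleNegativesRankingLoss(model)

    trainer = SentenceTransformerTrainer(
        model=model, train_dataset=train_dataset, loss=loss
    )
    trainer.train()
    return model
```

With this loss, the first column (`comment`) acts as the anchor and the second (`code`) as the positive, which is why the column order is checked before training starts.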
## Dataset Subsets
### `pair` subset
* Columns: "comment", "code"
* Column types: `str`, `str`
* Examples:
```python
{
'comment': 'Computes the new parent id for the node being moved.\n\n@return int',
'code': "protected function parentId()\n\t{\n\t\tswitch ( $this->position )\n\t\t{\n\t\t\tcase 'root':\n\t\t\t\treturn null;\n\n\t\t\tcase 'child':\n\t\t\t\treturn $this->target->getKey();\n\n\t\t\tdefault:\n\t\t\t\treturn $this->target->getParentId();\n\t\t}\n\t}",
}
```
* Collection strategy: Reading the Code Search Net dataset from [embedding-training-data](https://huggingface.co/datasets/sentence-transformers/embedding-training-data).
* Deduplicated: No
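To make the schema above concrete, here is a minimal sketch that checks a record against the two `str` columns and summarizes it. The helper names `validate_pair` and `summarize_pair` are hypothetical and are not part of the dataset or of Sentence Transformers; the record is the example shown in this card.

```python
REQUIRED_COLUMNS = ("comment", "code")


def validate_pair(record: dict) -> None:
    """Raise ValueError unless the record has string-valued 'comment' and 'code' fields."""
    for column in REQUIRED_COLUMNS:
        if not isinstance(record.get(column), str):
            raise ValueError(f"column {column!r} is missing or not a str")


def summarize_pair(record: dict) -> dict:
    """Return the first line of the comment and the number of lines in the code."""
    validate_pair(record)
    comment_lines = record["comment"].splitlines()
    return {
        "summary": comment_lines[0] if comment_lines else "",
        "code_lines": len(record["code"].splitlines()),
    }


# The example record from this card (a PHP method and its docblock text).
example = {
    'comment': 'Computes the new parent id for the node being moved.\n\n@return int',
    'code': "protected function parentId()\n\t{\n\t\tswitch ( $this->position )\n\t\t{\n\t\t\tcase 'root':\n\t\t\t\treturn null;\n\n\t\t\tcase 'child':\n\t\t\t\treturn $this->target->getKey();\n\n\t\t\tdefault:\n\t\t\t\treturn $this->target->getParentId();\n\t\t}\n\t}",
}

if __name__ == "__main__":
    print(summarize_pair(example))
```

Because the card reports that no deduplication was applied, a check like this could be extended to drop repeated pairs before training, if that matters for the use case.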
Provider:
maas
Created:
2025-01-06



