sentence-transformers/codesearchnet
Source: Hugging Face. Updated: 2024-04-30. Indexed: 2024-06-12.
Download link:
https://hf-mirror.com/datasets/sentence-transformers/codesearchnet
Resource description:
---
language:
- en
multilinguality:
- monolingual
size_categories:
- 1M<n<10M
task_categories:
- feature-extraction
- sentence-similarity
pretty_name: CodeSearchNet
tags:
- sentence-transformers
dataset_info:
  config_name: pair
  features:
  - name: comment
    dtype: string
  - name: code
    dtype: string
  splits:
  - name: train
    num_bytes: 1016891681
    num_examples: 1375067
  download_size: 491861063
  dataset_size: 1016891681
configs:
- config_name: pair
  data_files:
  - split: train
    path: pair/train-*
---
# Dataset Card for CodeSearchNet
This dataset is a collection of comment-code pairs drawn from various programming languages. See [code_search_net](https://huggingface.co/datasets/code_search_net) for additional information.
This dataset can be used directly with Sentence Transformers to train embedding models.
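As a rough illustration of that workflow, the sketch below loads the `pair` subset with the `datasets` library and fine-tunes a model with `SentenceTransformerTrainer`. It assumes `datasets` and `sentence-transformers` 3.0 or later are installed. The base model `microsoft/mpnet-base`, the `limit` of 10,000 rows, the `anchor`/`positive` column names, and the choice of `MultipleNegativesRankingLoss` are illustrative assumptions, not recommendations from the card.

```python
DATASET_ID = "sentence-transformers/codesearchnet"
SUBSET = "pair"


def to_anchor_positive(record):
    """Rename the card's columns to an (anchor, positive) pair.

    The names "anchor" and "positive" are an illustrative convention only.
    """
    return {"anchor": record["comment"], "positive": record["code"]}


def load_pairs(limit=None):
    """Load the train split of the `pair` subset, optionally keeping the first `limit` rows."""
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset(DATASET_ID, SUBSET, split="train")
    if limit is not None:
        dataset = dataset.select(range(min(limit, len(dataset))))
    return dataset.map(to_anchor_positive, remove_columns=["comment", "code"])


def train(model_name="microsoft/mpnet-base", limit=10_000):
    """Fine-tune a base model on comment-code pairs using in-batch negatives.

    `model_name` and `limit` are assumed values chosen for a quick experiment.
    """
    from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
    from sentence_transformers.losses import MultipleNegativesRankingLoss

    model = SentenceTransformer(model_name)
    trainer = SentenceTransformerTrainer(
        model=model,
        train_dataset=load_pairs(limit),
        loss=MultipleNegativesRankingLoss(model),
    )
    trainer.train()
    return model
```

Calling `train()` downloads the dataset and the base model, so nothing is executed at import time. A loss that relies on in-batch negatives is one plausible fit for data that only provides positive pairs; other pair losses would work with the same columns.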
## Dataset Subsets
### `pair` subset
* Columns: "comment", "code"
* Column types: `str`, `str`
* Examples:
```python
{
'comment': 'Computes the new parent id for the node being moved.\n\n@return int',
'code': "protected function parentId()\n\t{\n\t\tswitch ( $this->position )\n\t\t{\n\t\t\tcase 'root':\n\t\t\t\treturn null;\n\n\t\t\tcase 'child':\n\t\t\t\treturn $this->target->getKey();\n\n\t\t\tdefault:\n\t\t\t\treturn $this->target->getParentId();\n\t\t}\n\t}",
}
```
* Collection strategy: Reading the Code Search Net dataset from [embedding-training-data](https://huggingface.co/datasets/sentence-transformers/embedding-training-data).
* Deduplicated: No
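Because the subset is not deduplicated, repeated pairs may appear more than once. A minimal sketch of exact-match deduplication over records shaped like the example above follows. The helper name `deduplicate_pairs` is hypothetical, and exact string matching is an assumption; near-duplicate detection would need a different approach.

```python
def deduplicate_pairs(records):
    """Drop records whose (comment, code) pair was already seen.

    Keeps the first occurrence of each pair and preserves input order.
    """
    seen = set()
    unique = []
    for record in records:
        key = (record["comment"], record["code"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Tiny made-up sample, not taken from the dataset.
sample = [
    {"comment": "Returns the sum.", "code": "def add(a, b): return a + b"},
    {"comment": "Returns the sum.", "code": "def add(a, b): return a + b"},
    {"comment": "Returns the sum.", "code": "def plus(x, y): return x + y"},
]
unique_sample = deduplicate_pairs(sample)
```

Pairs that share a comment but differ in code are kept, since only the full (comment, code) tuple is used as the key.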
Provider:
sentence-transformers
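Summary of original information
Dataset overview
Basic information
- Name: CodeSearchNet
- Language: English
- Multilinguality: monolingual
- Size: 1M<n<10M
- Task categories: feature extraction, sentence similarity
- Tags: sentence-transformers
Dataset configuration
- Config name: pair
- Features:
  - comment: string
  - code: string
Dataset splits
- Train:
  - Size: 1016891681 bytes
  - Number of examples: 1375067
- Download size: 491861063 bytes
- Dataset size: 1016891681 bytes
Dataset subsets
- Subset name: pair
- Columns: "comment", "code"
- Column types: string, string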
- Example (Python dict): `{'comment': 'Computes the new parent id for the node being moved.\n\n@return int', 'code': "protected function parentId()\n\t{\n\t\tswitch ( $this->position )\n\t\t{\n\t\t\tcase 'root':\n\t\t\t\treturn null;\n\n\t\t\tcase 'child':\n\t\t\t\treturn $this->target->getKey();\n\n\t\t\tdefault:\n\t\t\t\treturn $this->target->getParentId();\n\t\t}\n\t}"}`
- Collection strategy: read the Code Search Net dataset from embedding-training-data
- Deduplicated: No
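
The split metadata above allows a quick sanity check. The sketch below uses only the figures stated in the card and derives the average uncompressed size per pair (roughly 740 bytes) and the ratio of download size to dataset size (roughly 0.48). The variable names are illustrative.

```python
# Figures copied from the dataset card's split metadata.
NUM_EXAMPLES = 1_375_067
DATASET_SIZE_BYTES = 1_016_891_681
DOWNLOAD_SIZE_BYTES = 491_861_063

# Average uncompressed size of one comment-code pair, in bytes.
avg_bytes_per_example = DATASET_SIZE_BYTES / NUM_EXAMPLES

# Fraction of the uncompressed size that has to be downloaded.
download_ratio = DOWNLOAD_SIZE_BYTES / DATASET_SIZE_BYTES

# The example count should fall inside the declared size category 1M<n<10M.
in_size_category = 1_000_000 < NUM_EXAMPLES < 10_000_000

print(f"average bytes per example: {avg_bytes_per_example:.1f}")
print(f"download / dataset size:   {download_ratio:.3f}")
print(f"within 1M<n<10M:           {in_size_category}")
```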



