CoIR-Retrieval/codefeedback-st-queries-corpus
Source: Hugging Face · Updated 2024-09-12 · Indexed 2024-06-29
Download link:
https://hf-mirror.com/datasets/CoIR-Retrieval/codefeedback-st-queries-corpus
Resource description:
---
dataset_info:
  features:
  - name: _id
    dtype: string
  - name: partition
    dtype: string
  - name: text
    dtype: string
  - name: title
    dtype: string
  - name: language
    dtype: string
  - name: meta_information
    struct:
    - name: resource
      dtype: string
  splits:
  - name: queries
    num_bytes: 118682563
    num_examples: 156526
  - name: corpus
    num_bytes: 246229656
    num_examples: 156526
  download_size: 181151457
  dataset_size: 364912219
---
This is the version of the dataset used by the CoIR evaluation framework. Use the code below to run the evaluation:
```python
import coir
from coir.data_loader import get_tasks
from coir.evaluation import COIR
from coir.models import YourCustomDEModel

model_name = "intfloat/e5-base-v2"

# Load the model
model = YourCustomDEModel(model_name=model_name)

# Get tasks
# All tasks: ["codetrans-dl", "stackoverflow-qa", "apps", "codefeedback-mt", "codefeedback-st",
#             "codetrans-contest", "synthetic-text2sql", "cosqa", "codesearchnet", "codesearchnet-ccr"]
tasks = get_tasks(tasks=["codetrans-dl"])

# Initialize evaluation
evaluation = COIR(tasks=tasks, batch_size=128)

# Run evaluation
results = evaluation.run(model, output_folder=f"results/{model_name}")
print(results)
```
Provided by:
CoIR-Retrieval
Summary of original information
Dataset overview
Dataset name
codefeedback-st-queries-corpus
Dataset features
- _id: string
- partition: string
- text: string
- title: string
- language: string
- meta_information: struct
  - resource: string
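The feature list above can be turned into a small sanity check for individual records. The following is a minimal sketch only: the helper name `schema_errors` and the sample record's values are hypothetical and are not taken from the dataset or from the CoIR library.

```python
from typing import List

# Top-level fields that the feature list declares as strings.
STRING_FIELDS = ("_id", "partition", "text", "title", "language")


def schema_errors(record: dict) -> List[str]:
    """Return a list of problems found in one record (empty if it matches)."""
    errors = []
    for name in STRING_FIELDS:
        if not isinstance(record.get(name), str):
            errors.append(f"{name}: expected a string")
    # meta_information is a struct with a single string field, resource.
    meta = record.get("meta_information")
    if not isinstance(meta, dict) or not isinstance(meta.get("resource"), str):
        errors.append("meta_information.resource: expected a string")
    return errors


# Hypothetical record for illustration; the values are made up.
sample = {
    "_id": "q1",
    "partition": "train",
    "text": "How do I reverse a list in Python?",
    "title": "",
    "language": "python",
    "meta_information": {"resource": ""},
}

print(schema_errors(sample))
```

A check like this is cheap to run over a few rows after loading either split, since both splits share the same feature set.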
Dataset splits
- queries:
  - Bytes: 118682563
  - Examples: 156526
- corpus:
  - Bytes: 246229656
  - Examples: 156526
Dataset size
- Download size: 181151457 bytes
- Total dataset size: 364912219 bytes
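The size figures above are internally consistent, which a short calculation can confirm. This sketch uses only the numbers listed on this page; the variable names are illustrative.

```python
# Figures copied from the split and size listings on this page.
SPLITS = {
    "queries": {"num_bytes": 118_682_563, "num_examples": 156_526},
    "corpus": {"num_bytes": 246_229_656, "num_examples": 156_526},
}
DOWNLOAD_SIZE = 181_151_457
DATASET_SIZE = 364_912_219

# The total dataset size should equal the sum of the per-split byte counts.
total_bytes = sum(split["num_bytes"] for split in SPLITS.values())
assert total_bytes == DATASET_SIZE

# Average record size per split.
for name, split in SPLITS.items():
    average = split["num_bytes"] / split["num_examples"]
    print(f"{name}: {average:.0f} bytes per example on average")

# Ratio of the download size to the total dataset size.
ratio = DOWNLOAD_SIZE / DATASET_SIZE
print(f"download_size / dataset_size = {ratio:.1%}")
```

Both splits contain the same number of examples, so the difference in byte counts reflects corpus entries being larger on average than queries.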



