CoIR-Retrieval/apps-queries-corpus
Source: Hugging Face | Updated: 2024-09-12 | Indexed: 2024-06-29
Download link:
https://hf-mirror.com/datasets/CoIR-Retrieval/apps-queries-corpus
Resource description:
---
dataset_info:
  features:
  - name: _id
    dtype: string
  - name: partition
    dtype: string
  - name: text
    dtype: string
  - name: language
    dtype: string
  - name: meta_information
    struct:
    - name: starter_code
      dtype: string
    - name: url
      dtype: string
  - name: title
    dtype: string
  splits:
  - name: queries
    num_bytes: 13633677
    num_examples: 8765
  - name: corpus
    num_bytes: 6044437
    num_examples: 8765
  download_size: 0
  dataset_size: 19678114
---
This is the version of the dataset used by the CoIR evaluation framework. Use the code below to run an evaluation:
```python
import coir
from coir.data_loader import get_tasks
from coir.evaluation import COIR
from coir.models import YourCustomDEModel
model_name = "intfloat/e5-base-v2"
# Load the model
model = YourCustomDEModel(model_name=model_name)
# Get tasks
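# All available tasks: ["codetrans-dl", "stackoverflow-qa", "apps", "codefeedback-mt", "codefeedback-st",
#   "codetrans-contest", "synthetic-text2sql", "cosqa", "codesearchnet", "codesearchnet-ccr"]
# This example runs "codetrans-dl"; pass another name from the list above (such as "apps") to change the task.
tasks = get_tasks(tasks=["codetrans-dl"])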
# Initialize evaluation
evaluation = COIR(tasks=tasks, batch_size=128)
# Run evaluation
results = evaluation.run(model, output_folder=f"results/{model_name}")
print(results)
```
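To inspect the raw data outside the CoIR framework, the two splits can probably be loaded directly with the Hugging Face `datasets` library. The following is a minimal sketch, not part of the original card: it assumes `datasets` is installed, that the Hub is reachable, and that the repository ID and split names match the ones listed in the metadata above. The helper name `load_apps_splits` is hypothetical.

```python
from typing import Any

# Repository ID and split names as they appear in this card.
REPO_ID = "CoIR-Retrieval/apps-queries-corpus"
SPLITS = ("queries", "corpus")


def load_apps_splits(repo_id: str = REPO_ID) -> dict[str, Any]:
    """Load each split listed in SPLITS and return them keyed by split name.

    Requires `pip install datasets` and network access to the Hub.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return {name: load_dataset(repo_id, split=name) for name in SPLITS}
```

Calling `load_apps_splits()` should return one `Dataset` per split; `ds.num_rows` and `ds.column_names` on each can then be compared with the example counts and feature names given in the metadata.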
Provided by:
CoIR-Retrieval
Original information summary
Dataset overview
Dataset name
apps-queries-corpus
Dataset features
- _id: string
- partition: string
- text: string
- language: string
- meta_information: struct
  - starter_code: string
  - url: string
- title: string
Dataset splits
- queries:
  - Bytes: 13633677
  - Examples: 8765
- corpus:
  - Bytes: 6044437
  - Examples: 8765
Dataset size
- Download size: 0
- Dataset size: 19678114
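
The feature list above can be turned into a small sanity check for individual records. The sketch below is illustrative only. It assumes that `title` is a top-level field and that `meta_information` holds `starter_code` and `url`, which is one reading of the flattened metadata. It also treats missing or `None` values as errors, which may be stricter than the real data. The function name `schema_errors` and the sample record are hypothetical.

```python
from typing import Any

# Field names taken from the feature list; the nesting is an assumption.
TOP_LEVEL_STRING_FIELDS = ("_id", "partition", "text", "language", "title")
META_STRING_FIELDS = ("starter_code", "url")


def schema_errors(record: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    errors: list[str] = []
    for name in TOP_LEVEL_STRING_FIELDS:
        if not isinstance(record.get(name), str):
            errors.append(f"{name}: expected a string")
    meta = record.get("meta_information")
    if not isinstance(meta, dict):
        errors.append("meta_information: expected a struct (dict)")
    else:
        for name in META_STRING_FIELDS:
            if not isinstance(meta.get(name), str):
                errors.append(f"meta_information.{name}: expected a string")
    return errors


# Hypothetical record, made up for illustration (not taken from the dataset).
example = {
    "_id": "q0",
    "partition": "train",
    "text": "Write a function that returns the sum of two integers.",
    "language": "text",
    "title": "",
    "meta_information": {"starter_code": "", "url": "https://example.com/problem"},
}

print(schema_errors(example))
```

Running the check over a handful of rows from each split is a quick way to confirm that a local copy matches the layout described in this card.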



