CronosGhost/code-reranking
Source: Hugging Face. Updated: 2024-03-20. Indexed: 2024-06-22.
Download link:
https://hf-mirror.com/datasets/CronosGhost/code-reranking
Resource description:
---
license: mit
dataset_info:
- config_name: CodeLangQueries
  features:
  - name: query
    dtype: string
  - name: positive
    sequence: string
  - name: negative
    sequence: string
  splits:
  - name: train
    num_bytes: 23150542.5
    num_examples: 9900
  - name: test
    num_bytes: 2572282.5
    num_examples: 1100
  download_size: 10367838
  dataset_size: 25722825.0
- config_name: CodeLangQueries-MachineGeneratedDocs
  features:
  - name: query
    dtype: string
  - name: positive
    dtype: string
  - name: negative
    sequence: string
  splits:
  - name: train
    num_bytes: 373862.7
    num_examples: 495
  - name: test
    num_bytes: 41540.3
    num_examples: 55
  download_size: 166214
  dataset_size: 415403.0
- config_name: NaturalLangQueries
  features:
  - name: query
    dtype: string
  - name: positive
    sequence: string
  - name: negative
    sequence: string
  splits:
  - name: train
    num_bytes: 62984485.8
    num_examples: 9900
  - name: test
    num_bytes: 6998276.2
    num_examples: 1100
  download_size: 29469643
  dataset_size: 69982762.0
- config_name: default
  features:
  - name: query
    dtype: string
  - name: positive
    sequence: string
  - name: negative
    sequence: string
  splits:
  - name: train
    num_bytes: 23176584.9
    num_examples: 9900
  - name: test
    num_bytes: 2575176.1
    num_examples: 1100
  download_size: 10376964
  dataset_size: 25751761.0
configs:
- config_name: CodeLangQueries
  data_files:
  - split: train
    path: CodeLangQueries/train-*
  - split: test
    path: CodeLangQueries/test-*
- config_name: CodeLangQueries-MachineGeneratedDocs
  data_files:
  - split: train
    path: CodeLangQueries-MachineGeneratedDocs/train-*
  - split: test
    path: CodeLangQueries-MachineGeneratedDocs/test-*
- config_name: NaturalLangQueries
  data_files:
  - split: train
    path: NaturalLangQueries/train-*
  - split: test
    path: NaturalLangQueries/test-*
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
---
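The metadata above declares `positive` as a sequence of strings in three configurations but as a single string in CodeLangQueries-MachineGeneratedDocs. The following minimal sketch shows one way to handle both shapes when flattening a row into labeled (query, document) pairs for a reranker. The helper names and the sample rows are hypothetical and invented for illustration; the labels 1 and 0 are an assumed convention, not something the card specifies.

```python
from typing import List, Sequence, Tuple, Union


def as_list(value: Union[str, Sequence[str]]) -> List[str]:
    """Return a list whether the field holds one string or a sequence of strings."""
    if isinstance(value, str):
        return [value]
    return list(value)


def to_labeled_pairs(row: dict) -> List[Tuple[str, str, int]]:
    """Flatten one row into (query, document, label) triples.

    Assumed convention: label 1 for positives, 0 for negatives.
    """
    query = row["query"]
    pairs = [(query, doc, 1) for doc in as_list(row["positive"])]
    pairs += [(query, doc, 0) for doc in as_list(row["negative"])]
    return pairs


# Hypothetical rows that only mirror the declared field types.
sequence_positive_row = {
    "query": "reverse a list",
    "positive": ["xs[::-1]", "list(reversed(xs))"],
    "negative": ["sorted(xs)"],
}
string_positive_row = {
    "query": "reverse a list",
    "positive": "Returns the elements in reverse order.",
    "negative": ["Sorts the elements in ascending order."],
}
```

Normalizing the field in one place keeps downstream training code identical across all four configurations.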
Provider:
CronosGhost
Summary of original information
Dataset overview
Dataset configurations
CodeLangQueries
- Features:
  - query: string
  - positive: sequence of string
  - negative: sequence of string
- Splits:
  - train: 23150542.5 bytes, 9900 examples
  - test: 2572282.5 bytes, 1100 examples
- Download size: 10367838 bytes
- Dataset size: 25722825.0 bytes
CodeLangQueries-MachineGeneratedDocs
- Features:
  - query: string
  - positive: string
  - negative: sequence of string
- Splits:
  - train: 373862.7 bytes, 495 examples
  - test: 41540.3 bytes, 55 examples
- Download size: 166214 bytes
- Dataset size: 415403.0 bytes
NaturalLangQueries
- Features:
  - query: string
  - positive: sequence of string
  - negative: sequence of string
- Splits:
  - train: 62984485.8 bytes, 9900 examples
  - test: 6998276.2 bytes, 1100 examples
- Download size: 29469643 bytes
- Dataset size: 69982762.0 bytes
default
- Features:
  - query: string
  - positive: sequence of string
  - negative: sequence of string
- Splits:
  - train: 23176584.9 bytes, 9900 examples
  - test: 2575176.1 bytes, 1100 examples
- Download size: 10376964 bytes
- Dataset size: 25751761.0 bytes
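The split figures above follow a regular pattern: every configuration has a 90/10 train/test split by example count, and the fractional byte counts equal the dataset size multiplied by each split's share of examples. The sketch below checks both observations using only the numbers listed in the card. That the byte counts were prorated rather than measured is an inference from the arithmetic, not something the card states; the function names are illustrative.

```python
import math

# (train_bytes, train_examples, test_bytes, test_examples, dataset_size), copied from the card.
CONFIGS = {
    "CodeLangQueries": (23150542.5, 9900, 2572282.5, 1100, 25722825.0),
    "CodeLangQueries-MachineGeneratedDocs": (373862.7, 495, 41540.3, 55, 415403.0),
    "NaturalLangQueries": (62984485.8, 9900, 6998276.2, 1100, 69982762.0),
    "default": (23176584.9, 9900, 2575176.1, 1100, 25751761.0),
}


def train_fraction(name: str) -> float:
    """Share of examples that fall in the train split."""
    _, train_n, _, test_n, _ = CONFIGS[name]
    return train_n / (train_n + test_n)


def bytes_are_prorated(name: str) -> bool:
    """True if each split's byte count equals dataset_size times its example share."""
    train_b, train_n, test_b, test_n, total_b = CONFIGS[name]
    total_n = train_n + test_n
    return (
        math.isclose(train_b, total_b * train_n / total_n, rel_tol=1e-9)
        and math.isclose(test_b, total_b * test_n / total_n, rel_tol=1e-9)
    )


for config_name in CONFIGS:
    print(config_name, round(train_fraction(config_name), 3), bytes_are_prorated(config_name))
```

`math.isclose` is used instead of `==` because values such as 373862.7 are not exactly representable as binary floating-point numbers.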
Data file paths
CodeLangQueries
- Train: CodeLangQueries/train-*
- Test: CodeLangQueries/test-*
CodeLangQueries-MachineGeneratedDocs
- Train: CodeLangQueries-MachineGeneratedDocs/train-*
- Test: CodeLangQueries-MachineGeneratedDocs/test-*
NaturalLangQueries
- Train: NaturalLangQueries/train-*
- Test: NaturalLangQueries/test-*
default
- Train: data/train-*
- Test: data/test-*
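Each configuration maps its splits to a glob pattern, and the `default` configuration reads from `data/` rather than from a directory named after itself. As a rough illustration, the sketch below resolves a repository-relative file path to its configuration and split using the patterns listed above. The shard file names in the usage lines are hypothetical, since the card only gives the `train-*` and `test-*` patterns, and `locate` is an illustrative helper rather than part of any library.

```python
from fnmatch import fnmatchcase
from typing import Optional, Tuple

# Glob patterns copied from the card's `configs` section.
DATA_FILES = {
    "CodeLangQueries": {
        "train": "CodeLangQueries/train-*",
        "test": "CodeLangQueries/test-*",
    },
    "CodeLangQueries-MachineGeneratedDocs": {
        "train": "CodeLangQueries-MachineGeneratedDocs/train-*",
        "test": "CodeLangQueries-MachineGeneratedDocs/test-*",
    },
    "NaturalLangQueries": {
        "train": "NaturalLangQueries/train-*",
        "test": "NaturalLangQueries/test-*",
    },
    "default": {"train": "data/train-*", "test": "data/test-*"},
}


def locate(path: str) -> Optional[Tuple[str, str]]:
    """Return the (config, split) whose pattern matches `path`, or None."""
    for config, splits in DATA_FILES.items():
        for split, pattern in splits.items():
            # fnmatchcase keeps matching case-sensitive on every platform.
            if fnmatchcase(path, pattern):
                return config, split
    return None


# Hypothetical shard names, used only to exercise the patterns.
print(locate("data/train-00000-of-00001.parquet"))
print(locate("CodeLangQueries-MachineGeneratedDocs/test-00000-of-00001.parquet"))
```

When using the Hugging Face `datasets` library, this mapping is applied automatically: `load_dataset("CronosGhost/code-reranking", "NaturalLangQueries", split="test")` selects the configuration by name and the split by its declared pattern.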



