iNeil77/the-vault-function
Source: Hugging Face. Updated 2024-01-15; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/iNeil77/the-vault-function
Resource description:
---
dataset_info:
- config_name: c
features:
- name: hexsha
dtype: string
- name: repo
dtype: string
- name: path
dtype: string
- name: license
sequence: string
- name: language
dtype: string
- name: identifier
dtype: string
- name: return_type
dtype: string
- name: original_string
dtype: string
- name: original_docstring
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: code
dtype: string
- name: code_tokens
sequence: string
- name: short_docstring
dtype: string
- name: short_docstring_tokens
sequence: string
- name: comment
sequence: string
- name: parameters
list:
- name: param
dtype: string
- name: type
dtype: string
- name: docstring_params
struct:
- name: returns
list:
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: type
dtype: string
- name: raises
list:
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: type
dtype: string
- name: params
list:
- name: identifier
dtype: string
- name: type
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: default
dtype: string
- name: is_optional
dtype: bool
- name: outlier_params
list:
- name: identifier
dtype: string
- name: type
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: default
dtype: string
- name: is_optional
dtype: bool
- name: others
list:
- name: identifier
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
splits:
- name: train
num_bytes: 1618612526
num_examples: 381207
- name: validation
num_bytes: 118163214
num_examples: 27525
- name: test
num_bytes: 82244493
num_examples: 19122
download_size: 601549243
dataset_size: 1819020233
- config_name: cpp
features:
- name: hexsha
dtype: string
- name: repo
dtype: string
- name: path
dtype: string
- name: license
sequence: string
- name: language
dtype: string
- name: identifier
dtype: string
- name: return_type
dtype: string
- name: original_string
dtype: string
- name: original_docstring
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: code
dtype: string
- name: code_tokens
sequence: string
- name: short_docstring
dtype: string
- name: short_docstring_tokens
sequence: string
- name: comment
sequence: string
- name: parameters
list:
- name: param
dtype: string
- name: type
dtype: string
- name: docstring_params
struct:
- name: returns
list:
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: type
dtype: string
- name: raises
list:
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: type
dtype: string
- name: params
list:
- name: identifier
dtype: string
- name: type
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: default
dtype: string
- name: is_optional
dtype: bool
- name: outlier_params
list:
- name: identifier
dtype: string
- name: type
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: default
dtype: string
- name: is_optional
dtype: bool
- name: others
list:
- name: identifier
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
splits:
- name: train
num_bytes: 1745583444
num_examples: 410907
- name: validation
num_bytes: 85254767
num_examples: 20011
- name: test
num_bytes: 71686667
num_examples: 18169
download_size: 617392067
dataset_size: 1902524878
- config_name: go
features:
- name: hexsha
dtype: string
- name: repo
dtype: string
- name: path
dtype: string
- name: license
sequence: string
- name: language
dtype: string
- name: identifier
dtype: string
- name: return_type
dtype: string
- name: original_string
dtype: string
- name: original_docstring
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: code
dtype: string
- name: code_tokens
sequence: string
- name: short_docstring
dtype: string
- name: short_docstring_tokens
sequence: string
- name: comment
sequence: string
- name: parameters
list:
- name: param
dtype: string
- name: type
dtype: string
- name: docstring_params
struct:
- name: returns
list:
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: type
dtype: string
- name: raises
list:
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: type
dtype: string
- name: params
list:
- name: identifier
dtype: string
- name: type
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: default
dtype: string
- name: is_optional
dtype: bool
- name: outlier_params
list:
- name: identifier
dtype: string
- name: type
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: default
dtype: string
- name: is_optional
dtype: bool
- name: others
list:
- name: identifier
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
splits:
- name: train
num_bytes: 3717971602
num_examples: 1319547
- name: validation
num_bytes: 50699286
num_examples: 19102
- name: test
num_bytes: 71810505
num_examples: 25314
download_size: 1052043326
dataset_size: 3840481393
- config_name: python
features:
- name: hexsha
dtype: string
- name: repo
dtype: string
- name: path
dtype: string
- name: license
sequence: string
- name: language
dtype: string
- name: identifier
dtype: string
- name: return_type
dtype: string
- name: original_string
dtype: string
- name: original_docstring
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: code
dtype: string
- name: code_tokens
sequence: string
- name: short_docstring
dtype: string
- name: short_docstring_tokens
sequence: string
- name: comment
sequence: string
- name: parameters
list:
- name: param
dtype: string
- name: type
dtype: string
- name: docstring_params
struct:
- name: returns
list:
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: type
dtype: string
- name: raises
list:
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: type
dtype: string
- name: params
list:
- name: identifier
dtype: string
- name: type
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: default
dtype: string
- name: is_optional
dtype: bool
- name: outlier_params
list:
- name: identifier
dtype: string
- name: type
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: default
dtype: string
- name: is_optional
dtype: bool
- name: others
list:
- name: identifier
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
splits:
- name: train
num_bytes: 8545493683
num_examples: 1952110
- name: validation
num_bytes: 110572316
num_examples: 30992
- name: test
num_bytes: 94502917
num_examples: 21652
download_size: 2953145655
dataset_size: 8750568916
- config_name: ruby
features:
- name: hexsha
dtype: string
- name: repo
dtype: string
- name: path
dtype: string
- name: license
sequence: string
- name: language
dtype: string
- name: identifier
dtype: string
- name: return_type
dtype: string
- name: original_string
dtype: string
- name: original_docstring
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: code
dtype: string
- name: code_tokens
sequence: string
- name: short_docstring
dtype: string
- name: short_docstring_tokens
sequence: string
- name: comment
sequence: string
- name: parameters
list:
- name: param
dtype: string
- name: type
dtype: string
- name: docstring_params
struct:
- name: returns
list:
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: type
dtype: string
- name: raises
list:
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: type
dtype: string
- name: params
list:
- name: identifier
dtype: string
- name: type
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: default
dtype: string
- name: is_optional
dtype: bool
- name: outlier_params
list:
- name: identifier
dtype: string
- name: type
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: default
dtype: string
- name: is_optional
dtype: bool
- name: others
list:
- name: identifier
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
splits:
- name: train
num_bytes: 358470286
num_examples: 112574
- name: validation
num_bytes: 51183541
num_examples: 17338
- name: test
num_bytes: 64582951
num_examples: 19908
download_size: 157505004
dataset_size: 474236778
- config_name: rust
features:
- name: hexsha
dtype: string
- name: repo
dtype: string
- name: path
dtype: string
- name: license
sequence: string
- name: language
dtype: string
- name: identifier
dtype: string
- name: return_type
dtype: string
- name: original_string
dtype: string
- name: original_docstring
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: code
dtype: string
- name: code_tokens
sequence: string
- name: short_docstring
dtype: string
- name: short_docstring_tokens
sequence: string
- name: comment
sequence: string
- name: parameters
list:
- name: param
dtype: string
- name: type
dtype: string
- name: docstring_params
struct:
- name: returns
list:
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: type
dtype: string
- name: raises
list:
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: type
dtype: string
- name: params
list:
- name: identifier
dtype: string
- name: type
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: default
dtype: string
- name: is_optional
dtype: bool
- name: outlier_params
list:
- name: identifier
dtype: string
- name: type
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
- name: default
dtype: string
- name: is_optional
dtype: bool
- name: others
list:
- name: identifier
dtype: string
- name: docstring
dtype: string
- name: docstring_tokens
sequence: string
splits:
- name: train
num_bytes: 730827968
num_examples: 224015
- name: validation
num_bytes: 60404939
num_examples: 16716
- name: test
num_bytes: 87319651
num_examples: 23141
download_size: 279796696
dataset_size: 878552558
configs:
- config_name: c
data_files:
- split: train
path: c/train-*
- split: validation
path: c/validation-*
- split: test
path: c/test-*
- config_name: cpp
data_files:
- split: train
path: cpp/train-*
- split: validation
path: cpp/validation-*
- split: test
path: cpp/test-*
- config_name: go
data_files:
- split: train
path: go/train-*
- split: validation
path: go/validation-*
- split: test
path: go/test-*
- config_name: python
data_files:
- split: train
path: python/train-*
- split: validation
path: python/validation-*
- split: test
path: python/test-*
- config_name: ruby
data_files:
- split: train
path: ruby/train-*
- split: validation
path: ruby/validation-*
- split: test
path: ruby/test-*
- config_name: rust
data_files:
- split: train
path: rust/train-*
- split: validation
path: rust/validation-*
- split: test
path: rust/test-*
---
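The `configs` block above maps each language configuration and split to a file pattern of the form `<config>/<split>-*`. The following is a minimal sketch of how one split might be loaded, assuming the third-party `datasets` library is installed and the repository is reachable. The helper names `data_file_pattern` and `load_vault_split` are hypothetical and are not part of the dataset card.

```python
# Configuration and split names copied from the metadata above.
CONFIGS = ("c", "cpp", "go", "python", "ruby", "rust")
SPLITS = ("train", "validation", "test")


def data_file_pattern(config: str, split: str) -> str:
    """Return the file pattern the card declares for a config/split pair."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    if split not in SPLITS:
        raise ValueError(f"unknown split: {split!r}")
    return f"{config}/{split}-*"


def load_vault_split(config: str, split: str, streaming: bool = True):
    """Hypothetical helper: load one split through the `datasets` library.

    Streaming is the default here because the largest train split is
    several gigabytes; this default is an assumption of the sketch.
    """
    data_file_pattern(config, split)  # validates the names, raises on typos
    from datasets import load_dataset  # third-party, imported lazily

    return load_dataset(
        "iNeil77/the-vault-function", config, split=split, streaming=streaming
    )


print(data_file_pattern("python", "validation"))
```

Calling `load_vault_split("ruby", "test")` would then return an iterable of records with the fields listed under `features`, provided the network is available.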
Provider:
iNeil77
Summary of original information
Dataset overview
Dataset configurations
C
- Features:
  hexsha: string; repo: string; path: string; license: sequence of string; language: string; identifier: string; return_type: string; original_string: string; original_docstring: string; docstring: string; docstring_tokens: sequence of string; code: string; code_tokens: sequence of string; short_docstring: string; short_docstring_tokens: sequence of string; comment: sequence of string
  parameters: list of (param: string; type: string)
  docstring_params: struct of
    returns: list of (docstring: string; docstring_tokens: sequence of string; type: string)
    raises: list of (docstring: string; docstring_tokens: sequence of string; type: string)
    params: list of (identifier: string; type: string; docstring: string; docstring_tokens: sequence of string; default: string; is_optional: bool)
    outlier_params: list of (identifier: string; type: string; docstring: string; docstring_tokens: sequence of string; default: string; is_optional: bool)
    others: list of (identifier: string; docstring: string; docstring_tokens: sequence of string)
- Splits:
  train: 1618612526 bytes, 381207 examples
  validation: 118163214 bytes, 27525 examples
  test: 82244493 bytes, 19122 examples
- Download size: 601549243 bytes
- Dataset size: 1819020233 bytes
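The C figures above can be cross-checked with a little arithmetic. The sketch below confirms that the three split sizes add up to the reported dataset size, and derives the share of examples per split and the download-to-dataset size ratio. The function name `summarize` is hypothetical. Reading the download size as the compressed on-disk files and the dataset size as the uncompressed data is an assumption based on how Hugging Face cards usually report these fields.

```python
# Figures copied from the C configuration above.
C_SPLITS = {
    "train": {"num_bytes": 1618612526, "num_examples": 381207},
    "validation": {"num_bytes": 118163214, "num_examples": 27525},
    "test": {"num_bytes": 82244493, "num_examples": 19122},
}
C_DOWNLOAD_SIZE = 601549243
C_DATASET_SIZE = 1819020233


def summarize(splits, download_size, dataset_size):
    """Cross-check split sizes and derive a few ratios."""
    total_bytes = sum(s["num_bytes"] for s in splits.values())
    total_examples = sum(s["num_examples"] for s in splits.values())
    return {
        # True when the split byte counts add up to the reported total.
        "bytes_match": total_bytes == dataset_size,
        "total_examples": total_examples,
        # Fraction of all examples that falls into each split.
        "split_share": {
            name: s["num_examples"] / total_examples
            for name, s in splits.items()
        },
        # Download size relative to dataset size.
        "download_ratio": download_size / dataset_size,
    }


summary = summarize(C_SPLITS, C_DOWNLOAD_SIZE, C_DATASET_SIZE)
for key, value in summary.items():
    print(key, value)
```

The same function applies unchanged to the other five configurations once their figures are substituted.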
C++
- Features:
  hexsha: string; repo: string; path: string; license: sequence of string; language: string; identifier: string; return_type: string; original_string: string; original_docstring: string; docstring: string; docstring_tokens: sequence of string; code: string; code_tokens: sequence of string; short_docstring: string; short_docstring_tokens: sequence of string; comment: sequence of string
  parameters: list of (param: string; type: string)
  docstring_params: struct of
    returns: list of (docstring: string; docstring_tokens: sequence of string; type: string)
    raises: list of (docstring: string; docstring_tokens: sequence of string; type: string)
    params: list of (identifier: string; type: string; docstring: string; docstring_tokens: sequence of string; default: string; is_optional: bool)
    outlier_params: list of (identifier: string; type: string; docstring: string; docstring_tokens: sequence of string; default: string; is_optional: bool)
    others: list of (identifier: string; docstring: string; docstring_tokens: sequence of string)
- Splits:
  train: 1745583444 bytes, 410907 examples
  validation: 85254767 bytes, 20011 examples
  test: 71686667 bytes, 18169 examples
- Download size: 617392067 bytes
- Dataset size: 1902524878 bytes
Go
- Features:
  hexsha: string; repo: string; path: string; license: sequence of string; language: string; identifier: string; return_type: string; original_string: string; original_docstring: string; docstring: string; docstring_tokens: sequence of string; code: string; code_tokens: sequence of string; short_docstring: string; short_docstring_tokens: sequence of string; comment: sequence of string
  parameters: list of (param: string; type: string)
  docstring_params: struct of
    returns: list of (docstring: string; docstring_tokens: sequence of string; type: string)
    raises: list of (docstring: string; docstring_tokens: sequence of string; type: string)
    params: list of (identifier: string; type: string; docstring: string; docstring_tokens: sequence of string; default: string; is_optional: bool)
    outlier_params: list of (identifier: string; type: string; docstring: string; docstring_tokens: sequence of string; default: string; is_optional: bool)
    others: list of (identifier: string; docstring: string; docstring_tokens: sequence of string)
- Splits:
  train: 3717971602 bytes, 1319547 examples
  validation: 50699286 bytes, 19102 examples
  test: 71810505 bytes, 25314 examples
- Download size: 1052043326 bytes
- Dataset size: 3840481393 bytes
Python
- Features:
  hexsha: string; repo: string; path: string; license: sequence of string; language: string; identifier: string; return_type: string; original_string: string; original_docstring: string; docstring: string; docstring_tokens: sequence of string; code: string; code_tokens: sequence of string; short_docstring: string; short_docstring_tokens: sequence of string; comment: sequence of string
  parameters: list of (param: string; type: string)
  docstring_params: struct of
    returns: list of (docstring: string; docstring_tokens: sequence of string; type: string)
    raises: list of (docstring: string; docstring_tokens: sequence of string; type: string)
    params: list of (identifier: string; type: string; docstring: string; docstring_tokens: sequence of string; default: string; is_optional: bool)
    outlier_params: list of (identifier: string; type: string; docstring: string; docstring_tokens: sequence of string; default: string; is_optional: bool)
    others: list of (identifier: string; docstring: string; docstring_tokens: sequence of string)
- Splits:
  train: 8545493683 bytes, 1952110 examples
  validation: 110572316 bytes, 30992 examples
  test: 94502917 bytes, 21652 examples
- Download size: 2953145655 bytes
- Dataset size: 8750568916 bytes
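The nested `docstring_params` field is the least obvious part of the schema, so a short sketch of how it might be traversed may help. The record below is a hypothetical, hand-written example shaped after the feature list (it is not taken from the dataset), and the helper `documented_params` is likewise an illustration. Missing or empty values are treated defensively, since the card does not say which fields may be null.

```python
from typing import Any


def documented_params(record: dict[str, Any]) -> list[str]:
    """Return 'name: type' strings for each entry in docstring_params.params."""
    doc = record.get("docstring_params") or {}
    described = []
    for param in doc.get("params") or []:
        name = param.get("identifier") or "?"
        type_name = param.get("type") or "unknown"
        described.append(f"{name}: {type_name}")
    return described


# Hypothetical record; only the fields used by this sketch are filled in.
sample = {
    "identifier": "add",
    "language": "python",
    "code": "def add(x, y):\n    return x + y",
    "docstring_params": {
        "returns": [{"docstring": "the sum", "docstring_tokens": ["the", "sum"], "type": "int"}],
        "raises": [],
        "params": [
            {"identifier": "x", "type": "int", "docstring": "first addend",
             "docstring_tokens": ["first", "addend"], "default": None, "is_optional": False},
            {"identifier": "y", "type": "int", "docstring": "second addend",
             "docstring_tokens": ["second", "addend"], "default": None, "is_optional": False},
        ],
        "outlier_params": [],
        "others": [],
    },
}

print(documented_params(sample))
```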
Ruby
- Features:
  hexsha: string; repo: string; path: string; license: sequence of string; language: string; identifier: string; return_type: string; original_string: string; original_docstring: string; docstring: string; docstring_tokens: sequence of string; code: string; code_tokens: sequence of string; short_docstring: string; short_docstring_tokens: sequence of string; comment: sequence of string
  parameters: list of (param: string; type: string)
  docstring_params: struct of
    returns: list of (docstring: string; docstring_tokens: sequence of string; type: string)
    raises: list of (docstring: string; docstring_tokens: sequence of string; type: string)
    params: list of (identifier: string; type: string; docstring: string; docstring_tokens: sequence of string; default: string; is_optional: bool)
    outlier_params: list of (identifier: string; type: string; docstring: string; docstring_tokens: sequence of string; default: string; is_optional: bool)
    others: list of (identifier: string; docstring: string; docstring_tokens: sequence of string)
- Splits:
  train: 358470286 bytes, 112574 examples
  validation: 51183541 bytes, 17338 examples
  test: 64582951 bytes, 19908 examples
- Download size: 157505004 bytes
- Dataset size: 474236778 bytes
Rust
- Features:
  hexsha: string; repo: string; path: string; license: sequence of string; language: string; identifier: string; return_type: string; original_string: string; original_docstring: string; docstring: string; docstring_tokens: sequence of string; code: string; code_tokens: sequence of string; short_docstring: string; short_docstring_tokens: sequence of string; comment: sequence of string
  parameters: list of (param: string; type: string)
  docstring_params: struct of
    returns: list of (docstring: string; docstring_tokens: sequence of string; type: string)
    raises: list of (docstring: string; docstring_tokens: sequence of string; type: string)
    params: list of (identifier: string; type: string; docstring: string; docstring_tokens: sequence of string; default: string; is_optional: bool)
    outlier_params: list of (identifier: string; type: string; docstring: string; docstring_tokens: sequence of string; default: string; is_optional: bool)
    others: list of (identifier: string; docstring: string; docstring_tokens: sequence of string)
- Splits:
  train: 730827968 bytes, 224015 examples
  validation: 60404939 bytes, 16716 examples
  test: 87319651 bytes, 23141 examples
- Download size: 279796696 bytes
- Dataset size: 878552558 bytes
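With all six configurations listed, the per-split example counts can be aggregated to see how the data is distributed across languages. The sketch below uses only the example counts given above. The variable names are hypothetical, and the derived totals are computed here rather than stated anywhere in the original card.

```python
# Example counts per configuration, copied from the split listings above.
EXAMPLES = {
    "c": {"train": 381207, "validation": 27525, "test": 19122},
    "cpp": {"train": 410907, "validation": 20011, "test": 18169},
    "go": {"train": 1319547, "validation": 19102, "test": 25314},
    "python": {"train": 1952110, "validation": 30992, "test": 21652},
    "ruby": {"train": 112574, "validation": 17338, "test": 19908},
    "rust": {"train": 224015, "validation": 16716, "test": 23141},
}

# Total examples per split, summed over all configurations.
split_totals = {
    split: sum(counts[split] for counts in EXAMPLES.values())
    for split in ("train", "validation", "test")
}

# Total examples per configuration, summed over its three splits.
config_totals = {name: sum(counts.values()) for name, counts in EXAMPLES.items()}

grand_total = sum(config_totals.values())
largest_config = max(config_totals, key=config_totals.get)

print(split_totals)
print(config_totals)
print(grand_total, largest_config)
```

A breakdown like this is useful when deciding whether to sample configurations uniformly or in proportion to their size, since the train splits differ by more than an order of magnitude.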

The content above was collected and summarized by 遇见数据集.



