
iNeil77/the-vault-function

Hugging Face | Updated 2024-01-15 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/iNeil77/the-vault-function
Resource description:
---
dataset_info:
- config_name: c
  features:
  - name: hexsha
    dtype: string
  - name: repo
    dtype: string
  - name: path
    dtype: string
  - name: license
    sequence: string
  - name: language
    dtype: string
  - name: identifier
    dtype: string
  - name: return_type
    dtype: string
  - name: original_string
    dtype: string
  - name: original_docstring
    dtype: string
  - name: docstring
    dtype: string
  - name: docstring_tokens
    sequence: string
  - name: code
    dtype: string
  - name: code_tokens
    sequence: string
  - name: short_docstring
    dtype: string
  - name: short_docstring_tokens
    sequence: string
  - name: comment
    sequence: string
  - name: parameters
    list:
    - name: param
      dtype: string
    - name: type
      dtype: string
  - name: docstring_params
    struct:
    - name: returns
      list:
      - name: docstring
        dtype: string
      - name: docstring_tokens
        sequence: string
      - name: type
        dtype: string
    - name: raises
      list:
      - name: docstring
        dtype: string
      - name: docstring_tokens
        sequence: string
      - name: type
        dtype: string
    - name: params
      list:
      - name: identifier
        dtype: string
      - name: type
        dtype: string
      - name: docstring
        dtype: string
      - name: docstring_tokens
        sequence: string
      - name: default
        dtype: string
      - name: is_optional
        dtype: bool
    - name: outlier_params
      list:
      - name: identifier
        dtype: string
      - name: type
        dtype: string
      - name: docstring
        dtype: string
      - name: docstring_tokens
        sequence: string
      - name: default
        dtype: string
      - name: is_optional
        dtype: bool
    - name: others
      list:
      - name: identifier
        dtype: string
      - name: docstring
        dtype: string
      - name: docstring_tokens
        sequence: string
  splits:
  - name: train
    num_bytes: 1618612526
    num_examples: 381207
  - name: validation
    num_bytes: 118163214
    num_examples: 27525
  - name: test
    num_bytes: 82244493
    num_examples: 19122
  download_size: 601549243
  dataset_size: 1819020233
- config_name: cpp
  # features: same schema as config c above
  splits:
  - name: train
    num_bytes: 1745583444
    num_examples: 410907
  - name: validation
    num_bytes: 85254767
    num_examples: 20011
  - name: test
    num_bytes: 71686667
    num_examples: 18169
  download_size: 617392067
  dataset_size: 1902524878
- config_name: go
  # features: same schema as config c above
  splits:
  - name: train
    num_bytes: 3717971602
    num_examples: 1319547
  - name: validation
    num_bytes: 50699286
    num_examples: 19102
  - name: test
    num_bytes: 71810505
    num_examples: 25314
  download_size: 1052043326
  dataset_size: 3840481393
- config_name: python
  # features: same schema as config c above
  splits:
  - name: train
    num_bytes: 8545493683
    num_examples: 1952110
  - name: validation
    num_bytes: 110572316
    num_examples: 30992
  - name: test
    num_bytes: 94502917
    num_examples: 21652
  download_size: 2953145655
  dataset_size: 8750568916
- config_name: ruby
  # features: same schema as config c above
  splits:
  - name: train
    num_bytes: 358470286
    num_examples: 112574
  - name: validation
    num_bytes: 51183541
    num_examples: 17338
  - name: test
    num_bytes: 64582951
    num_examples: 19908
  download_size: 157505004
  dataset_size: 474236778
- config_name: rust
  # features: same schema as config c above
  splits:
  - name: train
    num_bytes: 730827968
    num_examples: 224015
  - name: validation
    num_bytes: 60404939
    num_examples: 16716
  - name: test
    num_bytes: 87319651
    num_examples: 23141
  download_size: 279796696
  dataset_size: 878552558
configs:
- config_name: c
  data_files:
  - split: train
    path: c/train-*
  - split: validation
    path: c/validation-*
  - split: test
    path: c/test-*
- config_name: cpp
  data_files:
  - split: train
    path: cpp/train-*
  - split: validation
    path: cpp/validation-*
  - split: test
    path: cpp/test-*
- config_name: go
  data_files:
  - split: train
    path: go/train-*
  - split: validation
    path: go/validation-*
  - split: test
    path: go/test-*
- config_name: python
  data_files:
  - split: train
    path: python/train-*
  - split: validation
    path: python/validation-*
  - split: test
    path: python/test-*
- config_name: ruby
  data_files:
  - split: train
    path: ruby/train-*
  - split: validation
    path: ruby/validation-*
  - split: test
    path: ruby/test-*
- config_name: rust
  data_files:
  - split: train
    path: rust/train-*
  - split: validation
    path: rust/validation-*
  - split: test
    path: rust/test-*
---
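The `configs` entries in the card metadata map each configuration and split to a glob such as `c/train-*`. As a rough illustration of how such patterns pick out shard files, the sketch below matches them against a few file names with Python's `fnmatch`. The shard names are hypothetical (this listing does not give actual file names), and the Hugging Face loader resolves patterns with its own logic, so this is only an approximation.

```python
from fnmatch import fnmatch

# Configuration and split names as they appear in the card metadata.
CONFIGS = ("c", "cpp", "go", "python", "ruby", "rust")
SPLITS = ("train", "validation", "test")


def data_file_patterns():
    """Build the {config: {split: pattern}} mapping stated in the metadata."""
    return {c: {s: f"{c}/{s}-*" for s in SPLITS} for c in CONFIGS}


def select_files(files, config, split):
    """Return the file names that match the pattern for one config and split."""
    pattern = data_file_patterns()[config][split]
    return [f for f in files if fnmatch(f, pattern)]


# Hypothetical shard names, for illustration only.
example_files = [
    "c/train-00000-of-00004.parquet",
    "c/validation-00000-of-00001.parquet",
    "cpp/train-00000-of-00004.parquet",
    "rust/test-00000-of-00001.parquet",
]

print(select_files(example_files, "c", "train"))
```

Because each pattern begins with the configuration name followed by `/`, the `c` pattern does not pick up files under `cpp/`.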
Provider:
iNeil77
Summary of original information

Dataset overview

Dataset configurations

C

  • Features:
    • hexsha: string
    • repo: string
    • path: string
    • license: sequence of string
    • language: string
    • identifier: string
    • return_type: string
    • original_string: string
    • original_docstring: string
    • docstring: string
    • docstring_tokens: sequence of string
    • code: string
    • code_tokens: sequence of string
    • short_docstring: string
    • short_docstring_tokens: sequence of string
    • comment: sequence of string
    • parameters: list of
      • param: string
      • type: string
    • docstring_params: struct of
      • returns: list of
        • docstring: string
        • docstring_tokens: sequence of string
        • type: string
      • raises: list of
        • docstring: string
        • docstring_tokens: sequence of string
        • type: string
      • params: list of
        • identifier: string
        • type: string
        • docstring: string
        • docstring_tokens: sequence of string
        • default: string
        • is_optional: bool
      • outlier_params: list of
        • identifier: string
        • type: string
        • docstring: string
        • docstring_tokens: sequence of string
        • default: string
        • is_optional: bool
      • others: list of
        • identifier: string
        • docstring: string
        • docstring_tokens: sequence of string
  • Splits:
    • train: 1618612526 bytes, 381207 examples
    • validation: 118163214 bytes, 27525 examples
    • test: 82244493 bytes, 19122 examples
  • Download size: 601549243 bytes
  • Dataset size: 1819020233 bytes

C++

  • Features:
    • hexsha: string
    • repo: string
    • path: string
    • license: sequence of string
    • language: string
    • identifier: string
    • return_type: string
    • original_string: string
    • original_docstring: string
    • docstring: string
    • docstring_tokens: sequence of string
    • code: string
    • code_tokens: sequence of string
    • short_docstring: string
    • short_docstring_tokens: sequence of string
    • comment: sequence of string
    • parameters: list of
      • param: string
      • type: string
    • docstring_params: struct of
      • returns: list of
        • docstring: string
        • docstring_tokens: sequence of string
        • type: string
      • raises: list of
        • docstring: string
        • docstring_tokens: sequence of string
        • type: string
      • params: list of
        • identifier: string
        • type: string
        • docstring: string
        • docstring_tokens: sequence of string
        • default: string
        • is_optional: bool
      • outlier_params: list of
        • identifier: string
        • type: string
        • docstring: string
        • docstring_tokens: sequence of string
        • default: string
        • is_optional: bool
      • others: list of
        • identifier: string
        • docstring: string
        • docstring_tokens: sequence of string
  • Splits:
    • train: 1745583444 bytes, 410907 examples
    • validation: 85254767 bytes, 20011 examples
    • test: 71686667 bytes, 18169 examples
  • Download size: 617392067 bytes
  • Dataset size: 1902524878 bytes

Go

  • Features:
    • hexsha: string
    • repo: string
    • path: string
    • license: sequence of string
    • language: string
    • identifier: string
    • return_type: string
    • original_string: string
    • original_docstring: string
    • docstring: string
    • docstring_tokens: sequence of string
    • code: string
    • code_tokens: sequence of string
    • short_docstring: string
    • short_docstring_tokens: sequence of string
    • comment: sequence of string
    • parameters: list of
      • param: string
      • type: string
    • docstring_params: struct of
      • returns: list of
        • docstring: string
        • docstring_tokens: sequence of string
        • type: string
      • raises: list of
        • docstring: string
        • docstring_tokens: sequence of string
        • type: string
      • params: list of
        • identifier: string
        • type: string
        • docstring: string
        • docstring_tokens: sequence of string
        • default: string
        • is_optional: bool
      • outlier_params: list of
        • identifier: string
        • type: string
        • docstring: string
        • docstring_tokens: sequence of string
        • default: string
        • is_optional: bool
      • others: list of
        • identifier: string
        • docstring: string
        • docstring_tokens: sequence of string
  • Splits:
    • train: 3717971602 bytes, 1319547 examples
    • validation: 50699286 bytes, 19102 examples
    • test: 71810505 bytes, 25314 examples
  • Download size: 1052043326 bytes
  • Dataset size: 3840481393 bytes
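The Go split sizes above are heavily skewed toward training data. A small sketch of the arithmetic, using only the example counts listed for this configuration (the `split_shares` helper name is an assumption made for illustration):

```python
# Example counts for the Go configuration, copied from the listing above.
GO_SPLITS = {"train": 1_319_547, "validation": 19_102, "test": 25_314}


def split_shares(counts):
    """Return each split's fraction of the total number of examples."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


shares = split_shares(GO_SPLITS)
for name, share in shares.items():
    print(f"{name}: {share:.2%}")
```

The same helper can be applied to any other configuration by swapping in its counts.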

Python

  • Features:
    • hexsha: string
    • repo: string
    • path: string
    • license: sequence of string
    • language: string
    • identifier: string
    • return_type: string
    • original_string: string
    • original_docstring: string
    • docstring: string
    • docstring_tokens: sequence of string
    • code: string
    • code_tokens: sequence of string
    • short_docstring: string
    • short_docstring_tokens: sequence of string
    • comment: sequence of string
    • parameters: list of
      • param: string
      • type: string
    • docstring_params: struct of
      • returns: list of
        • docstring: string
        • docstring_tokens: sequence of string
        • type: string
      • raises: list of
        • docstring: string
        • docstring_tokens: sequence of string
        • type: string
      • params: list of
        • identifier: string
        • type: string
        • docstring: string
        • docstring_tokens: sequence of string
        • default: string
        • is_optional: bool
      • outlier_params: list of
        • identifier: string
        • type: string
        • docstring: string
        • docstring_tokens: sequence of string
        • default: string
        • is_optional: bool
      • others: list of
        • identifier: string
        • docstring: string
        • docstring_tokens: sequence of string
  • Splits:
    • train: 8545493683 bytes, 1952110 examples
    • validation: 110572316 bytes, 30992 examples
    • test: 94502917 bytes, 21652 examples
  • Download size: 2953145655 bytes
  • Dataset size: 8750568916 bytes
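Python is the largest configuration listed here, so streaming may be preferable to a full download. The sketch below wraps `load_dataset` from the Hugging Face `datasets` library. It assumes that library is installed, that the repository is reachable under the ID shown in this listing, and that streaming works for this repository. The `load_vault_split` helper is hypothetical and not part of the dataset.

```python
CONFIGS = ("c", "cpp", "go", "python", "ruby", "rust")
SPLITS = ("train", "validation", "test")
REPO_ID = "iNeil77/the-vault-function"  # dataset ID as shown in this listing


def load_vault_split(config, split="validation", streaming=True):
    """Load one configuration and split; validates names before any network access."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the name checks above work without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split=split, streaming=streaming)


# Example usage (requires network access, so it is left commented out):
# rows = load_vault_split("python", "validation")
# for row in rows.take(3):
#     print(row["identifier"], "->", row["short_docstring"])
```

With `streaming=True` the call returns an iterable dataset, so rows are fetched as they are consumed rather than all at once.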

Ruby

  • Features:
    • hexsha: string
    • repo: string
    • path: string
    • license: sequence of string
    • language: string
    • identifier: string
    • return_type: string
    • original_string: string
    • original_docstring: string
    • docstring: string
    • docstring_tokens: sequence of string
    • code: string
    • code_tokens: sequence of string
    • short_docstring: string
    • short_docstring_tokens: sequence of string
    • comment: sequence of string
    • parameters: list of
      • param: string
      • type: string
    • docstring_params: struct of
      • returns: list of
        • docstring: string
        • docstring_tokens: sequence of string
        • type: string
      • raises: list of
        • docstring: string
        • docstring_tokens: sequence of string
        • type: string
      • params: list of
        • identifier: string
        • type: string
        • docstring: string
        • docstring_tokens: sequence of string
        • default: string
        • is_optional: bool
      • outlier_params: list of
        • identifier: string
        • type: string
        • docstring: string
        • docstring_tokens: sequence of string
        • default: string
        • is_optional: bool
      • others: list of
        • identifier: string
        • docstring: string
        • docstring_tokens: sequence of string
  • Splits:
    • train: 358470286 bytes, 112574 examples
    • validation: 51183541 bytes, 17338 examples
    • test: 64582951 bytes, 19908 examples
  • Download size: 157505004 bytes
  • Dataset size: 474236778 bytes

Rust

  • Features:
    • hexsha: string
    • repo: string
    • path: string
    • license: sequence of string
    • language: string
    • identifier: string
    • return_type: string
    • original_string: string
    • original_docstring: string
    • docstring: string
    • docstring_tokens: sequence of string
    • code: string
    • code_tokens: sequence of string
    • short_docstring: string
    • short_docstring_tokens: sequence of string
    • comment: sequence of string
    • parameters: list of
      • param: string
      • type: string
    • docstring_params: struct of
      • returns: list of
        • docstring: string
        • docstring_tokens: sequence of string
        • type: string
      • raises: list of
        • docstring: string
        • docstring_tokens: sequence of string
        • type: string
      • params: list of
        • identifier: string
        • type: string
        • docstring: string
        • docstring_tokens: sequence of string
        • default: string
        • is_optional: bool
      • outlier_params: list of
        • identifier: string
        • type: string
        • docstring: string
        • docstring_tokens: sequence of string
        • default: string
        • is_optional: bool
      • others: list of
        • identifier: string
        • docstring: string
        • docstring_tokens: sequence of string
  • Splits:
    • train: 730827968 bytes, 224015 examples
    • validation: 60404939 bytes, 16716 examples
    • test: 87319651 bytes, 23141 examples
  • Download size: 279796696 bytes
  • Dataset size: 878552558 bytes
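As a quick sanity check on the figures listed for the six configurations, the sketch below confirms that each dataset size equals the sum of its split byte counts and totals the example counts. All numbers are copied from this listing. The `CARD` layout and the helper names are assumptions made for illustration.

```python
# (num_bytes, num_examples) per split, plus sizes, copied from the listing.
CARD = {
    "c": {"splits": {"train": (1618612526, 381207), "validation": (118163214, 27525),
                     "test": (82244493, 19122)},
          "download_size": 601549243, "dataset_size": 1819020233},
    "cpp": {"splits": {"train": (1745583444, 410907), "validation": (85254767, 20011),
                       "test": (71686667, 18169)},
            "download_size": 617392067, "dataset_size": 1902524878},
    "go": {"splits": {"train": (3717971602, 1319547), "validation": (50699286, 19102),
                      "test": (71810505, 25314)},
           "download_size": 1052043326, "dataset_size": 3840481393},
    "python": {"splits": {"train": (8545493683, 1952110), "validation": (110572316, 30992),
                          "test": (94502917, 21652)},
               "download_size": 2953145655, "dataset_size": 8750568916},
    "ruby": {"splits": {"train": (358470286, 112574), "validation": (51183541, 17338),
                        "test": (64582951, 19908)},
             "download_size": 157505004, "dataset_size": 474236778},
    "rust": {"splits": {"train": (730827968, 224015), "validation": (60404939, 16716),
                        "test": (87319651, 23141)},
             "download_size": 279796696, "dataset_size": 878552558},
}


def sizes_consistent(info):
    """True if dataset_size equals the sum of the per-split byte counts."""
    return sum(b for b, _ in info["splits"].values()) == info["dataset_size"]


def total_examples(card):
    """Total number of examples across every configuration and split."""
    return sum(n for info in card.values() for _, n in info["splits"].values())


for name, info in CARD.items():
    ratio = info["download_size"] / info["dataset_size"]
    print(f"{name}: consistent={sizes_consistent(info)}, download/dataset={ratio:.2f}")

print("total examples:", total_examples(CARD))
```

The download-to-dataset ratio printed for each configuration gives a rough sense of how much smaller the stored files are than the decoded data.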