OBF/reg-ds
Source: Hugging Face. Updated 2024-02-23; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/OBF/reg-ds
Description:
---
dataset_info:
- config_name: c
  features:
  - name: content
    dtype: string
  splits:
  - name: train
    num_bytes: 13442958357.844296
    num_examples: 1949000
  download_size: 4580467350
  dataset_size: 13442958357.844296
- config_name: cpp
  features:
  - name: content
    dtype: string
  splits:
  - name: train
    num_bytes: 14983809296.175602
    num_examples: 1828000
  download_size: 4908871876
  dataset_size: 14983809296.175602
- config_name: go
  features:
  - name: content
    dtype: string
  splits:
  - name: train
    num_bytes: 13755677598.426552
    num_examples: 2767000
  download_size: 4676263414
  dataset_size: 13755677598.426552
- config_name: haskell
  features:
  - name: content
    dtype: string
  splits:
  - name: train
    num_bytes: 2020257310
    num_examples: 485958
  download_size: 795151293
  dataset_size: 2020257310
- config_name: java
  features:
  - name: content
    dtype: string
  splits:
  - name: train
    num_bytes: 18941088346.83762
    num_examples: 4145000
  download_size: 6143582915
  dataset_size: 18941088346.83762
- config_name: python
  features:
  - name: content
    dtype: string
  splits:
  - name: train
    num_bytes: 16163190026.156303
    num_examples: 3275000
  download_size: 5960453385
  dataset_size: 16163190026.156303
- config_name: rust
  features:
  - name: content
    dtype: string
  splits:
  - name: train
    num_bytes: 7714478040
    num_examples: 1153702
  download_size: 2401120332
  dataset_size: 7714478040
- config_name: typescript
  features:
  - name: content
    dtype: string
  splits:
  - name: train
    num_bytes: 13589061648
    num_examples: 4979083
  download_size: 4965715796
  dataset_size: 13589061648
configs:
- config_name: c
  data_files:
  - split: train
    path: c/train-*
- config_name: cpp
  data_files:
  - split: train
    path: cpp/train-*
- config_name: go
  data_files:
  - split: train
    path: go/train-*
- config_name: haskell
  data_files:
  - split: train
    path: haskell/train-*
- config_name: java
  data_files:
  - split: train
    path: java/train-*
- config_name: python
  data_files:
  - split: train
    path: python/train-*
- config_name: rust
  data_files:
  - split: train
    path: rust/train-*
- config_name: typescript
  data_files:
  - split: train
    path: typescript/train-*
---
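The metadata above defines eight configs, each with a single `train` split and one string column, `content`. A minimal sketch of reading a few rows from one config with the Hugging Face `datasets` library follows. It assumes the repository id `OBF/reg-ds` from the download link is still reachable; the helper name `stream_samples` is hypothetical and not part of the dataset.

```python
# Config names as listed in the card metadata above.
CONFIGS = ("c", "cpp", "go", "haskell", "java", "python", "rust", "typescript")


def stream_samples(config: str, n: int = 3, repo_id: str = "OBF/reg-ds"):
    """Return the first `n` `content` strings of one config (hypothetical helper)."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # streaming=True iterates over the shards instead of downloading them all first.
    ds = load_dataset(repo_id, config, split="train", streaming=True)
    return [row["content"] for row in ds.take(n)]
```

Calling `stream_samples("rust", n=2)` needs network access. Streaming is chosen here because each config's download is several gigabytes.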
Provider:
OBF
Summary of original information
Dataset overview
Configurations
C
- Features:
  - Name: content
  - Data type: string
- Split:
  - Name: train
  - Bytes: 13442958357.844296
  - Examples: 1949000
- Download size (bytes): 4580467350
- Dataset size (bytes): 13442958357.844296
C++
- Features:
  - Name: content
  - Data type: string
- Split:
  - Name: train
  - Bytes: 14983809296.175602
  - Examples: 1828000
- Download size (bytes): 4908871876
- Dataset size (bytes): 14983809296.175602
Go
- Features:
  - Name: content
  - Data type: string
- Split:
  - Name: train
  - Bytes: 13755677598.426552
  - Examples: 2767000
- Download size (bytes): 4676263414
- Dataset size (bytes): 13755677598.426552
Haskell
- Features:
  - Name: content
  - Data type: string
- Split:
  - Name: train
  - Bytes: 2020257310
  - Examples: 485958
- Download size (bytes): 795151293
- Dataset size (bytes): 2020257310
Java
- Features:
  - Name: content
  - Data type: string
- Split:
  - Name: train
  - Bytes: 18941088346.83762
  - Examples: 4145000
- Download size (bytes): 6143582915
- Dataset size (bytes): 18941088346.83762
Python
- Features:
  - Name: content
  - Data type: string
- Split:
  - Name: train
  - Bytes: 16163190026.156303
  - Examples: 3275000
- Download size (bytes): 5960453385
- Dataset size (bytes): 16163190026.156303
Rust
- Features:
  - Name: content
  - Data type: string
- Split:
  - Name: train
  - Bytes: 7714478040
  - Examples: 1153702
- Download size (bytes): 2401120332
- Dataset size (bytes): 7714478040
TypeScript
- Features:
  - Name: content
  - Data type: string
- Split:
  - Name: train
  - Bytes: 13589061648
  - Examples: 4979083
- Download size (bytes): 4965715796
- Dataset size (bytes): 13589061648
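The per-config figures above can be compared more easily once they are normalized. The sketch below copies the numbers from the listing and derives two quantities: the average bytes per example, and the ratio of dataset size to download size. The function names are hypothetical, and reading that ratio as a compression factor is an assumption the page does not state.

```python
# (num_bytes, num_examples, download_size) copied from the listing above.
STATS = {
    "c": (13442958357.844296, 1949000, 4580467350),
    "cpp": (14983809296.175602, 1828000, 4908871876),
    "go": (13755677598.426552, 2767000, 4676263414),
    "haskell": (2020257310, 485958, 795151293),
    "java": (18941088346.83762, 4145000, 6143582915),
    "python": (16163190026.156303, 3275000, 5960453385),
    "rust": (7714478040, 1153702, 2401120332),
    "typescript": (13589061648, 4979083, 4965715796),
}


def avg_bytes_per_example(config: str) -> float:
    """Average size of one `content` row, in bytes."""
    num_bytes, num_examples, _ = STATS[config]
    return num_bytes / num_examples


def size_ratio(config: str) -> float:
    """Dataset size divided by download size."""
    num_bytes, _, download_size = STATS[config]
    return num_bytes / download_size


def total_examples() -> int:
    """Number of rows summed over all configs."""
    return sum(n for _, n, _ in STATS.values())


if __name__ == "__main__":
    for name in STATS:
        print(f"{name:<12}{avg_bytes_per_example(name):>10.0f} B/row"
              f"{size_ratio(name):>8.2f}x")
    print("total rows:", total_examples())
```

On these figures, the C++ config has the largest average row and the TypeScript config the smallest.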
Data file paths
- C:
  - Split: train
  - Path: c/train-*
- C++:
  - Split: train
  - Path: cpp/train-*
- Go:
  - Split: train
  - Path: go/train-*
- Haskell:
  - Split: train
  - Path: haskell/train-*
- Java:
  - Split: train
  - Path: java/train-*
- Python:
  - Split: train
  - Path: python/train-*
- Rust:
  - Split: train
  - Path: rust/train-*
- TypeScript:
  - Split: train
  - Path: typescript/train-*
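
Each config maps to its files through a glob pattern such as `c/train-*`. As a rough illustration of how those patterns separate the shards, the sketch below matches a repository-relative file path against the patterns with Python's standard `fnmatch` module. The shard file names in the usage note are invented for the example; the page does not list the actual file names.

```python
from fnmatch import fnmatch
from typing import Optional

# Glob patterns for the `train` split, copied from the listing above.
DATA_FILES = {
    "c": "c/train-*",
    "cpp": "cpp/train-*",
    "go": "go/train-*",
    "haskell": "haskell/train-*",
    "java": "java/train-*",
    "python": "python/train-*",
    "rust": "rust/train-*",
    "typescript": "typescript/train-*",
}


def config_for_path(path: str) -> Optional[str]:
    """Return the config whose pattern matches `path`, or None if none does."""
    for config, pattern in DATA_FILES.items():
        if fnmatch(path, pattern):
            return config
    return None
```

For a hypothetical shard named `cpp/train-00000-of-00030.parquet`, `config_for_path` returns `"cpp"`. The `c/train-*` pattern does not match it, because the literal `c/` prefix must match exactly.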



