mvasiliniuc/iva-swift-codeint-clean-train-tokenized
Source: Hugging Face | Updated: 2023-06-02 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/mvasiliniuc/iva-swift-codeint-clean-train-tokenized
Resource description:
---
license: other
dataset_info:
  features:
  - name: ratio
    dtype: float64
  - name: config_or_test
    dtype: bool
  - name: has_no_keywords
    dtype: bool
  - name: has_few_assignments
    dtype: bool
  - name: input_ids
    sequence: int32
  - name: ratio_char_token
    dtype: float64
  splits:
  - name: train
    num_bytes: 971849564
    num_examples: 400000
  download_size: 484282225
  dataset_size: 971849564
---
Provider:
mvasiliniuc
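Summary of original information
Dataset overview
Dataset features
- ratio: floating-point value (float64)
- config_or_test: boolean
- has_no_keywords: boolean
- has_few_assignments: boolean
- input_ids: sequence of integers (int32)
- ratio_char_token: floating-point value (float64)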
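The feature list above can be turned into a quick per-record check. What follows is a minimal sketch, not something taken from the dataset card: `EXPECTED_FEATURES`, `check_record`, and `stream_train_split` are hypothetical names introduced here, and the loader assumes that the Hugging Face `datasets` library is installed and that the repository id from the page title is reachable on the Hub.

```python
from typing import Any, Dict, Iterator, List

# Column names and Python-side types, transcribed from the feature list.
# float64 -> float, bool -> bool, sequence of int32 -> list of int.
EXPECTED_FEATURES = {
    "ratio": float,
    "config_or_test": bool,
    "has_no_keywords": bool,
    "has_few_assignments": bool,
    "input_ids": list,
    "ratio_char_token": float,
}

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def is_int32(token: Any) -> bool:
    """True if token is a plain int (not a bool) that fits in 32 bits."""
    return (
        isinstance(token, int)
        and not isinstance(token, bool)
        and INT32_MIN <= token <= INT32_MAX
    )


def check_record(record: Dict[str, Any]) -> List[str]:
    """Return the schema problems found in one record (empty list if none)."""
    problems = []
    for name, expected in EXPECTED_FEATURES.items():
        if name not in record:
            problems.append(f"missing column: {name}")
            continue
        value = record[name]
        if not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
        elif name == "input_ids" and not all(is_int32(t) for t in value):
            problems.append("input_ids: contains a value that is not an int32")
    return problems


def stream_train_split() -> Iterator[Dict[str, Any]]:
    """Lazily iterate over the train split without downloading it up front.

    Assumption: the `datasets` package is installed and the Hub is reachable.
    """
    from datasets import load_dataset  # third-party, imported only when used

    dataset = load_dataset(
        "mvasiliniuc/iva-swift-codeint-clean-train-tokenized",
        split="train",
        streaming=True,
    )
    return iter(dataset)
```

A possible use is `check_record(next(stream_train_split()))`, which should return an empty list if the first record matches the listed schema. Streaming is chosen here only so that a spot check does not require the full download.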
Dataset splits
- train:
  - Number of examples: 400000
  - Storage size: 971849564 bytes
Dataset size
- Download size: 484282225 bytes
- Total dataset size: 971849564 bytes
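The size figures above are easier to interpret as per-example and relative values. The short sketch below only does arithmetic on the three numbers listed on this page; the constant and function names are hypothetical, and reading the download-to-dataset ratio as a compression ratio is an assumption about how the files are stored.

```python
# Figures copied from the listing above.
NUM_EXAMPLES = 400_000
DATASET_SIZE_BYTES = 971_849_564
DOWNLOAD_SIZE_BYTES = 484_282_225


def bytes_per_example(total_bytes: int, num_examples: int) -> float:
    """Average storage per example, in bytes."""
    return total_bytes / num_examples


def size_ratio(download_bytes: int, dataset_bytes: int) -> float:
    """Download size as a fraction of the full dataset size."""
    return download_bytes / dataset_bytes


def to_mib(num_bytes: int) -> float:
    """Convert bytes to mebibytes (1 MiB = 2**20 bytes)."""
    return num_bytes / 2**20


avg_bytes = bytes_per_example(DATASET_SIZE_BYTES, NUM_EXAMPLES)
ratio = size_ratio(DOWNLOAD_SIZE_BYTES, DATASET_SIZE_BYTES)

print(f"average bytes per example: {avg_bytes:.1f}")
print(f"download / dataset size:   {ratio:.3f}")
print(f"dataset size:  {to_mib(DATASET_SIZE_BYTES):.1f} MiB")
print(f"download size: {to_mib(DOWNLOAD_SIZE_BYTES):.1f} MiB")
```

This works out to roughly 2,430 bytes per example, with the download being about half the size of the full dataset.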



