AcapeLlama/AcapeLlama_v2.0_induce_align
Source: Hugging Face. Updated 2024-04-20; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/AcapeLlama/AcapeLlama_v2.0_induce_align
Resource description:
---
dataset_info:
- config_name: line
  features:
  - name: title
    dtype: string
  - name: genre
    dtype: string
  - name: mungchi
    sequence: int64
  - name: output
    dtype: string
  - name: lyrics
    dtype: string
  - name: instruction
    dtype: string
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 922121019.4069357
    num_examples: 429100
  - name: test
    num_bytes: 102458368.59306428
    num_examples: 47678
  download_size: 425285815
  dataset_size: 1024579388.0
- config_name: total
  features:
  - name: title
    dtype: string
  - name: genre
    dtype: string
  - name: mungchi
    sequence: int64
  - name: output
    dtype: string
  - name: lyrics
    dtype: string
  - name: instruction
    dtype: string
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 20053146.6
    num_examples: 4842
  - name: test
    num_bytes: 2228127.4
    num_examples: 538
  download_size: 7266111
  dataset_size: 22281274.0
- config_name: verse
  features:
  - name: title
    dtype: string
  - name: genre
    dtype: string
  - name: mungchi
    sequence: int64
  - name: output
    dtype: string
  - name: lyrics
    dtype: string
  - name: instruction
    dtype: string
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 90737951.11274482
    num_examples: 31523
  - name: test
    num_bytes: 10083273.887255182
    num_examples: 3503
  download_size: 40103429
  dataset_size: 100821225.0
configs:
- config_name: line
  data_files:
  - split: train
    path: line/train-*
  - split: test
    path: line/test-*
- config_name: total
  data_files:
  - split: train
    path: total/train-*
  - split: test
    path: total/test-*
- config_name: verse
  data_files:
  - split: train
    path: verse/train-*
  - split: test
    path: verse/test-*
---
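The metadata above declares three configs (line, total, verse), each with a train and a test split stored under `<config>/<split>-*`. A minimal sketch of selecting one of them with the Hugging Face `datasets` library follows. The helper names `data_file_pattern` and `load_config` are illustrative and not part of the dataset; the repository ID is assumed from the page title, and the actual download needs network access plus `pip install datasets`.

```python
from typing import Optional

# Repository ID assumed from the page title; config and split names come from the card.
REPO_ID = "AcapeLlama/AcapeLlama_v2.0_induce_align"
CONFIGS = ("line", "total", "verse")
SPLITS = ("train", "test")


def data_file_pattern(config: str, split: str) -> str:
    """Return the glob the card declares for a config/split pair."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    return f"{config}/{split}-*"


def load_config(config: str, split: Optional[str] = None):
    """Load one config from the Hub (hypothetical helper; requires network access)."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    # Imported lazily so the offline helper above works without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split=split)


if __name__ == "__main__":
    # The smallest config ("total") is the cheapest one to try first.
    print(data_file_pattern("total", "train"))
```

Calling `load_config("total", split="test")` would fetch only the 538-example test split of the smallest config, which is a reasonable first check before pulling the much larger line config.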
Provider:
AcapeLlama
Summary of original information
Dataset overview
Config name: line
- Features:
  - title: string
  - genre: string
  - mungchi: sequence of integers (int64)
  - output: string
  - lyrics: string
  - instruction: string
  - __index_level_0__: integer (int64)
- Splits:
  - Train: 429100 examples, 922121019.4069357 bytes
  - Test: 47678 examples, 102458368.59306428 bytes
- Download size: 425285815 bytes
- Dataset size: 1024579388.0 bytes
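The YAML metadata lists the same seven columns for every config, so one check can cover records from any of them. The sketch below validates a record against those column names and types in plain Python. The function name `schema_problems` and the sample record are made up for illustration; the sample values are placeholders, not real rows from the dataset.

```python
from typing import Any, Dict, List

# Column name -> expected Python type, following the feature list in the card.
EXPECTED_TYPES = {
    "title": str,
    "genre": str,
    "mungchi": list,            # sequence of int64 in the card
    "output": str,
    "lyrics": str,
    "instruction": str,
    "__index_level_0__": int,   # int64 in the card
}


def schema_problems(record: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable schema violations (empty if none)."""
    problems = []
    for column, expected in EXPECTED_TYPES.items():
        if column not in record:
            problems.append(f"missing column: {column}")
            continue
        value = record[column]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{column}: expected {expected.__name__}")
        elif column == "mungchi" and not all(
            isinstance(x, int) and not isinstance(x, bool) for x in value
        ):
            problems.append("mungchi: expected a list of integers")
    return problems


if __name__ == "__main__":
    # Placeholder record; every value here is invented for the example.
    sample = {
        "title": "placeholder title",
        "genre": "placeholder genre",
        "mungchi": [3, 4, 2],
        "output": "placeholder output",
        "lyrics": "placeholder lyrics",
        "instruction": "placeholder instruction",
        "__index_level_0__": 0,
    }
    print(schema_problems(sample))
```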
Config name: total
- Features:
  - title: string
  - genre: string
  - mungchi: sequence of integers (int64)
  - output: string
  - lyrics: string
  - instruction: string
  - __index_level_0__: integer (int64)
- Splits:
  - Train: 4842 examples, 20053146.6 bytes
  - Test: 538 examples, 2228127.4 bytes
- Download size: 7266111 bytes
- Dataset size: 22281274.0 bytes
Config name: verse
- Features:
  - title: string
  - genre: string
  - mungchi: sequence of integers (int64)
  - output: string
  - lyrics: string
  - instruction: string
  - __index_level_0__: integer (int64)
- Splits:
  - Train: 31523 examples, 90737951.11274482 bytes
  - Test: 3503 examples, 10083273.887255182 bytes
- Download size: 40103429 bytes
- Dataset size: 100821225.0 bytes
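The fractional byte counts look odd for file sizes, and a quick calculation offers a possible explanation. In every config the train split holds about 90% of the examples, and the reported train size equals the dataset size multiplied by that example fraction. This is consistent with a 90/10 split whose per-split sizes were estimated proportionally rather than measured, although the card itself does not say so. The sketch below reproduces the arithmetic using only numbers copied from the card; the function names are illustrative.

```python
# (train_examples, test_examples, dataset_size_bytes) copied from the card.
CARD = {
    "line": (429100, 47678, 1024579388.0),
    "total": (4842, 538, 22281274.0),
    "verse": (31523, 3503, 100821225.0),
}

# Train-split byte counts as reported in the card.
REPORTED_TRAIN_BYTES = {
    "line": 922121019.4069357,
    "total": 20053146.6,
    "verse": 90737951.11274482,
}


def train_fraction(config: str) -> float:
    """Share of examples that fall in the train split."""
    train, test, _ = CARD[config]
    return train / (train + test)


def estimated_train_bytes(config: str) -> float:
    """Dataset size scaled by the train share (a proportional estimate)."""
    train, test, size = CARD[config]
    return size * train / (train + test)


if __name__ == "__main__":
    for name in CARD:
        gap = abs(estimated_train_bytes(name) - REPORTED_TRAIN_BYTES[name])
        print(f"{name}: train fraction {train_fraction(name):.4f}, "
              f"gap to reported train bytes {gap:.6f}")
```

In practice this means the per-split byte counts should be read as estimates; the download sizes and example counts are the more reliable figures for planning storage.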



