huggingartists/dermot-kennedy

Name: huggingartists/dermot-kennedy
Creator: huggingartists
Published: 2022-10-25 09:27:42
License: No description available

Hugging Face; updated 2022-10-25; indexed 2024-03-04

Download link:

https://hf-mirror.com/datasets/huggingartists/dermot-kennedy

Resource overview:

This dataset contains lyrics parsed from Genius and is intended for lyric generation. It is 0.150085 MB in size and its language is English. Each record has a single field, `text`, and all splits share this schema. The usage notes cover how to load the dataset and how to split the training data into training, validation, and test sets.

Provider:

huggingartists

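Summary of original information

Dataset overview

Dataset description

Dataset summary

Name: huggingartists/dermot-kennedy
Content: A lyrics dataset parsed from Genius, intended for lyric generation.
Model location: HuggingArtists Model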

Supported tasks and leaderboards

Information: To be added

Languages

Language: English (en)

Dataset structure

Data fields

text: A string containing the lyric text.

Data splits

Training set: 77 records
Validation/test sets: no predefined split

How to use

Loading the dataset

```python
from datasets import load_dataset

dataset = load_dataset("huggingartists/dermot-kennedy")
```

Splitting the dataset

```python
from datasets import load_dataset, Dataset, DatasetDict
import numpy as np

datasets = load_dataset("huggingartists/dermot-kennedy")

# Fractions of the original train split assigned to each new split.
train_percentage = 0.9
validation_percentage = 0.07
test_percentage = 0.03

# Cut the list of lyric texts at the 90% and 97% marks, preserving order.
texts = list(datasets["train"]["text"])
train, validation, test = np.split(
    texts,
    [
        int(len(texts) * train_percentage),
        int(len(texts) * (train_percentage + validation_percentage)),
    ],
)

# Rebuild a DatasetDict with the three new splits.
datasets = DatasetDict(
    {
        "train": Dataset.from_dict({"text": list(train)}),
        "validation": Dataset.from_dict({"text": list(validation)}),
        "test": Dataset.from_dict({"text": list(test)}),
    }
)
```
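
To make the split arithmetic concrete, the following is a minimal sketch in plain Python that reproduces the same cut points without downloading anything. The helper names `split_boundaries` and `split_records` and the placeholder strings are hypothetical and are not part of the dataset card; only the record count (77) and the percentages come from the text above.

```python
def split_boundaries(n_records, train_percentage=0.9, validation_percentage=0.07):
    """Return the two cut indices used to divide n_records items into three parts."""
    first_cut = int(n_records * train_percentage)
    second_cut = int(n_records * (train_percentage + validation_percentage))
    return first_cut, second_cut


def split_records(records, train_percentage=0.9, validation_percentage=0.07):
    """Slice records into train, validation, and test lists, preserving order."""
    first_cut, second_cut = split_boundaries(
        len(records), train_percentage, validation_percentage
    )
    return records[:first_cut], records[first_cut:second_cut], records[second_cut:]


# Placeholder strings stand in for the 77 lyric records of the train split.
placeholder = [f"lyrics {i}" for i in range(77)]
train, validation, test = split_records(placeholder)
print(len(train), len(validation), len(test))  # 69 5 3
```

Because the cut indices are truncated with `int`, the resulting sizes only approximate the requested percentages on a dataset this small. The slices are taken in order, with no shuffling.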

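Dataset creation

Source data

Initial data collection and normalization: To be added
Source language producers: To be added

Annotations

Annotation process: To be added
Annotators: To be added

Personal and sensitive information

Information: To be added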

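Considerations for using the data

Social impact of the dataset

Information: To be added

Discussion of biases

Information: To be added

Other known limitations

Information: To be added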

Additional information

Dataset curators

Information: To be added

Licensing information

Information: To be added

Citation information

@InProceedings{huggingartists,
    author={Aleksey Korshuk},
    year=2021
}
