
huggingartists/armin-van-buuren

Source: Hugging Face. Updated 2022-10-25; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/huggingartists/armin-van-buuren
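Resource summary:
This dataset contains lyrics parsed from Genius and is intended for generating lyrics with HuggingArtists. The generated dataset is 0.358063 MB in size, and its language is English. Its structure consists of a single field named text, and the data fields are the same across all splits. The only split provided is train, with 546 examples; there are no validation or test splits, but they can be created with code.
Provider: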
huggingartists
Original information summary

Dataset Card for "huggingartists/armin-van-buuren"

Dataset Description

  • Size of the generated dataset: 0.358063 MB

Dataset Summary

A lyrics dataset parsed from Genius, designed for generating lyrics with HuggingArtists.

Supported Tasks and Leaderboards

More Information Needed

Languages

en

How to use

How to load this dataset directly with the datasets library:

from datasets import load_dataset

dataset = load_dataset("huggingartists/armin-van-buuren")

Dataset Structure

Data Fields

The data fields are the same among all splits.

  • text: a string feature.
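As a rough illustration of the record shape described above, the sketch below treats each row as a mapping with a single `text` string and computes two basic statistics. The `records` list and the helper `summarize_text_field` are hypothetical stand-ins invented for this example; real rows would come from the train split of the loaded dataset.

```python
# Hypothetical stand-in rows; real rows come from the dataset's train split.
records = [
    {"text": "placeholder lyric one"},
    {"text": "a second placeholder lyric"},
]


def summarize_text_field(rows):
    """Return (row count, mean character length) for the `text` field."""
    lengths = [len(row["text"]) for row in rows]
    # Guard against an empty input so the mean is well defined.
    mean_length = sum(lengths) / len(lengths) if lengths else 0.0
    return len(lengths), mean_length


count, mean_length = summarize_text_field(records)
print(count, mean_length)
```

The same helper should work on any iterable of dict-like rows that expose a `text` key, since it relies only on key lookup.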

Data Splits

train   validation   test
546     -            -

The train split can easily be divided into train, validation, and test splits with a few lines of code:

from datasets import load_dataset, Dataset, DatasetDict
import numpy as np

datasets = load_dataset("huggingartists/armin-van-buuren")

train_percentage = 0.9
validation_percentage = 0.07
test_percentage = 0.03

# Cut the list of texts at the 90% and 97% marks to get three parts.
train, validation, test = np.split(
    datasets["train"]["text"],
    [
        int(len(datasets["train"]["text"]) * train_percentage),
        int(len(datasets["train"]["text"]) * (train_percentage + validation_percentage)),
    ],
)

# Wrap each part back into a Dataset with the same single text field.
datasets = DatasetDict(
    {
        "train": Dataset.from_dict({"text": list(train)}),
        "validation": Dataset.from_dict({"text": list(validation)}),
        "test": Dataset.from_dict({"text": list(test)}),
    }
)
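To make the index arithmetic of this split concrete, here is a minimal sketch that applies the same 90% / 7% / 3% cut points to a plain Python list, with no download required. The helpers `split_bounds` and `split_rows` and the placeholder rows are assumptions made for illustration; they are not part of the datasets library or the original card.

```python
def split_bounds(n_rows, train_pct, validation_pct):
    """Return the two cut indices used to divide n_rows into three parts."""
    first = int(n_rows * train_pct)
    second = int(n_rows * (train_pct + validation_pct))
    return first, second


def split_rows(rows, train_pct=0.9, validation_pct=0.07):
    """Slice rows into train, validation, and test parts, preserving order."""
    first, second = split_bounds(len(rows), train_pct, validation_pct)
    return rows[:first], rows[first:second], rows[second:]


# 546 placeholder strings stand in for the 546 training examples.
rows = [f"placeholder lyric {i}" for i in range(546)]
train, validation, test = split_rows(rows)
print(len(train), len(validation), len(test))  # 491 38 17
```

Because the cut indices are truncated with `int`, the test part absorbs the rounding remainder. The rows are also taken in their stored order, so shuffling first may be worth considering if order matters.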

Dataset Creation

Curation Rationale

More Information Needed

Source Data

Initial Data Collection and Normalization

More Information Needed

Who are the source language producers?

More Information Needed

Annotations

Annotation process

More Information Needed

Who are the annotators?

More Information Needed

Personal and Sensitive Information

More Information Needed

Considerations for Using the Data

Social Impact of Dataset

More Information Needed

Discussion of Biases

More Information Needed

Other Known Limitations

More Information Needed

Additional Information

Dataset Curators

More Information Needed

Licensing Information

More Information Needed

Citation Information

@InProceedings{huggingartists,
    author={Aleksey Korshuk},
    year=2021
}
