huggingartists/armin-van-buuren
Dataset Card for "huggingartists/armin-van-buuren"
Dataset Description
- Size of the generated dataset: 0.358063 MB
Dataset Summary
This lyrics dataset was parsed from Genius. It is designed for generating lyrics with HuggingArtists.
Supported Tasks and Leaderboards
Languages
en
How to use
How to load this dataset directly with the datasets library:
    from datasets import load_dataset

    dataset = load_dataset("huggingartists/armin-van-buuren")
Dataset Structure
Data Fields
The data fields are the same among all splits.
text: a string feature.
Data Splits
| train | validation | test |
|---|---|---|
| 546 | - | - |
The train split can be divided into train, validation, and test sets with a few lines of code:

    from datasets import load_dataset, Dataset, DatasetDict
    import numpy as np

    datasets = load_dataset("huggingartists/armin-van-buuren")

    train_percentage = 0.9
    validation_percentage = 0.07
    test_percentage = 0.03  # not used below; the test set is whatever remains

    # Cut the list of lyrics at the 90% and 97% marks.
    train, validation, test = np.split(
        datasets["train"]["text"],
        [
            int(len(datasets["train"]["text"]) * train_percentage),
            int(len(datasets["train"]["text"]) * (train_percentage + validation_percentage)),
        ],
    )

    # Rebuild a DatasetDict with the three new splits.
    datasets = DatasetDict(
        {
            "train": Dataset.from_dict({"text": list(train)}),
            "validation": Dataset.from_dict({"text": list(validation)}),
            "test": Dataset.from_dict({"text": list(test)}),
        }
    )
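To make the split boundaries concrete, here is a minimal sketch of the same index arithmetic in plain Python, applied to 546 placeholder strings (the train split size listed in the table). The helper `split_by_percentage` and the stand-in `rows` list are hypothetical names used only for illustration; they are not part of the datasets library or of this dataset.

```python
def split_by_percentage(items, train_pct, validation_pct):
    """Split a list into train/validation/test parts by cumulative fractions.

    Hypothetical helper: it uses the same two cut points as the np.split call,
    so the test part is simply whatever remains after the second cut.
    """
    n = len(items)
    train_end = int(n * train_pct)  # int() truncates toward zero
    validation_end = int(n * (train_pct + validation_pct))
    return items[:train_end], items[train_end:validation_end], items[validation_end:]


# Stand-in data (assumption): 546 placeholder strings instead of real lyrics.
rows = [f"lyric {i}" for i in range(546)]

train, validation, test = split_by_percentage(rows, 0.9, 0.07)
print(len(train), len(validation), len(test))  # 491 38 17
```

Because both cut points are truncated to integers, the resulting shares are only approximately 90%, 7%, and 3%; the test part absorbs the rounding remainder.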
Dataset Creation
Curation Rationale
Source Data
Initial Data Collection and Normalization
Who are the source language producers?
Annotations
Annotation process
Who are the annotators?
Personal and Sensitive Information
Considerations for Using the Data
Social Impact of Dataset
Discussion of Biases
Other Known Limitations
Additional Information
Dataset Curators
Licensing Information
Citation Information
@InProceedings{huggingartists,
    author={Aleksey Korshuk},
    year=2021
}



