
TTxdfcgv7/suno

Source: Hugging Face. Updated 2026-02-23; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/TTxdfcgv7/suno
Resource description:
---
pretty_name: Suno Music Generation Dataset
size_categories:
- 100K<n<1M
task_categories:
- audio-classification
- text-to-audio
annotations_creators:
- found
language:
- en
- ja
- multilingual
license: cc0-1.0
multilinguality:
- multilingual
source_datasets:
- original
tags:
- audio
- video
- image
- text
---

# Dataset Card for Suno.ai Music Generation

### Dataset Summary

This dataset contains metadata for 659,788 songs generated by artificial intelligence on the [suno.com](https://suno.com) platform, a service that generates music using artificial intelligence. The songs were discovered by search queries with words from the [dwyl/english-words](https://github.com/dwyl/english-words) word list.

### Languages

The dataset is multilingual with English as the primary language:

- English (en): Primary language for metadata and most lyrics
- Japanese (ja): Present in some song lyrics and titles
- Other languages may appear in song lyrics and titles

## Dataset Structure

### Data Fields

The dataset is stored in Parquet format with zstd compression. Each record includes:

**Core Fields:**

- `id`: Unique identifier for the song (string)
- `video_url`: URL of the video version (string)
- `audio_url`: URL of the audio file (string)
- `image_url`: URL of the song thumbnail (string)
- `image_large_url`: URL of the large cover image (string)
- `is_video_pending`: Video processing status (boolean)
- `major_model_version`: Version of AI model used (string)
- `model_name`: Name of the model used (string)
- `is_liked`: Like status (boolean)
- `user_id`: Creator's ID (string)
- `display_name`: Creator's display name (string)
- `handle`: Creator's handle (string)
- `is_handle_updated`: Handle update status (boolean)
- `avatar_image_url`: Creator's avatar URL (string)
- `is_trashed`: Deletion status (boolean)
- `created_at`: Creation timestamp (string)
- `status`: Generation status (string)
- `title`: Song title (string)
- `play_count`: Number of plays (integer)
- `upvote_count`: Number of upvotes (integer)
- `is_public`: Public visibility status (boolean)
- `persona`: Creator persona information (JSON string, nullable)

**Metadata Fields (flattened from nested structure):**

- `metadata_tags`: Musical style and genre tags (string)
- `metadata_prompt`: Lyrics/prompt used to generate the song (string)
- `metadata_type`: Generation type (string)
- `metadata_duration`: Length of song in seconds (float)
- `metadata_refund_credits`: Refund status (boolean)
- `metadata_stream`: Streaming availability (boolean)
- `metadata_gpt_description_prompt`: GPT description prompt (string)
- `metadata_has_vocal`: Whether the song has vocals (boolean)
- `metadata_negative_tags`: Negative style tags (string)
- `metadata_error_message`: Error message if generation failed (string)
- `metadata_error_type`: Error type if generation failed (string)
- `metadata_artist_clip_id`: Artist clip reference ID (string)
- `metadata_cover_clip_id`: Cover clip reference ID (string)
- `metadata_edit_session_id`: Edit session ID (string)
- `metadata_stem_from_id`: Stem source ID (string)
- `metadata_persona_id`: Persona ID used for generation (string)
- `metadata_task`: Generation task type (string)
- `metadata_is_audio_upload_tos_accepted`: TOS acceptance status (boolean)
- `metadata_concat_history`: Concatenation history for extended songs (JSON string)
- `metadata_history`: Generation history (JSON string)
- `metadata_infill`: Infill generation parameters (JSON string)

### Data Splits

All 659,788 songs are in a single train split.

## Additional Information

### License

This dataset is dedicated to the public domain under the Creative Commons Zero (CC0) license. This means you can:

* Use it for any purpose, including commercial projects.
* Modify it however you like.
* Distribute it without asking permission.

No attribution is required, but it's always appreciated!

CC0 license: https://creativecommons.org/publicdomain/zero/1.0/deed.en
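
Several fields in the Data Fields list (`persona`, `metadata_concat_history`, `metadata_history`, `metadata_infill`) are described as JSON strings, so they likely need decoding before use. The following is a minimal sketch of that step, not part of the dataset's own tooling. The helper name `decode_json_fields` and the example record, including the keys inside its JSON payload, are hypothetical and chosen only for illustration. The card does not document the internal shape of these JSON values.

```python
import json
from typing import Any, Dict

# Fields the dataset card describes as JSON strings.
JSON_STRING_FIELDS = (
    "persona",
    "metadata_concat_history",
    "metadata_history",
    "metadata_infill",
)


def decode_json_fields(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `record` with its JSON-string fields parsed.

    Hypothetical helper: missing fields are skipped, null or empty
    values become None, and unparseable values are left unchanged.
    """
    decoded = dict(record)
    for name in JSON_STRING_FIELDS:
        if name not in decoded:
            continue
        raw = decoded[name]
        if raw is None or raw == "":
            decoded[name] = None
            continue
        try:
            decoded[name] = json.loads(raw)
        except (TypeError, ValueError):
            # Not valid JSON (or not a string): keep the original value.
            decoded[name] = raw
    return decoded


# Hypothetical record for illustration only; the values and the keys
# inside `metadata_history` are invented, not taken from the dataset.
example = {
    "id": "example-id",
    "title": "Example Song",
    "metadata_duration": 123.4,
    "persona": None,
    "metadata_history": '[{"id": "clip-a", "continue_at": 30.0}]',
}
decoded_example = decode_json_fields(example)
```

Assuming the Hugging Face `datasets` library is installed and the repository is reachable, records could be obtained with `datasets.load_dataset("TTxdfcgv7/suno", split="train")` and passed through the helper one at a time, for instance via `Dataset.map`. The sketch returns a copy instead of mutating its input so that the original string values remain available for comparison.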
Provider:
TTxdfcgv7