Udio-24MX1
Source: ModelScope (魔搭社区). Updated 2025-12-05; indexed 2025-08-09.
Download link:
https://modelscope.cn/datasets/sleeping-ai/Udio-24MX1
Description:
We are releasing 23.3M (million) AI-generated songs from Udio, one of the largest commercial text-to-music and Lyric-to-Song (LTS) providers. To our knowledge, it is the largest collection of its kind released to date, and it includes detailed metadata.
We release this dataset as part of our ongoing effort at Sleeping AI and an upcoming paper.
### What are the data fields?
- **id**: Unique UUID assigned to the metadata entry.
- **user_id**: UUID of the user who generated/uploaded the song.
- **artist**: Name of the artist.
- **artist_image**: URL to the artist’s profile image.
- **title**: Title of the song.
- **created_at**: Timestamp when the song was created (UTC).
- **error_id**: Nullable field for error tracking (null if no error).
- **error_type**: Nullable field indicating the error type.
- **error_code**: Nullable numeric or string code for the error.
- **generation_id**: UUID used internally for tracking generation.
- **image_path**: URL of the song's cover/image.
- **lyrics**: Lyrics of the song.
- **prompt**: Original text prompt used to generate the song.
- **likes**: Number of likes the song received.
- **plays**: Total number of times the song has been played.
- **published_at**: Timestamp when the song was published (UTC).
- **replaced_tags**: Tags that were replaced during processing (null in this dataset).
- **song_path**: URL to the downloadable `.mp3` audio file.
- **tags**: List of tags associated with the song (genres, mood, etc.).
- **duration**: Duration of the song in seconds.
- **video_path**: URL to the song’s accompanying video (e.g. `.mp4`).
- **error_detail**: Optional field describing any errors.
- **finished**: Boolean indicating whether the generation process is complete.
- **liked**: Boolean indicating whether the user liked the song.
- **disliked**: Boolean indicating whether the user disliked the song.
- **publishable**: Boolean indicating whether the song is publicly shareable.
- **audio_conditioning_type**: Null or string describing special conditioning.
- **attribution**: Text describing the source/attribution (if needed).
- **description**: Optional custom description of the song.
- **user_tags**: List of custom tags added by the user.
- **original_song_path**: Nullable original song URL (if any).
- **style_source_song_id**: Nullable ID for style transfer reference.
- **style_source_type**: Type of style source used (if any).
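
As an illustration of how these fields might be used together, the sketch below filters records down to finished, error-free songs that have an audio URL, then counts their tags. It assumes each record has already been loaded as a Python `dict` keyed by the field names above and that timestamps are ISO 8601 strings; the card does not state the on-disk format, so both are assumptions. The helper names `parse_utc`, `is_usable`, and `tag_counts` are hypothetical and not part of the dataset or any library.

```python
from __future__ import annotations

from collections import Counter
from datetime import datetime
from typing import Any, Iterable, Optional


def parse_utc(value: Optional[str]) -> Optional[datetime]:
    """Parse a timestamp such as created_at (ISO 8601 is an assumption)."""
    if not value:
        return None
    # Older Python versions reject a trailing "Z", so normalize it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_usable(record: dict[str, Any]) -> bool:
    """True if generation finished, no error type is set, and audio exists."""
    return (
        record.get("finished") is True
        and record.get("error_type") is None
        and bool(record.get("song_path"))
    )


def tag_counts(records: Iterable[dict[str, Any]]) -> Counter:
    """Count tags across usable records; a null tags field counts as empty."""
    counts: Counter = Counter()
    for record in records:
        if is_usable(record):
            counts.update(record.get("tags") or [])
    return counts
```

Treating `error_type` as the error signal is a design choice for this sketch; depending on how the error fields are populated, checking `error_id` or `error_code` instead may be more appropriate.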
### Ethical note
We are releasing this dataset under cross-border and cross-continent laws, and we commit to the EU act that allows collection of such metadata from public sources for scientific research purposes, as well as to the laws of the place where the server and IP are located.
We prohibit any commercial use, the creation of derivatives, and the sharing of this dataset. Anyone who violates these terms will be liable under the applicable legal framework for copyright violation. We provide this dataset strictly under CC BY-NC-ND 4.0.
Provider:
maas
Created:
2025-08-04



