
freesound-laion-640k-commercial-16khz-full

Source: ModelScope community. Updated 2025-12-05; indexed 2025-03-22.
Download link:
https://modelscope.cn/datasets/benjamin-paine/freesound-laion-640k-commercial-16khz-full
Description:
# About this Repository

This repository is the training split of [the complete FreeSound LAION 640k dataset](https://huggingface.co/datasets/benjamin-paine/freesound-laion-640k), limited to licenses that permit commercial works and resampled to 16 kHz using `torchaudio.transforms.Resample`.

This is ideal for use cases where a variety of audio is desired but fidelity and labels are unnecessary, such as background audio for augmenting other datasets.

## Dataset Versions

- *You are looking at the* **full** dataset, which contains **403,146** unique sounds totaling **37.5 GB**.
- The [large](https://huggingface.co/datasets/benjamin-paine/freesound-laion-640k-commercial-16khz-large) dataset contains **200,000** unique sounds totaling **18.7 GB**.
- The [medium](https://huggingface.co/datasets/benjamin-paine/freesound-laion-640k-commercial-16khz-medium) dataset contains **100,000** unique sounds totaling **9.29 GB**.
- The [small](https://huggingface.co/datasets/benjamin-paine/freesound-laion-640k-commercial-16khz-small) dataset contains **50,000** unique sounds totaling **4.64 GB**.
- The [tiny](https://huggingface.co/datasets/benjamin-paine/freesound-laion-640k-commercial-16khz-tiny) dataset contains **20,000** unique sounds totaling **1.84 GB**.

## Sampling Method

To generate the smaller datasets, the following method was applied:

- Using the complete dataset's tag metadata, generate a feature vector for each sample.
- Cluster the feature vectors using k-means.
- Sample from each cluster in a round-robin fashion until the desired dataset size is reached.

## What about download links?

Links were omitted for the sake of size, as they can be constructed from the data already present. To reconstruct a link, use the following format:

`https://freesound.org/people/{username}/sound/{id}`

# About this Dataset

> LAION-Audio-630K is a large-scale audio-text dataset consisting of 633,526 pairs with the total duration of 4,325.39 hours. It contains audios of human activities, natural sounds and audio effects, consisting of 8 data sources (see the data source table below) from publicly available websites. We collect these datasets by downloading audios and relevant text descriptions. Based on our current knowledge, LAION-Audio-630K is the largest audio-text dataset publicly available and a magnitude larger than previous audio-text datasets (by 2022-11-05).
>
> [LAION-AI, github.com](https://github.com/LAION-AI/audio-dataset/blob/main/laion-audio-630k/)

## Acknowledgment

The whole collection process, as well as all usage of LAION-Audio-630K, is conducted by LAION, a German non-profit organization devoted purely to research. All contributors and collectors of the dataset are considered open source contributors affiliated with LAION. These community contributors (Discord ids) include, but are not limited to: @marianna13#7139, @Chr0my#0173, @PiEquals4#1909, @Yuchen Hui#8574, @Antoniooooo#4758, @IYWO#9072, krishna#1648, @dicknascarsixtynine#3885, and @turian#1607. We would like to thank all of them for their efforts on the LAION-Audio-630k dataset.

## License

- LAION dataset metadata is released under [The MIT License](https://mit-license.org/).
- Audio is released under one of four licenses:

| License | URL |
| ------- | --- |
| CC0-1.0 | https://creativecommons.org/publicdomain/zero/1.0/ |
| CC-BY 4.0 | https://creativecommons.org/licenses/by/4.0/ |
| CC-BY 3.0 | https://creativecommons.org/licenses/by/3.0/ |
| CC-Sampling+ | https://creativecommons.org/licenses/sampling+/1.0/ |

**Please read the entirety of these licenses before deciding if you can use the audio for your project.**
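
The link format given under "What about download links?" can be filled in with a few lines of Python. This is a minimal sketch: the helper name `freesound_link` is hypothetical, and it assumes each record exposes a username and a numeric sound id (the actual column names in the dataset may differ). Percent-encoding the username is also an assumption, added in case a name contains characters that are not URL-safe.

```python
from urllib.parse import quote


def freesound_link(username: str, sound_id: int) -> str:
    """Build a FreeSound page URL from a username and a sound id.

    Follows the template https://freesound.org/people/{username}/sound/{id}.
    """
    # Assumption: usernames may contain spaces or other unsafe characters,
    # so they are percent-encoded before being placed in the path.
    safe_username = quote(username, safe="")
    return f"https://freesound.org/people/{safe_username}/sound/{sound_id}"


if __name__ == "__main__":
    # Placeholder values for illustration, not a real record from the dataset.
    print(freesound_link("some_user", 12345))
```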
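
The steps listed under "Sampling Method" can be illustrated roughly as follows. This is a sketch under stated assumptions, not the author's actual pipeline: the repository does not say how tags become feature vectors, so a multi-hot encoding is assumed here; the k-means step is assumed to have been run separately, with its output passed in as one cluster label per sample; and samples within a cluster are taken in their original order, which the source does not specify. The function names `multi_hot` and `round_robin_sample` are hypothetical.

```python
from collections import defaultdict, deque


def multi_hot(tags, vocabulary):
    """Encode a sample's tags as a 0/1 vector over a fixed tag vocabulary.

    Assumption: the source only says a feature vector is built from tag
    metadata; multi-hot encoding is one simple way to do that.
    """
    tag_set = set(tags)
    return [1 if tag in tag_set else 0 for tag in vocabulary]


def round_robin_sample(cluster_labels, target_size):
    """Select sample indices by cycling over clusters, one pick per turn.

    cluster_labels[i] is the cluster assigned to sample i (for example by
    k-means, run separately). Clusters that run out are skipped.
    """
    clusters = defaultdict(deque)
    for index, label in enumerate(cluster_labels):
        clusters[label].append(index)

    # Visit clusters in a fixed (sorted) order so the result is deterministic.
    queues = [clusters[label] for label in sorted(clusters)]
    selected = []
    while queues and len(selected) < target_size:
        for queue in queues:
            if len(selected) == target_size:
                break
            selected.append(queue.popleft())
        # Drop clusters that have no samples left.
        queues = [queue for queue in queues if queue]
    return selected


if __name__ == "__main__":
    vocabulary = ["bird", "rain", "wind"]
    print(multi_hot(["rain", "wind"], vocabulary))
    # Six samples in three clusters of sizes 3, 2 and 1.
    print(round_robin_sample([0, 0, 0, 1, 1, 2], target_size=4))
```

Cycling over clusters rather than drawing uniformly keeps small clusters represented in the subset, which fits the stated goal of preserving variety in the reduced datasets.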

Provider: maas
Created: 2025-03-18