freesound-laion-640k-commercial-16khz-large
ModelScope community · updated 2025-11-12 · indexed 2025-03-22
Download link:
https://modelscope.cn/datasets/benjamin-paine/freesound-laion-640k-commercial-16khz-large
Description:
# About this Repository
This repository is the training split of [the complete FreeSound LAION 640k dataset](https://huggingface.co/datasets/benjamin-paine/freesound-laion-640k), restricted to audio whose licenses permit commercial use and resampled to 16 kHz using `torchaudio.transforms.Resample`.
This is ideal for use cases where a variety of audio is desired but fidelity and labels are unnecessary, such as background audio for augmenting other datasets.
## Dataset Versions
- The [full](https://huggingface.co/datasets/benjamin-paine/freesound-laion-640k-commercial-16khz-full) dataset contains **403,146** unique sounds totaling **37.5 GB**.
- *You are looking at* the **large** dataset which contains **200,000** unique sounds totaling **18.7 GB**.
- The [medium](https://huggingface.co/datasets/benjamin-paine/freesound-laion-640k-commercial-16khz-medium) dataset contains **100,000** unique sounds totaling **9.29 GB**.
- The [small](https://huggingface.co/datasets/benjamin-paine/freesound-laion-640k-commercial-16khz-small) dataset contains **50,000** unique sounds totaling **4.64 GB**.
- The [tiny](https://huggingface.co/datasets/benjamin-paine/freesound-laion-640k-commercial-16khz-tiny) dataset contains **20,000** unique sounds totaling **1.84 GB**.
## Sampling Method
To generate the smaller datasets, the following method was applied:
- Using the complete dataset's tag metadata, generate a feature vector for each sample.
- Cluster the feature vectors using k-means.
- Sample from each cluster in a round-robin fashion until the desired dataset size is reached.
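The steps above can be sketched roughly as below. The card does not say how the feature vectors are built, how many clusters are used, or how a sample is picked within a cluster, so the multi-hot tag encoding, the random pick within each cluster, and the function names are all assumptions. The k-means step itself is left to any standard implementation; its cluster labels are passed in directly.

```python
import random
from collections import defaultdict


def tag_vectors(samples, vocabulary):
    """Multi-hot encode each sample's tags against a fixed vocabulary (assumed encoding)."""
    index = {tag: i for i, tag in enumerate(vocabulary)}
    vectors = []
    for tags in samples:
        vec = [0.0] * len(vocabulary)
        for tag in tags:
            if tag in index:
                vec[index[tag]] = 1.0
        vectors.append(vec)
    return vectors


def round_robin_sample(labels, target_size, seed=0):
    """Visit clusters in turn, taking one sample per visit, until target_size is reached.

    `labels[i]` is the cluster id assigned to sample i by a k-means run.
    Returns the chosen sample indices in the order they were picked.
    """
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)

    # Shuffle inside each cluster so the pick within a cluster is random.
    queues = []
    for label in sorted(clusters):
        members = clusters[label]
        rng.shuffle(members)
        queues.append(members)

    chosen = []
    while len(chosen) < target_size and any(queues):
        for queue in queues:
            if queue and len(chosen) < target_size:
                chosen.append(queue.pop())
    return chosen


# Toy example: seven samples already assigned to three clusters.
labels = [0, 0, 0, 0, 1, 1, 2]
subset = round_robin_sample(labels, target_size=5)
```

Because every cluster contributes one sample per pass, small clusters are exhausted before large ones dominate, which keeps rare tag groups represented in the smaller subsets.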
## What about download links?
Links were omitted for the sake of size, as they can be constructed from the data already present. To reconstruct a link, use the following format:
`https://freesound.org/people/{username}/sound/{id}`
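As a small illustration of the format above, the helper below fills in the two placeholders. It assumes each record exposes `username` and `id` values under those names; the function name and the percent-encoding of the username are choices made for this sketch rather than anything the dataset specifies.

```python
from urllib.parse import quote


def freesound_url(username: str, sound_id: int) -> str:
    """Build a FreeSound page URL from a record's username and id."""
    # Percent-encode the username in case it contains spaces or other
    # characters that are not safe in a URL path segment (an assumption).
    return f"https://freesound.org/people/{quote(username, safe='')}/sound/{sound_id}"


# Hypothetical record values, used only to show the call.
url = freesound_url("example_user", 12345)
```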
# About this Dataset
> LAION-Audio-630K is a large-scale audio-text dataset consisting of 633,526 pairs with the total duration of 4,325.39 hours. It contains audios of human activities, natural sounds and audio effects, consisting of 8 data sources (see the data source table below) from publicly available websites. We collect these datasets by downloading audios and relevant text descriptions. Based on our current knowledge, LAION-Audio-630K is the largest audio-text dataset publicly available and a magnitude larger than previous audio-text datasets (by 2022-11-05).
>
> [LAION-AI, github.com](https://github.com/LAION-AI/audio-dataset/blob/main/laion-audio-630k/)
## Acknowledgment
The whole collection process, as well as all usage of LAION-Audio-630K, is conducted by LAION, a German non-profit organization dedicated purely to research. All contributors and collectors of the dataset are considered open source contributors affiliated with LAION. These community contributors (Discord IDs) include but are not limited to: @marianna13#7139, @Chr0my#0173, @PiEquals4#1909, @Yuchen Hui#8574, @Antoniooooo#4758, @IYWO#9072, krishna#1648, @dicknascarsixtynine#3885, and @turian#1607. We would like to thank all of them for their efforts on the LAION-Audio-630K dataset.
## License
- LAION dataset metadata is released under [the MIT License](https://mit-license.org/).
- Audio is released under one of four licenses:
| License | URL |
| ------- | --- |
| CC0-1.0 | https://creativecommons.org/publicdomain/zero/1.0/ |
| CC-BY 4.0 | https://creativecommons.org/licenses/by/4.0/ |
| CC-BY 3.0 | https://creativecommons.org/licenses/by/3.0/ |
| CC-Sampling+ | https://creativecommons.org/licenses/sampling+/1.0/ |
**Please read the entirety of these licenses before deciding if you can use the audio for your project.**
Provider: maas
Created: 2025-03-18