
midah/model-dataset-licenses

Source: Hugging Face · Updated 2026-03-01 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/midah/model-dataset-licenses
Resource description:
# HF Datasets License Catalog

Catalog of ~885k Hugging Face datasets with their licenses, built from a single pass over `list_datasets()` (license from tags). No per-dataset card fetches.

**Related:** [midah/hf-dataset-licenses](https://huggingface.co/datasets/midah/hf-dataset-licenses) (SPDX 700+ licenses) · [modelbiome/ai_ecosystem](https://huggingface.co/datasets/modelbiome/ai_ecosystem) (1.86M models; figures 05–07 use model↔dataset links)

## Dataset

- **Repo**: [midah/hf-datasets-licenses](https://huggingface.co/datasets/midah/hf-datasets-licenses)

## Schema

| Field | Type | Description |
|-------|------|-------------|
| dataset_id | string | HF dataset ID (e.g. `org/name`) |
| license | string \| null | License from tags (`license:xxx`), null if absent |
| author | string | Dataset author |
| downloads | int | Total downloads |
| likes | int | Likes |
| private | bool | Private repo |
| gated | bool | Gated access |
| last_modified | string | ISO timestamp |

## Usage

```python
from datasets import load_dataset

ds = load_dataset("midah/hf-datasets-licenses")

# Filter by license
mit = ds["train"].filter(lambda x: x["license"] == "mit")
```

## Figures

![Top licenses](01_datasets_top_licenses.png)
![License coverage](02_datasets_license_coverage.png)
![Licenses by downloads](03_datasets_licenses_by_downloads.png)
![Model-style licenses](04_datasets_model_licenses.png)
![Models top licenses](05_models_top_licenses.png)
![Top model→dataset pairs when different](06_model_dataset_pairs_when_different.png)
![Dataset mix cited by model license](06b_model_cites_dataset_mix.png)
![License consistency](07_model_dataset_license_consistency.png)

Model ↔ dataset relationships from [modelbiome/ai_ecosystem](https://huggingface.co/datasets/modelbiome/ai_ecosystem): top cross-license pairs, conditional dataset mix per model license, and consistency.

## Source

Built by `scripts/build_hf_datasets_licenses.py` from the Hugging Face Hub API. License from tags only; no per-dataset README parsing.

## License

CC0-1.0.
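
## Example: summarizing licenses

Because `license` is null whenever a dataset carries no `license:xxx` tag, most summaries need to handle missing values explicitly. The following is a minimal sketch of how coverage, top licenses, and downloads per license might be computed from rows that follow the schema above. The function names and the sample rows are hypothetical and exist only for illustration; they are not part of the catalog or of `scripts/build_hf_datasets_licenses.py`.

```python
from collections import Counter

# Hypothetical rows that follow the catalog schema; NOT real catalog data.
SAMPLE_ROWS = [
    {"dataset_id": "org-a/alpha", "license": "mit", "downloads": 120},
    {"dataset_id": "org-b/beta", "license": None, "downloads": 40},
    {"dataset_id": "org-c/gamma", "license": "apache-2.0", "downloads": 300},
    {"dataset_id": "org-d/delta", "license": "mit", "downloads": 15},
]


def license_coverage(rows):
    """Return the fraction of rows whose license field is not null."""
    rows = list(rows)
    if not rows:
        return 0.0
    licensed = sum(1 for row in rows if row["license"] is not None)
    return licensed / len(rows)


def top_licenses(rows, n=3):
    """Return the n most common non-null licenses as (license, count) pairs."""
    counts = Counter(
        row["license"] for row in rows if row["license"] is not None
    )
    return counts.most_common(n)


def downloads_by_license(rows):
    """Sum downloads per license; null licenses are grouped as 'unlicensed'."""
    totals = Counter()
    for row in rows:
        totals[row["license"] or "unlicensed"] += row["downloads"]
    return dict(totals)


if __name__ == "__main__":
    print(license_coverage(SAMPLE_ROWS))
    print(top_licenses(SAMPLE_ROWS))
    print(downloads_by_license(SAMPLE_ROWS))
```

These helpers accept any iterable of dict-like rows, so passing `ds["train"]` from the Usage snippet should work as well, assuming the loaded split exposes the same field names as the schema.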
Provider: midah