
cometadata/2025-08-datacite-normalized-affiliation-string-distribution

Source: Hugging Face. Updated 2025-11-10. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/cometadata/2025-08-datacite-normalized-affiliation-string-distribution
Resource description:
---
language:
- en
license: cc0-1.0
tags:
- datacite
- affiliations
- normalization
pretty_name: DataCite Normalized Affiliation Distribution
task_categories:
- data-analysis
---

# DataCite Normalized Affiliation Distribution

## Summary

`normalized_distribution.json` contains one JSON object per normalized affiliation string. Each object aggregates the total occurrence count, a ranked list of the raw affiliation strings that collapse into the normalized form, and the provider/client entities that asserted them. This dataset is derived from the August 2025 DataCite creator/contributor export.

## Structure

```json
{
  "normalized": "example university",
  "total_count": 314,
  "affiliations": [
    {"affiliation": "Example University", "occurrences": 300},
    {"affiliation": "Example Univ.", "occurrences": 14}
  ],
  "providers": {"unique_total": 5, "counts": {"tib.example": 200, "cdr.sample": 114}},
  "clients": {"unique_total": 4, "counts": {"example.client": 314}}
}
```

### Fields

- `normalized` *(string)*: normalized ASCII/whitespace-stripped token.
- `total_count` *(int)*: total occurrences across the snapshot.
- `affiliations` *(array)*: ranked list of raw affiliation strings and their counts.
- `providers` / `clients` *(object)*: `unique_total` plus per-ID counts.
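
As a rough illustration of how one record might be consumed, the sketch below parses the example object from the Structure section and reports the most frequent raw string, its share of `total_count`, and the provider with the highest count. The helper name `summarize` and the keys of its return value are hypothetical and not part of the dataset. The on-disk layout of `normalized_distribution.json` (single array versus one object per line) is not stated in the card, so the sketch works on a single already-isolated record.

```python
import json

# Example record copied from the Structure section of the dataset card.
EXAMPLE = """
{
  "normalized": "example university",
  "total_count": 314,
  "affiliations": [
    {"affiliation": "Example University", "occurrences": 300},
    {"affiliation": "Example Univ.", "occurrences": 14}
  ],
  "providers": {"unique_total": 5, "counts": {"tib.example": 200, "cdr.sample": 114}},
  "clients": {"unique_total": 4, "counts": {"example.client": 314}}
}
"""


def summarize(record):
    """Hypothetical helper: condense one record into a few headline values."""
    # Sort explicitly instead of relying on the stored ranking order.
    ranked = sorted(
        record["affiliations"], key=lambda a: a["occurrences"], reverse=True
    )
    top = ranked[0]
    provider_counts = record["providers"]["counts"]
    return {
        "normalized": record["normalized"],
        "top_affiliation": top["affiliation"],
        # Fraction of all occurrences covered by the most common raw string.
        "top_share": top["occurrences"] / record["total_count"],
        "variant_count": len(ranked),
        # Provider ID with the largest per-ID count among those listed.
        "top_provider": max(provider_counts, key=provider_counts.get),
    }


summary = summarize(json.loads(EXAMPLE))
print(summary)
```

For the full file, the same helper could be applied to each object once the records have been loaded. Note that `unique_total` may exceed the number of IDs listed under `counts`, as it does in the example record, so the sketch does not assume the two agree.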
Provider:
cometadata