
Global Artificial Intelligence Indicator Database (GAID), 1998–2025 (Version 2)

Source: DataONE. Updated: 2026-01-13. Indexed: 2026-01-24.
Download link:
https://search.dataone.org/view/sha256:c3f74a67d007f9aca4d3d43e157c8e1078bb0f814518d61adb0bb89bb1bbaa44
Description:
Overview: The Global Artificial Intelligence Indicator Database (GAID) Version 2.0 is a significant expansion of the longitudinal panel dataset and provides a comprehensive, harmonized overview of the global AI landscape. Spanning 1998 to 2025, GAID Version 2.0 integrates, standardizes, and cleans high-fidelity indicators from eight additional AI monitoring authorities, including Epoch AI, the UNESCO Global AI Ethics Observatory, MacroPolo, IEA, WIPO, and the World Bank.

Surgical Data Quality & Integrity: Unlike raw index exports, GAID Version 2.0 has undergone a multi-stage "surgical cleaning" and metadata healing pipeline to ensure 100% data integrity. Key technical enhancements include:

Harmonized Longitudinal Structure: Multi-source data consolidated into a "Long Format" (Tidy Data), optimized for R, Stata, Python, and SPSS.

Universal Geographic Standardization: 259,546 observations across 227 countries and territories (expanded from 214 in Version 1.0), mapped to standardized ISO 3166-1 alpha-3 (ISO3) codes.

Advanced Metadata Healing: 100% completeness across the metadata fields (Source_File, Source_Type, Source_Year), ensuring full replicability.

Unit Harmonization: Economic indicators formatted in legacy-aligned units (e.g., USD Billions, USD Thousands), and bounded indicators held to their theoretical ranges ([0, 1] for Ratios; [0, 100] for Scores).

Expanded Dataset Scope (Version 2.0):

Temporal Range: 1998–2025
Geographic Scope: 227 countries/territories
Indicator Density: 24,453 unique metrics
Observation Count: 259,546 verified rows

New Thematic Domains Include:

Technical Trends & Benchmarks (Epoch AI): State-of-the-art AI model performance across 39 benchmarks (MMLU, GSM8K, etc.), total training compute (FLOPs), and model parameter counts.

AI Infrastructure & Energy (IEA & Epoch): National AI cluster power capacity (MW), data center hub capacity (Operating vs. Planned), and compute stock (H100 equivalents).

Global AI Talent (MacroPolo): Top-tier researcher flows, tracking undergraduate origin vs. graduate study and current work locations.

Real-world Usage Polling (Epoch AI): Granular survey data on AI service adoption (ChatGPT, Claude, etc.), use-case frequency, and workplace tool provision.

Ethics & Governance (UNESCO & World Bank): AI readiness assessment scores, GovTech maturity indices, and digital citizen engagement frameworks.

Innovation (WIPO): National AI-related patent publication intensity.

Technical Usage Note: Researchers must consult the accompanying codebook (w1_v2_CODEBOOK_MASTER_AI_DATA.pdf) for the Categorical Metric Dictionary. The documentation follows a domain-based table structure that gives precise data ranges (Theoretical vs. Unbounded), units of measure, and original definitions for all 24,000+ metrics.

Compilation Pipeline: This dataset was produced via a sequential four-stage pipeline: (1) master_compiler_FINAL.py, (2) master_compiler_v2.py, (3) fix_micronesia_country_names.py, and (4) heal_source_file_metadata.py.
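
The structural guarantees described above (long format, three-letter ISO3 codes, the 1998 to 2025 window, complete metadata fields, and bounded ranges for Ratios and Scores) can be spot-checked after loading the data. The following is only a minimal sketch with pandas, not part of the published pipeline. Only Source_File, Source_Type, and Source_Year are column names taken from the description; ISO3, Year, Metric, Unit, and Value are hypothetical names, and the toy rows are invented, so the codebook should be consulted for the real schema.

```python
import pandas as pd

# Toy rows standing in for the real file. Only Source_File, Source_Type and
# Source_Year are named in the description; ISO3, Year, Metric, Unit and Value
# are hypothetical column names, and every value below is invented.
rows = [
    ("USA", 2023, "ai_readiness_score", "Score", 81.0, "a.csv", "index", 2024),
    ("USA", 2023, "ai_patent_share", "Ratio", 0.31, "b.csv", "report", 2024),
    ("FSM", 2023, "ai_readiness_score", "Score", 22.5, "a.csv", "index", 2024),
    # Deliberately out of range, to show that the check flags it.
    ("FSM", 2023, "ai_patent_share", "Ratio", 1.70, "b.csv", "report", 2024),
]
columns = ["ISO3", "Year", "Metric", "Unit", "Value",
           "Source_File", "Source_Type", "Source_Year"]
gaid = pd.DataFrame(rows, columns=columns)

# Theoretical ranges stated in the description.
BOUNDS = {"Ratio": (0.0, 1.0), "Score": (0.0, 100.0)}
METADATA = ["Source_File", "Source_Type", "Source_Year"]


def check_long_format(df):
    """Return boolean Series flagging rows that break each stated invariant."""
    bad_iso3 = ~df["ISO3"].astype(str).str.fullmatch(r"[A-Z]{3}")
    bad_year = ~df["Year"].between(1998, 2025)
    missing_metadata = df[METADATA].isna().any(axis=1)

    # Only units with a theoretical range are checked; unbounded units pass.
    out_of_range = pd.Series(False, index=df.index)
    for unit, (low, high) in BOUNDS.items():
        is_unit = df["Unit"] == unit
        out_of_range |= is_unit & ~df["Value"].between(low, high)

    return {
        "bad_iso3": bad_iso3,
        "bad_year": bad_year,
        "missing_metadata": missing_metadata,
        "out_of_range": out_of_range,
    }


problems = check_long_format(gaid)
summary = {name: int(flag.sum()) for name, flag in problems.items()}
print(summary)

# Long to wide: one row per country-year, one column per metric.
wide = gaid.pivot_table(index=["ISO3", "Year"], columns="Metric",
                        values="Value", aggfunc="first")
print(wide)
```

Keeping the checks as boolean masks rather than assertions makes it possible to inspect the offending rows (for example `gaid[problems["out_of_range"]]`) instead of stopping at the first failure. The final pivot is one way to obtain a country-year panel from the long format for tools that expect wide data.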
Created:
2026-01-18