Vulnerabilidad de Empleos a la Inteligencia Artificial en España: Dataset, Metodología y Dashboard Interactivo (v17)

Name: Vulnerabilidad de Empleos a la Inteligencia Artificial en España: Dataset, Metodología y Dashboard Interactivo (v17)
Creator: Zenodo
Published: 2026-03-21 03:42:36
License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Zenodo. Updated 2026-03-21; indexed 2026-05-26.

Download link:

https://zenodo.org/doi/10.5281/zenodo.19142170

Description:

AI Exposure of Jobs in Spain: Complete Dataset, Methodology & Interactive Dashboard

This deposit contains the complete dataset, methodology, and interactive visualisation tool for assessing the theoretical exposure of 502 Spanish occupations to artificial intelligence. The analysis covers 22.46 million workers (EPA Q4 2025, INE) and assigns each occupation a calibrated exposure score on a 0–10 scale, cross-referenced with salary data, EU AI Act risk classification, and impact typology.

The interactive dashboard is available at: https://alvarodenicolas.com/interactive/empleos-ia/index.html

Dataset

The core dataset (spain_502_v6_sectorfixed.json) contains 502 records corresponding to the complete CNO-11 occupational taxonomy (SEPE expansion). Each record includes 11 fields:

Field | Type | Description
cno | string | 4-digit CNO-11 occupation code
nombre | string | Official occupation name (Spanish)
sector | string | Assigned economic sector (12 categories)
empleo | integer | Estimated employment (EPA Q4 2025, redistributed via Census 2021 weights)
salario_medio_eur | float | Estimated mean gross annual salary (EUR), based on INE EES 2023 + educational premia + FR/PT proxies (128 unique values; MAPE 4.96% vs 16 INE reference groups)
vulnerabilidad_ia_score | float | AI exposure score (0–10), calibrated with 5 Spain-specific structural factors
eu_ai_act | string | EU AI Act risk classification: "Alto riesgo" (Annex III), "Riesgo limitado", or "Riesgo mínimo"
tipo_impacto | string | Impact typology: "Sustitución" (29 occupations), "Híbrido" (184), or "Aumentación" (289)
justificacion | string | 3–4 sentence justification in Spanish explaining the automation vector and human-protective factors
census_2021_employed | float | Census 2021 employment figure used for intra-group weighting
employment_method | string | Employment estimation method identifier

Sector taxonomy (12 categories): Administración, Agricultura, Artesanía y manufactura, Construcción, Dirección, Elementales, Industria, Industria alimentaria, Militar, Profesionales, Servicios, Técnicos apoyo.

Note: v6 reclassifies 12 occupations (CNO 76xx) from "Construcción" to "Artesanía y manufactura" and separates food-industry occupations (CNO 77x) into "Industria alimentaria", correcting sector assignments from earlier versions.

Key Findings

Indicator | Value | Note
Occupations analysed | 502 | Complete CNO-11 (SEPE taxonomy)
Workers represented | 22.46 million | EPA Q4 2025 (final data)
Weighted mean exposure | 4.0 / 10 | Calibrated for Spain (5 factors); unweighted mean: 4.3
High-exposure occupations (score ≥7) | 122 occupations | 4,690,622 jobs (20.9% of total employment)
Wage-exposure index | 278,200M EUR | Employment × Salary × Score/10; a weighted index, not a prediction of wage losses
Score range | 1.0 – 9.0 | 1.0: hairdressers, cleaners, firefighters; 9.0: data-entry clerks
Salary range | 12,985 – 79,282 EUR/year | 128 unique estimated values
Inter-model validation (r) | 0.715 | 100 occupations, Gemini 2.5 Pro vs GPT-4o blind
Salary validation (MAPE) | 4.96% | 16 INE EES 2023 groups, post-correction
Employment validation (1-digit) | ±0.00% | EPA Q4 2025 exact (API Tempus table 65134)

Methodology

Exposure scores were generated by Gemini 2.5 Pro (temperature 0.2, structured rubric prompts) following the methodological lineage of Brynjolfsson et al. (2018) and Eloundou et al. (2023), adapted to the Spanish CNO-11 taxonomy with structural calibration.

Five Spain-specific calibration factors are applied:

1. DESI digitalisation index: DESI 2023, 69.8 points (3rd in EU). Note: the European Commission no longer publishes the composite DESI score after 2023. Spain ranks 11th in enterprise digital technology integration; only 8–21% of Spanish firms use AI depending on source (INE TIC 2025 ~21%, Banco de España 2025 ~20%). Sector moderation factor: 0.80 (agriculture) to 0.95 (technology/banking).
2. Services sector weight: 74% of GDP (vs 68% EU average); tourism 12.4% of GDP.
3. Employment protection: 3rd strictest in OECD. Unfair dismissal severance: 33 days/year (max 24 monthly payments). Friction factor: 1–5% by sector.
4. EU AI Act: Regulation (EU) 2024/1689 classifies AI systems by use-case context, not occupations. Annex III high-risk contexts map to 71 occupations (5.13M workers). Moderation factor: 2–8% for high-risk categories.
5. AESIA supervision: Spain is the first EU country with an operational national AI supervisory agency (A Coruña, Real Decreto 729/2023). Fines up to 35M EUR or 7% of global turnover.

The combined calibration produces an 11–12% reduction from base LLM scores. Formula:

score_adjusted = score_base × factor_DESI × (1 - regulatory_friction) × (1 - labour_friction)

Important: This analysis measures the technical capability of AI to perform occupational tasks, not the economic viability of substitution (see note [36]). Acemoglu (2024) estimates only ~4.6% of tasks are currently automatable with positive ROI. In Spain, where median wages (~28,000 EUR) are below the US market where most AI tools are priced, the economic viability threshold for substitution is likely higher.

Employment data: EPA Q4 2025 microdata (INE, published 27/01/2026), 22,463,286 total employed. EPA publishes employment at 1-digit CNO level only; 4-digit figures are proportional estimates redistributed using Census 2021 structural weights (145 subgroups at 3-digit CNO). 4-digit employment figures are estimates, not observed data.

Salary data: Encuesta de Estructura Salarial 2023 (INE, table 28186) at 2-digit CNO level, adjusted with educational premia (INE) and intra-group variance proxies from France (INSEE) and Portugal (INE-PT). Post-correction MAPE: 4.96% against 16 INE reference groups. Self-employed workers (~3.3M) are excluded from the salary survey by design.

Occupation-level scoring (deliberate design choice): Scores are assigned at the occupation level rather than at the individual task level. The CNO-11 lacks a task inventory equivalent to O*NET. Scoring at occupation level sacrifices intra-occupational granularity but avoids the cumulative noise of an O*NET → ISCO-08 → CNO-11 crosswalk across 19,000 tasks and two intermediate classification systems. The impact typology (Sustitución / Híbrido / Aumentación) provides the directional signal that a single score does not capture.

Validation Stack

Employment validation (1-digit)

EPA Q4 2025 totals reproduced with ±0.00% deviation (API Tempus, table 65134). Maximum difference: 47 persons over 22.46M.

AI score validation (inter-model)

100 stratified occupations blind-rescored by GPT-4o without access to original Gemini scores. Results:

Metric | Value | Interpretation
Pearson correlation | r = 0.715 | Good (inter-rater reliability)
Intraclass correlation | ICC(2,1) = 0.701 | Good; consistent agreement between models
Weighted kappa | κw = 0.667 | Substantial agreement (Landis-Koch convention)
Systematic bias (GPT − Gemini) | +0.28 points | GPT scores slightly higher
Mean absolute deviation | 1.0 point | Excellent on a 0–10 scale
Agreement within ±1.0 points | 61% | Narrow agreement
Agreement within ±2.0 points | 84% | Broad agreement
Bland-Altman 95% limits | −2.77 to +3.33 | No proportional bias confirmed

Disagreement pattern: GPT compresses toward the centre, scoring manual/physical occupations higher (+1.07 in the 0–3 band) and knowledge occupations lower (−1.00 in the 7+ band). The 16 large disagreements (>2.0 points) cluster in two interpretable groups: (1) GPT overestimates industrial automation potential (CNO 81xx/82xx), (2) Gemini overestimates the digital component of niche professions (notaries, actors, oenologists).

Salary validation

MAPE 4.96% against 16 INE EES 2023 groups (post-correction). All deviations under 10%. Three targeted corrections applied: Group I (Protection & security, −36.4% → −4.4%), Group M (Fixed machinery operators, −12.0% → −1.4%), Group H (Health & care, +9.7% → +5.2%). Validated by Manus AI.
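The agreement statistics reported in the validation stack (Pearson r, systematic bias, Bland-Altman limits, share of pairs within a tolerance, and MAPE) can be reproduced from paired scores with a few lines of standard-library Python. The sketch below is illustrative only: the function names and the six score pairs are assumptions made for the example, not the deposit's code or data, and it does not cover ICC or weighted kappa.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def bland_altman(x, y):
    """Return (bias, lower, upper): mean of y - x and its 95% limits of agreement."""
    diffs = [b - a for a, b in zip(x, y)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


def share_within(x, y, tol):
    """Fraction of pairs whose absolute difference is at most tol."""
    return sum(abs(b - a) <= tol for a, b in zip(x, y)) / len(x)


def mape(estimated, reference):
    """Mean absolute percentage error of estimates against reference values."""
    return 100 * mean(abs(e - r) / r for e, r in zip(estimated, reference))


# Hypothetical scores for illustration; NOT values from the deposit.
gemini = [1.0, 2.5, 4.0, 5.5, 7.0, 8.5]
gpt = [2.0, 3.0, 4.0, 6.5, 6.5, 8.0]

r = pearson_r(gemini, gpt)
bias, lower, upper = bland_altman(gemini, gpt)
print(f"r={r:.3f} bias={bias:+.2f} limits=({lower:.2f}, {upper:.2f})")
print(f"within ±1.0: {share_within(gemini, gpt, 1.0):.0%}")
```

Computing the bias as GPT minus Gemini matches the sign convention of the "Systematic bias (GPT − Gemini)" row, so a positive value means GPT scored higher on average.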
Adversarial multi-model review

The methodology document was subjected to adversarial review by 7 independent AI models (Grok ×2, Perplexity ×2, Manus ×2, Gemini). Consensus corrections incorporated: relabelling of wage-exposure index, DESI 2024→2023 correction, EU AI Act reclassification (elimination of "prohibited" occupation-level category), sector corrections (v18: reclassification of 12 occupations CNO 76xx from "Construcción" to "Artesanía y manufactura"), and reconciliation of inconsistent figures.

Sensitivity Analysis

Scenario | Weighted mean | High exposure (≥7) | Wage-exposure index
Current calibration (base) | 4.0 / 10 | 20.9% | 278,200M EUR
All factors −20% | ~4.9 / 10 | Non-linear estimate | ~333,800M EUR
All factors +20% | ~3.2 / 10 | Non-linear estimate | ~222,600M EUR

Limitations

- Exposure scores are theoretical estimates, not predictions of job displacement. Empirical evidence (Anthropic Economic Index, February 2026) shows significant gaps between theoretical exposure (~94%) and observed adoption (~33%) in computer/mathematical occupations. Acemoglu (2024) estimates only ~4.6% of tasks are currently automatable with positive ROI; in Spain, lower wages relative to the US (where most AI tools are priced) likely raise the economic viability threshold further.
- 4-digit employment figures are proportional estimates, not observed data. Deviations at 2-digit level against EPA published totals range from ±0% to ±540% due to structural changes between Census 2021 and EPA 2025.
- Calibration factors are expert judgement without empirical back-testing. Sensitivity analysis (±20%) shows weighted mean exposure shifting from 3.3 to 4.9 and wage-exposure index from 219B to 329B EUR.
- France/Portugal salary proxies assume structural similarity among southern European economies. Not empirically validated at individual occupation level. MCVL (Muestra Continua de Vidas Laborales) is an identified but unused validation source.
- Scores were generated in a single pass per model. Estimated intra-model reproducibility: ±0.5 points.
- The analysis is static (March 2026 snapshot) and does not model job creation by AI, regional variation, or part-time/full-time distinctions.
- Occupation-level scoring sacrifices intra-occupational granularity. The CNO-11 lacks a task inventory equivalent to O*NET; the impact typology partially compensates.
- Available but unintegrated dimensions:
  (a) Regional distribution (CCAA): EPA microdata extracted for 19 autonomous communities, available for future NUTS-2 analysis; the OECD (2024) documents that GenAI will exacerbate existing regional disparities.
  (b) Gender decomposition: SEPE publishes a gender gap occupational catalogue; female concentration in administrative (high exposure) and care (low exposure) roles suggests unequal impacts.
  (c) Informal economy: estimated at 18–20% of GDP (CEPR/Banco de España), concentrated in low-exposure sectors; its exclusion produces a mild upward bias in the national exposure mean.
  (d) Part-time/full-time distinction: ~2.5M part-time workers in Spain; the wage-exposure index implicitly assumes full-time equivalence.

Interactive Dashboard

The dashboard (single-page React application) provides four views:

- Treemap: sector-level aggregation with drill-down to individual occupations; rectangle area proportional to employment, colour indicates exposure score.
- Detailed treemap: occupation-level rectangles nested within sector groups.
- Scatter plot: salary (y-axis) vs. AI exposure (x-axis) with regression trend line; bubble size proportional to employment.
- Sortable list: tabular view with score, employment, salary, sector, and EU AI Act classification.

Filters: sector selector, minimum/maximum score range sliders, sort by employment/salary/risk.

Detail panel: click any occupation for full profile including justification text, EU AI Act classification, impact typology, and wage-exposure sub-index.
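The dashboard's filters and the headline indicators (employment-weighted mean exposure, high-exposure share, wage-exposure index) can be approximated directly from the JSON records. The following is a minimal sketch, not the dashboard's actual code: it assumes the file is a JSON array of objects using the field names from the dataset table, and the function names and three sample records are hypothetical placeholders.

```python
import json


def load_records(path):
    """Load occupation records; assumes the file is a JSON array of objects."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def wage_exposure(rec):
    """Employment x salary x score/10, the weighted index described in the deposit."""
    return rec["empleo"] * rec["salario_medio_eur"] * rec["vulnerabilidad_ia_score"] / 10


def summarise(records, high_threshold=7.0):
    """Employment-weighted mean score, high-exposure job share, and total index."""
    total = sum(r["empleo"] for r in records)
    high = sum(r["empleo"] for r in records
               if r["vulnerabilidad_ia_score"] >= high_threshold)
    return {
        "weighted_mean": sum(r["empleo"] * r["vulnerabilidad_ia_score"]
                             for r in records) / total,
        "high_exposure_share": high / total,
        "wage_exposure_index": sum(wage_exposure(r) for r in records),
    }


def filter_and_sort(records, sector=None, min_score=0.0, max_score=10.0,
                    sort_by="empleo"):
    """Mimic the sector selector, score sliders, and descending sort."""
    selected = [r for r in records
                if (sector is None or r["sector"] == sector)
                and min_score <= r["vulnerabilidad_ia_score"] <= max_score]
    return sorted(selected, key=lambda r: r[sort_by], reverse=True)


# Placeholder records for illustration; NOT rows from the deposit.
records = [
    {"cno": "0001", "sector": "Administración", "empleo": 1000,
     "salario_medio_eur": 20000.0, "vulnerabilidad_ia_score": 8.0},
    {"cno": "0002", "sector": "Servicios", "empleo": 3000,
     "salario_medio_eur": 15000.0, "vulnerabilidad_ia_score": 2.0},
    {"cno": "0003", "sector": "Administración", "empleo": 1000,
     "salario_medio_eur": 30000.0, "vulnerabilidad_ia_score": 5.0},
]

summary = summarise(records)
print(summary)
print([r["cno"] for r in filter_and_sort(records, sector="Administración",
                                         sort_by="salario_medio_eur")])
```

With the real file, `load_records("spain_502_v6_sectorfixed.json")` would replace the placeholder list, provided the file layout matches the assumption above.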
Comparative Positioning

Dimension | This analysis | willrobotstakemyjob.com | OECD AI Exposure | ILO GenAI Index
Taxonomy | CNO-11 (502, Spain) | SOC/O*NET (702, US) | ~400 ISCO (cross-country) | ISCO (cross-country)
Scoring | LLM + 5 calibration factors | Frey & Osborne (2013) | Expert + O*NET tasks | GPT-4 task scoring
Inter-model validation | r=0.715, κw=0.667, Bland-Altman | None published | Expert panel (no LLM cross-check) | None published
Regulatory mapping | EU AI Act (3 risk levels) | None | None | None
Salary cross-reference | Yes (128 values, MAPE 4.96%) | Yes (BLS direct) | No | No

US Comparative Analysis

A parallel reference analysis for the US labour market (Andrej Karpathy, 'Jobs', 2025–2026) uses BLS/O*NET data. Key structural differences:

Parameter | US | Spain | Primary cause
Mean exposure | ~4.6 | 4.0 | Physical services weight + 5-factor calibration
% high exposure (≥7) | ~27% | 20.9% | Smaller knowledge economy share
Regulatory classification | Not included | 3 EU AI Act levels | No US federal AI framework
Labour friction factor | Not applied | 1–5% by sector | OECD 3rd strictest employment protection
Salary granularity | ~800 direct values | 128 adjusted values | INE publishes at 2-digit level
Employment granularity | Direct per occupation | Distributed from 1-digit | EPA anonymises CNO at 1-digit

OECD contextualisation: The 28% "at risk" figure (OECD Employment Outlook 2024) refers to all automation technologies, not exclusively AI. OECD AI-specific figures for Spain: 5.9% high automation risk from AI; 27.4% GenAI exposure. This analysis's 20.9% (score ≥7) measures calibrated theoretical exposure to AI broadly and is not directly comparable to any single OECD figure.

Technical Notes

The methodology document (v18) contains 39 technical notes grouped by topic: occupational inventory (notes 1–2), employment data (3–7), salary data (8–12), AI scoring (13–17, 27, 30), EU AI Act classification (18–20, 28), visualisation (21–22), limitations (23–26), validation (29, 31–35), and context, comparability & pending dimensions (36–39).

Key notes:

- Employment: EPA publishes CNO at 1-digit only; 4-digit figures are Census 2021-weighted proportional estimates (notes 3–4).
- Salary: EES 2023 reference year is 2022; no temporal deflator applied (note 12). Self-employed excluded by design (note 11).
- Scoring: Single-pass LLM generation; estimated reproducibility ±0.5 points (note 27). Few-shot calibration anchors: score 1 (bricklayer), score 5 (driver), score 9 (telemarketer) (note 30).
- EU AI Act: Art. 5 prohibits certain AI practices, not professions; no "prohibited" category at occupation level (note 28). Art. 6(3) exemptions not captured (note 20).
- The Anthropic Economic Index (February 2026) theory-practice gap suggests calibration factors may be insufficient to capture the full adoption gap (note 32).
- Economic viability: Acemoglu (2024) estimates ~4.6% of tasks automatable with positive ROI; Spanish wage levels likely raise this threshold further (note 36).
- AIOE index (Felten et al. 2021): Direct comparison not possible due to taxonomy differences (O*NET skills vs CNO-11 occupations); a CNO-11 → ISCO-08 → SOC → O*NET crosswalk is identified as a future validation path (note 37).
- Occupation-level scoring is a deliberate design choice, not an omission; CNO-11 lacks an O*NET-equivalent task inventory (note 38).
- Available but unintegrated dimensions: regional (CCAA), gender, informal economy (18–20% GDP), part-time (~2.5M workers) (note 39).
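The calibration formula stated in the Methodology section, score_adjusted = score_base × factor_DESI × (1 - regulatory_friction) × (1 - labour_friction), can be sketched as a small function. This is only an illustration of the published formula: the function names are hypothetical, and the example inputs are picked from the ranges quoted in the description (DESI factor 0.80–0.95, regulatory moderation 2–8%, labour friction 1–5%) rather than taken from any actual occupation record.

```python
def adjust_score(score_base, factor_desi, regulatory_friction, labour_friction):
    """Apply the multiplicative calibration to a base LLM score (0-10 scale)."""
    for name, value in (("regulatory_friction", regulatory_friction),
                        ("labour_friction", labour_friction)):
        if not 0.0 <= value < 1.0:
            raise ValueError(f"{name} must be a fraction in [0, 1)")
    return (score_base * factor_desi
            * (1 - regulatory_friction) * (1 - labour_friction))


def reduction_pct(score_base, score_adjusted):
    """Percentage reduction of the adjusted score relative to the base score."""
    return 100 * (1 - score_adjusted / score_base)


# Hypothetical inputs chosen from the quoted ranges; not a real occupation.
base = 8.0
adjusted = adjust_score(base, factor_desi=0.95,
                        regulatory_friction=0.02, labour_friction=0.05)
print(f"adjusted={adjusted:.2f} reduction={reduction_pct(base, adjusted):.1f}%")
```

For these particular inputs the reduction falls inside the 11–12% band quoted for the combined calibration, but other combinations within the stated ranges would give different values.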
Keywords: artificial intelligence, labour market, employment, Spain, AI exposure, occupational risk, EU AI Act, CNO-11, EPA, automation, interactive dashboard, inter-model validation, Bland-Altman, treemap, wage-exposure index, AESIA

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Language: Spanish (dataset, justifications, dashboard UI); English (this description; methodology notes are bilingual)

Resource Type: Dataset + Interactive Visualisation + Methodology Document

Related Identifiers

- https://alvarodenicolas.com/interactive/empleos-ia/index.html (IsSupplementedBy: interactive dashboard)
- Brynjolfsson, E., Mitchell, T. & Rock, D. (2018). "What Can Machines Learn, and What Does It Mean for Occupations and the Economy?" AEA Papers and Proceedings, 108:43-47. (References)
- Eloundou, T., Manning, S., Mishkin, P. & Rock, D. (2023). "GPTs are GPTs." OpenAI/UPenn. arXiv:2303.10130v5. (References)
- Frey, C. B. & Osborne, M. A. (2017). "The Future of Employment." Technological Forecasting and Social Change, 114:254-280. (References)
- Regulation (EU) 2024/1689 (EU AI Act). Annex III, Arts. 5 and 6. (References)
- Anthropic. The Anthropic Economic Index. February 2026. (References)
- Nedelkoska, L. & Quintini, G. (2018). "Automation, skills use and training." OECD Social, Employment and Migration Working Papers, No. 202. (References)
- Acemoglu, D. (2024). "The Simple Macroeconomics of AI." Economic Policy. (References)
- Felten, E., Raj, M. & Seamans, R. (2021). "Occupational, industry, and geographic exposure to AI." Strategic Management Journal, 42(12):2195-2217. (References)

Provider:

Zenodo

Created:

2026-03-21