Vulnerabilidad de Empleos a la Inteligencia Artificial en España: Dataset, Metodología y Dashboard Interactivo (v17)
Source: Zenodo. Updated 2026-03-21; indexed 2026-05-26.
Download link: https://zenodo.org/doi/10.5281/zenodo.19142170
AI Exposure of Jobs in Spain — Complete Dataset, Methodology & Interactive Dashboard
This deposit contains the complete dataset, methodology, and interactive visualisation tool for assessing the theoretical exposure of 502 Spanish occupations to artificial intelligence. The analysis covers 22.46 million workers (EPA Q4 2025, INE) and assigns each occupation a calibrated exposure score on a 0–10 scale, cross-referenced with salary data, EU AI Act risk classification, and impact typology.
The interactive dashboard is available at: https://alvarodenicolas.com/interactive/empleos-ia/index.html
Dataset
The core dataset (spain_502_v6_sectorfixed.json) contains 502 records corresponding to the complete CNO-11 occupational taxonomy (SEPE expansion). Each record includes 11 fields:
Field | Type | Description
cno | string | 4-digit CNO-11 occupation code
nombre | string | Official occupation name (Spanish)
sector | string | Assigned economic sector (12 categories)
empleo | integer | Estimated employment (EPA Q4 2025, redistributed via Census 2021 weights)
salario_medio_eur | float | Estimated mean gross annual salary (EUR), based on INE EES 2023 + educational premia + FR/PT proxies (128 unique values; MAPE 4.96% vs 16 INE reference groups)
vulnerabilidad_ia_score | float | AI exposure score (0–10), calibrated with 5 Spain-specific structural factors
eu_ai_act | string | EU AI Act risk classification: "Alto riesgo" (Annex III), "Riesgo limitado", or "Riesgo mínimo"
tipo_impacto | string | Impact typology: "Sustitución" (29 occupations), "Híbrido" (184), or "Aumentación" (289)
justificacion | string | 3–4 sentence justification in Spanish explaining the automation vector and human-protective factors
census_2021_employed | float | Census 2021 employment figure used for intra-group weighting
employment_method | string | Employment estimation method identifier
Sector taxonomy (12 categories): Administración, Agricultura, Artesanía y manufactura, Construcción, Dirección, Elementales, Industria, Industria alimentaria, Militar, Profesionales, Servicios, Técnicos apoyo. Note: v6 reclassifies 12 occupations (CNO 76xx) from "Construcción" to "Artesanía y manufactura" and separates food-industry occupations (CNO 77x) into "Industria alimentaria", correcting sector assignments from earlier versions.
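The field table above can be turned into a typed record with basic checks. The following is a minimal sketch, assuming the JSON file is a top-level list of flat objects whose keys match the 11 field names exactly; that layout is not confirmed by this description. The names Occupation, parse_record and load_dataset are hypothetical, and the example record holds placeholder values, not real data.

```python
import json
from dataclasses import dataclass

# Allowed values as listed in the field descriptions above.
EU_AI_ACT_LEVELS = {"Alto riesgo", "Riesgo limitado", "Riesgo mínimo"}
IMPACT_TYPES = {"Sustitución", "Híbrido", "Aumentación"}


@dataclass(frozen=True)
class Occupation:
    cno: str
    nombre: str
    sector: str
    empleo: int
    salario_medio_eur: float
    vulnerabilidad_ia_score: float
    eu_ai_act: str
    tipo_impacto: str
    justificacion: str
    census_2021_employed: float
    employment_method: str


def parse_record(raw: dict) -> Occupation:
    """Build an Occupation and check the constraints stated in the field table."""
    occ = Occupation(**raw)  # raises TypeError on missing or unexpected keys
    if not (len(occ.cno) == 4 and occ.cno.isdigit()):
        raise ValueError(f"cno must be a 4-digit code, got {occ.cno!r}")
    if not 0.0 <= occ.vulnerabilidad_ia_score <= 10.0:
        raise ValueError("vulnerabilidad_ia_score must lie in [0, 10]")
    if occ.eu_ai_act not in EU_AI_ACT_LEVELS:
        raise ValueError(f"unknown eu_ai_act value: {occ.eu_ai_act!r}")
    if occ.tipo_impacto not in IMPACT_TYPES:
        raise ValueError(f"unknown tipo_impacto value: {occ.tipo_impacto!r}")
    return occ


def load_dataset(path: str) -> list[Occupation]:
    """Assumes the file is a JSON array of records (an unverified assumption)."""
    with open(path, encoding="utf-8") as fh:
        return [parse_record(item) for item in json.load(fh)]


# Hypothetical placeholder record, NOT taken from the dataset.
example = {
    "cno": "0000",
    "nombre": "Ocupación de ejemplo",
    "sector": "Servicios",
    "empleo": 1000,
    "salario_medio_eur": 25000.0,
    "vulnerabilidad_ia_score": 5.0,
    "eu_ai_act": "Riesgo mínimo",
    "tipo_impacto": "Híbrido",
    "justificacion": "Texto de ejemplo.",
    "census_2021_employed": 950.0,
    "employment_method": "example",
}
occupation = parse_record(example)
```

If the real file nests records under a key or uses different field names, only load_dataset and the dataclass fields would need adjusting.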
Key Findings
Indicator | Value | Note
Occupations analysed | 502 | Complete CNO-11 (SEPE taxonomy)
Workers represented | 22.46 million | EPA Q4 2025 (final data)
Weighted mean exposure | 4.0 / 10 | Calibrated for Spain (5 factors); unweighted mean: 4.3
High-exposure occupations (score ≥7) | 122 occupations | 4,690,622 jobs (20.9% of total employment)
Wage-exposure index | 278,200M EUR | Employment × Salary × Score/10; a weighted index, not a prediction of wage losses
Score range | 1.0 – 9.0 | 1.0: hairdressers, cleaners, firefighters; 9.0: data-entry clerks
Salary range | 12,985 – 79,282 EUR/year | 128 unique estimated values
Inter-model validation (r) | 0.715 | 100 occupations, Gemini 2.5 Pro vs GPT-4o blind
Salary validation (MAPE) | 4.96% | 16 INE EES 2023 groups, post-correction
Employment validation (1-digit) | ±0.00% | EPA Q4 2025 exact (API Tempus table 65134)
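The three headline aggregates (employment-weighted mean exposure, share of employment at score ≥7, and the wage-exposure index Employment × Salary × Score/10) can be sketched as follows. This is an illustration under assumptions: records are plain dicts keyed by the dataset field names, the function names are hypothetical, and the three toy records are invented for the example rather than drawn from the dataset.

```python
def weighted_mean_score(records):
    """Employment-weighted mean of the exposure score."""
    total_employment = sum(r["empleo"] for r in records)
    weighted_sum = sum(r["empleo"] * r["vulnerabilidad_ia_score"] for r in records)
    return weighted_sum / total_employment


def high_exposure_share(records, threshold=7.0):
    """Fraction of total employment in occupations scoring at or above threshold."""
    total_employment = sum(r["empleo"] for r in records)
    exposed = sum(
        r["empleo"] for r in records if r["vulnerabilidad_ia_score"] >= threshold
    )
    return exposed / total_employment


def wage_exposure_index(records):
    """Sum of Employment x Salary x Score/10 over all occupations (EUR)."""
    return sum(
        r["empleo"] * r["salario_medio_eur"] * r["vulnerabilidad_ia_score"] / 10
        for r in records
    )


# Toy records with invented values, for illustration only.
toy = [
    {"empleo": 100, "salario_medio_eur": 20000.0, "vulnerabilidad_ia_score": 2.0},
    {"empleo": 50, "salario_medio_eur": 30000.0, "vulnerabilidad_ia_score": 8.0},
    {"empleo": 50, "salario_medio_eur": 40000.0, "vulnerabilidad_ia_score": 5.0},
]

print(weighted_mean_score(toy))
print(high_exposure_share(toy))
print(wage_exposure_index(toy))

# Consistency check on the published figures: 4,690,622 / 22,463,286 jobs.
published_share_pct = round(100 * 4_690_622 / 22_463_286, 1)
```

The last line reproduces the 20.9% share from the two employment totals quoted in this description.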
Methodology
Exposure scores were generated by Gemini 2.5 Pro (temperature 0.2, structured rubric prompts) following the methodological lineage of Brynjolfsson et al. (2018) and Eloundou et al. (2023), adapted to the Spanish CNO-11 taxonomy with structural calibration. Five Spain-specific calibration factors are applied:
DESI digitalisation index — DESI 2023, 69.8 points (3rd in EU). Note: the European Commission no longer publishes the composite DESI score after 2023. Spain ranks 11th in enterprise digital technology integration; only 8–21% of Spanish firms use AI depending on source (INE TIC 2025 ~21%, Banco de España 2025 ~20%). Sector moderation factor: 0.80 (agriculture) to 0.95 (technology/banking).
Services sector weight — 74% of GDP (vs 68% EU average); tourism 12.4% of GDP.
Employment protection — 3rd strictest in OECD. Unfair dismissal severance: 33 days/year (max 24 monthly payments). Friction factor: 1–5% by sector.
EU AI Act — Regulation (EU) 2024/1689 classifies AI systems by use-case context, not occupations. Annex III high-risk contexts map to 71 occupations (5.13M workers). Moderation factor: 2–8% for high-risk categories.
AESIA supervision — Spain is the first EU country with an operational national AI supervisory agency (A Coruña, Real Decreto 729/2023). Fines up to 35M EUR or 7% of global turnover.
The combined calibration produces an 11–12% reduction from base LLM scores. Formula: score_adjusted = score_base × factor_DESI × (1 - regulatory_friction) × (1 - labour_friction)
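A worked instance of the formula above, as a minimal sketch. The parameter values are chosen for illustration from within the ranges stated for each factor (DESI moderation 0.80–0.95, regulatory moderation 2–8%, labour friction 1–5%); they are not the values assigned to any actual occupation, and the function name adjust_score is hypothetical.

```python
def adjust_score(score_base, factor_desi, regulatory_friction, labour_friction):
    """score_adjusted = score_base x factor_DESI x (1 - regulatory) x (1 - labour)."""
    return (
        score_base
        * factor_desi
        * (1 - regulatory_friction)
        * (1 - labour_friction)
    )


# Illustrative inputs only (assumed, not taken from the dataset).
base = 8.0
adjusted = adjust_score(
    base, factor_desi=0.95, regulatory_friction=0.04, labour_friction=0.03
)
reduction_pct = 100 * (1 - adjusted / base)

print(f"adjusted score: {adjusted:.2f}, reduction: {reduction_pct:.1f}%")
```

With these inputs the multiplier is 0.95 × 0.96 × 0.97 = 0.88464, a reduction of about 11.5%, which falls inside the 11–12% range quoted above.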
Important: This analysis measures the technical capability of AI to perform occupational tasks, not the economic viability of substitution (see note [36]). Acemoglu (2024) estimates only ~4.6% of tasks are currently automatable with positive ROI. In Spain, where median wages (~28,000 EUR) are below those of the US market in which most AI tools are priced, the economic viability threshold for substitution is likely higher.
Employment data: EPA Q4 2025 microdata (INE, published 27/01/2026), 22,463,286 total employed. EPA publishes employment at 1-digit CNO level only; 4-digit figures are proportional estimates redistributed using Census 2021 structural weights (145 subgroups at 3-digit CNO). 4-digit employment figures are estimates, not observed data.
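The proportional redistribution described above can be sketched as splitting a published group total across its member occupations in proportion to their Census 2021 counts. This is a simplified illustration: the deposit does not spell out the exact procedure (for example, how the 145 3-digit subgroups enter the calculation), and the function name, occupation codes and counts below are hypothetical.

```python
def redistribute(group_total, census_counts):
    """Split group_total across occupations proportionally to census counts."""
    census_sum = sum(census_counts.values())
    if census_sum <= 0:
        raise ValueError("census counts must sum to a positive number")
    return {
        code: group_total * count / census_sum
        for code, count in census_counts.items()
    }


# Hypothetical codes and counts, for illustration only.
census_2021 = {"0001": 300.0, "0002": 100.0, "0003": 100.0}
estimates = redistribute(1000, census_2021)

print(estimates)
```

By construction the estimates sum to the published group total, which is why the 1-digit validation matches EPA exactly while finer levels remain estimates.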
Salary data: Encuesta de Estructura Salarial 2023 (INE, table 28186) at 2-digit CNO level, adjusted with educational premia (INE) and intra-group variance proxies from France (INSEE) and Portugal (INE-PT). Post-correction MAPE: 4.96% against 16 INE reference groups. Self-employed workers (~3.3M) are excluded from the salary survey by design.
Occupation-level scoring (deliberate design choice): Scores are assigned at the occupation level rather than at the individual task level. The CNO-11 lacks a task inventory equivalent to O*NET. Scoring at occupation level sacrifices intra-occupational granularity but avoids the cumulative noise of an O*NET → ISCO-08 → CNO-11 crosswalk across 19,000 tasks and two intermediate classification systems. The impact typology (Sustitución / Híbrido / Aumentación) provides the directional signal that a single score does not capture.
Validation Stack
Employment validation (1-digit)
EPA Q4 2025 totals reproduced with ±0.00% deviation (API Tempus, table 65134). Maximum difference: 47 persons over 22.46M.
AI score validation (inter-model)
100 stratified occupations blind-rescored by GPT-4o without access to original Gemini scores. Results:
Metric | Value | Interpretation
Pearson correlation | r = 0.715 | Good (inter-rater reliability)
Intraclass correlation | ICC(2,1) = 0.701 | Good; consistent agreement between models
Weighted kappa | κw = 0.667 | Substantial agreement (Landis-Koch convention)
Systematic bias (GPT − Gemini) | +0.28 points | GPT scores slightly higher
Mean absolute deviation | 1.0 point | Excellent on a 0–10 scale
Agreement within ±1.0 points | 61% | Narrow agreement
Agreement within ±2.0 points | 84% | Broad agreement
Bland-Altman 95% limits | −2.77 to +3.33 | No proportional bias confirmed
Disagreement pattern: GPT compresses toward the centre, scoring manual/physical occupations higher (+1.07 in the 0–3 band) and knowledge occupations lower (−1.00 in the 7+ band). The 16 large disagreements (>2.0 points) cluster in two interpretable groups: (1) GPT overestimates industrial automation potential (CNO 81xx/82xx), (2) Gemini overestimates the digital component of niche professions (notaries, actors, oenologists).
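Several of the agreement metrics above follow from paired score lists with a few lines of arithmetic. The sketch below covers Pearson r, the systematic bias with Bland-Altman 95% limits (mean difference ± 1.96 sample standard deviations, the conventional definition, assumed here), and the share of pairs within a tolerance. The function names and the four score pairs are invented for illustration and are not the validation sample.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for differences second - first."""
    diffs = [b - a for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


def share_within(first, second, tolerance):
    """Fraction of pairs whose absolute difference is at most tolerance."""
    hits = sum(abs(b - a) <= tolerance for a, b in zip(first, second))
    return hits / len(first)


# Invented score pairs, for illustration only.
gemini = [2.0, 4.0, 6.0, 8.0]
gpt = [3.0, 4.0, 6.0, 7.0]

r = pearson_r(gemini, gpt)
bias, lower, upper = bland_altman(gemini, gpt)
within_one = share_within(gemini, gpt, 1.0)
```

As a check on the reported figures, the published limits are centred on the reported bias: (−2.77 + 3.33) / 2 = 0.28.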
Salary validation
MAPE 4.96% against 16 INE EES 2023 groups (post-correction). All deviations under 10%. Three targeted corrections applied: Group I (Protection & security, −36.4% → −4.4%), Group M (Fixed machinery operators, −12.0% → −1.4%), Group H (Health & care, +9.7% → +5.2%). Validated by Manus AI.
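The MAPE and the signed per-group deviations quoted above can be computed as in this minimal sketch. It assumes one estimated and one reference mean salary per INE group; the function names and the two salary pairs are hypothetical and invented for illustration.

```python
def signed_deviation_pct(estimated, reference):
    """Signed percentage deviation of an estimate from its reference value."""
    return 100 * (estimated - reference) / reference


def mape(estimates, references):
    """Mean absolute percentage error across paired groups."""
    deviations = [
        abs(signed_deviation_pct(e, r)) for e, r in zip(estimates, references)
    ]
    return sum(deviations) / len(deviations)


# Invented salaries (EUR/year), for illustration only.
estimated_salaries = [19000.0, 31500.0]
reference_salaries = [20000.0, 30000.0]

error = mape(estimated_salaries, reference_salaries)
print(f"MAPE: {error:.2f}%")
```

The signed form corresponds to the per-group corrections listed above (for example −36.4% → −4.4%), while the MAPE averages their absolute values.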
Adversarial multi-model review
The methodology document was subjected to adversarial review by 7 independent AI models (Grok ×2, Perplexity ×2, Manus ×2, Gemini). Consensus corrections incorporated: relabelling of wage-exposure index, DESI 2024→2023 correction, EU AI Act reclassification (elimination of "prohibited" occupation-level category), sector corrections (v18: reclassification of 12 occupations CNO 76xx from "Construcción" to "Artesanía y manufactura"), and reconciliation of inconsistent figures.
Sensitivity Analysis
Scenario | Weighted mean | High exposure (≥7) | Wage-exposure index
Current calibration (base) | 4.0 / 10 | 20.9% | 278,200M EUR
All factors −20% | ~4.9 / 10 | Non-linear estimate | ~333,800M EUR
All factors +20% | ~3.2 / 10 | Non-linear estimate | ~222,600M EUR
Limitations
Exposure scores are theoretical estimates, not predictions of job displacement. Empirical evidence (Anthropic Economic Index, February 2026) shows significant gaps between theoretical exposure (~94%) and observed adoption (~33%) in computer/mathematical occupations. Acemoglu (2024) estimates only ~4.6% of tasks are currently automatable with positive ROI; in Spain, lower wages relative to the US (where most AI tools are priced) likely raise the economic viability threshold further.
4-digit employment figures are proportional estimates, not observed data. Deviations at 2-digit level against EPA published totals range from ±0% to ±540% due to structural changes between Census 2021 and EPA 2025.
Calibration factors are expert judgement without empirical back-testing. Sensitivity analysis (±20%) shows weighted mean exposure shifting from 3.3 to 4.9 and wage-exposure index from 219B to 329B EUR.
France/Portugal salary proxies assume structural similarity among southern European economies. Not empirically validated at individual occupation level. MCVL (Muestra Continua de Vidas Laborales) is an identified but unused validation source.
Scores were generated in a single pass per model. Estimated intra-model reproducibility: ±0.5 points.
The analysis is static (March 2026 snapshot) and does not model job creation by AI, regional variation, or part-time/full-time distinctions.
Occupation-level scoring sacrifices intra-occupational granularity. The CNO-11 lacks a task inventory equivalent to O*NET; the impact typology partially compensates.
Available but unintegrated dimensions: (a) Regional distribution (CCAA) — EPA microdata extracted for 19 autonomous communities, available for future NUTS-2 analysis; the OECD (2024) documents that GenAI will exacerbate existing regional disparities. (b) Gender decomposition — SEPE publishes a gender gap occupational catalogue; female concentration in administrative (high exposure) and care (low exposure) roles suggests unequal impacts. (c) Informal economy — estimated at 18–20% of GDP (CEPR/Banco de España), concentrated in low-exposure sectors; its exclusion produces a mild upward bias in the national exposure mean. (d) Part-time/full-time distinction — ~2.5M part-time workers in Spain; the wage-exposure index implicitly assumes full-time equivalence.
Interactive Dashboard
The dashboard (single-page React application) provides four views:
Treemap — sector-level aggregation with drill-down to individual occupations; rectangle area proportional to employment, colour indicates exposure score.
Detailed treemap — occupation-level rectangles nested within sector groups.
Scatter plot — salary (y-axis) vs. AI exposure (x-axis) with regression trend line; bubble size proportional to employment.
Sortable list — tabular view with score, employment, salary, sector, and EU AI Act classification.
Filters: sector selector, minimum/maximum score range sliders, sort by employment/salary/risk. Detail panel: click any occupation for full profile including justification text, EU AI Act classification, impact typology, and wage-exposure sub-index.
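The filter and sort behaviour described above can be expressed over the dataset records as follows. This is a sketch of the logic only, not the dashboard's React code: it assumes that "risk" sorts by vulnerabilidad_ia_score and that sorting is descending, neither of which is stated in this description. The function name and toy records are hypothetical.

```python
# Assumed mapping from the dashboard's sort options to dataset fields.
SORT_FIELDS = {
    "employment": "empleo",
    "salary": "salario_medio_eur",
    "risk": "vulnerabilidad_ia_score",
}


def filter_occupations(
    records, sector=None, min_score=0.0, max_score=10.0, sort_by="employment"
):
    """Apply the sector and score-range filters, then sort descending."""
    field = SORT_FIELDS[sort_by]
    selected = [
        r
        for r in records
        if (sector is None or r["sector"] == sector)
        and min_score <= r["vulnerabilidad_ia_score"] <= max_score
    ]
    return sorted(selected, key=lambda r: r[field], reverse=True)


# Toy records with invented values, for illustration only.
toy_records = [
    {"cno": "0001", "sector": "Servicios", "empleo": 500,
     "salario_medio_eur": 18000.0, "vulnerabilidad_ia_score": 2.0},
    {"cno": "0002", "sector": "Administración", "empleo": 300,
     "salario_medio_eur": 24000.0, "vulnerabilidad_ia_score": 8.0},
    {"cno": "0003", "sector": "Administración", "empleo": 700,
     "salario_medio_eur": 22000.0, "vulnerabilidad_ia_score": 7.5},
]

high_admin = filter_occupations(
    toy_records, sector="Administración", min_score=7.0, sort_by="risk"
)
```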
Comparative Positioning
Dimension | This analysis | willrobotstakemyjob.com | OECD AI Exposure | ILO GenAI Index
Taxonomy | CNO-11 (502, Spain) | SOC/O*NET (702, US) | ~400 ISCO (cross-country) | ISCO (cross-country)
Scoring | LLM + 5 calibration factors | Frey & Osborne (2013) | Expert + O*NET tasks | GPT-4 task scoring
Inter-model validation | r=0.715, κw=0.667, Bland-Altman | None published | Expert panel (no LLM cross-check) | None published
Regulatory mapping | EU AI Act (3 risk levels) | None | None | None
Salary cross-reference | Yes (128 values, MAPE 4.96%) | Yes (BLS direct) | No | No
US Comparative Analysis
A parallel reference analysis for the US labour market (Andrej Karpathy, 'Jobs', 2025–2026) uses BLS/O*NET data. Key structural differences:
Parameter | US | Spain | Primary cause
Mean exposure | ~4.6 | 4.0 | Physical services weight + 5-factor calibration
% high exposure (≥7) | ~27% | 20.9% | Smaller knowledge economy share
Regulatory classification | Not included | 3 EU AI Act levels | No US federal AI framework
Labour friction factor | Not applied | 1–5% by sector | OECD 3rd strictest employment protection
Salary granularity | ~800 direct values | 128 adjusted values | INE publishes at 2-digit level
Employment granularity | Direct per occupation | Distributed from 1-digit | EPA anonymises CNO at 1-digit
OECD contextualisation: The 28% "at risk" figure (OECD Employment Outlook 2024) refers to all automation technologies, not exclusively AI. OECD AI-specific figures for Spain: 5.9% high automation risk from AI; 27.4% GenAI exposure. This analysis's 20.9% (score ≥7) measures calibrated theoretical exposure to AI broadly — not directly comparable to any single OECD figure.
Technical Notes
The methodology document (v18) contains 39 technical notes grouped by topic: occupational inventory (notes 1–2), employment data (3–7), salary data (8–12), AI scoring (13–17, 27, 30), EU AI Act classification (18–20, 28), visualisation (21–22), limitations (23–26), validation (29, 31–35), and context, comparability & pending dimensions (36–39). Key notes:
Employment: EPA publishes CNO at 1-digit only; 4-digit figures are Census 2021-weighted proportional estimates (notes 3–4).
Salary: EES 2023 reference year is 2022; no temporal deflator applied (note 12). Self-employed excluded by design (note 11).
Scoring: Single-pass LLM generation; estimated reproducibility ±0.5 points (note 27). Few-shot calibration anchors: score 1 (bricklayer), score 5 (driver), score 9 (telemarketer) (note 30).
EU AI Act: Art. 5 prohibits certain AI practices, not professions; no "prohibited" category at occupation level (note 28). Art. 6(3) exemptions not captured (note 20).
The Anthropic Economic Index (February 2026) theory-practice gap suggests calibration factors may be insufficient to capture the full adoption gap (note 32).
Economic viability: Acemoglu (2024) estimates ~4.6% of tasks automatable with positive ROI; Spanish wage levels likely raise this threshold further (note 36).
AIOE index (Felten et al. 2021): Direct comparison not possible due to taxonomy differences (O*NET skills vs CNO-11 occupations); a CNO-11 → ISCO-08 → SOC → O*NET crosswalk is identified as a future validation path (note 37).
Occupation-level scoring is a deliberate design choice, not an omission; CNO-11 lacks an O*NET-equivalent task inventory (note 38).
Available but unintegrated dimensions: regional (CCAA), gender, informal economy (18–20% GDP), part-time (~2.5M workers) (note 39).
Keywords
artificial intelligence, labour market, employment, Spain, AI exposure, occupational risk, EU AI Act, CNO-11, EPA, automation, interactive dashboard, inter-model validation, Bland-Altman, treemap, wage-exposure index, AESIA
License
Creative Commons Attribution 4.0 International (CC BY 4.0)
Language
Spanish (dataset, justifications, dashboard UI); English (this description); methodology notes are bilingual
Resource Type
Dataset + Interactive Visualisation + Methodology Document
Related Identifiers
https://alvarodenicolas.com/interactive/empleos-ia/index.html (IsSupplementedBy — interactive dashboard)
Brynjolfsson, E., Mitchell, T. & Rock, D. (2018). "What Can Machines Learn, and What Does It Mean for Occupations and the Economy?" AEA Papers and Proceedings, 108:43-47. (References)
Eloundou, T., Manning, S., Mishkin, P. & Rock, D. (2023). "GPTs are GPTs." OpenAI/UPenn. arXiv:2303.10130v5. (References)
Frey, C. B. & Osborne, M. A. (2017). "The Future of Employment." Technological Forecasting and Social Change, 114:254-280. (References)
Regulation (EU) 2024/1689 (EU AI Act). Annex III, Arts. 5 and 6. (References)
Anthropic. The Anthropic Economic Index. February 2026. (References)
Nedelkoska, L. & Quintini, G. (2018). "Automation, skills use and training." OECD Social, Employment and Migration Working Papers, No. 202. (References)
Acemoglu, D. (2024). "The Simple Macroeconomics of AI." Economic Policy. (References)
Felten, E., Raj, M. & Seamans, R. (2021). "Occupational, industry, and geographic exposure to AI." Strategic Management Journal, 42(12):2195-2217. (References)
Provider: Zenodo
Created: 2026-03-21



