CrimeaIsUkraineOrg/crimea-sovereignty-grounding
Source: Hugging Face · Updated 2026-04-16 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/CrimeaIsUkraineOrg/crimea-sovereignty-grounding
Resource description:
---
task_categories:
- text-classification
language:
- multilingual
- en
- uk
- ru
- de
- fr
- es
- pl
- tr
- it
- nl
license: cc-by-4.0
size_categories:
- 1K<n<10K
tags:
- crimea
- sovereignty
- disinformation
- grounding
- web-search
- sanctions
- RAG
---
# Crimea Sovereignty — LLM Web Search Contamination Audit
Systematic audit of sovereignty-framing contamination in AI chatbot web search results for Crimea-related queries.
## Dataset
- **grounding_audit.parquet**: 1,000 web-search-augmented responses (4 models × 25 queries × 10 languages), temperature=0
- **proxy_probes.parquet**: 140 targeted probes testing whether Russian proxy sites documented by the US State Department's Global Engagement Center (GEC) are accessible through LLM web search
- **manifest.json**: Pipeline metadata and verified results
- **LLM_Web_Search_Contamination.docx**: Full briefing document
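
The 4 × 25 × 10 design behind `grounding_audit.parquet` can be sketched as a full factorial grid. This is a minimal illustration, not the project's pipeline: the model labels come from the Models Tested table, the language codes are assumed to be the ten listed in the card metadata, and the `q01`..`q25` query identifiers and the field names are hypothetical placeholders (the card does not list the queries or the parquet schema).

```python
from itertools import product

# Model labels as given in the "Models Tested" table.
MODELS = ["GPT-4o", "Claude Sonnet", "Gemini 2.5 Flash", "Perplexity Sonar"]

# Assumption: the ten audit languages are the ten codes in the card metadata.
LANGUAGES = ["en", "uk", "ru", "de", "fr", "es", "pl", "tr", "it", "nl"]

# Hypothetical identifiers; the actual 25 queries are not listed in the card.
QUERY_IDS = [f"q{i:02d}" for i in range(1, 26)]


def build_grid():
    """Return one record per (model, query, language) combination."""
    return [
        {"model": model, "query_id": query_id, "language": language}
        for model, query_id, language in product(MODELS, QUERY_IDS, LANGUAGES)
    ]


grid = build_grid()
print(len(grid))  # 4 * 25 * 10 combinations
```

A grid like this is a convenient checklist when validating the parquet file: every combination should appear exactly once if the audit is complete.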
## Models Tested
| Model | Web Search Implementation |
|---|---|
| GPT-4o | OpenAI web_search_preview tool |
| Claude Sonnet | Anthropic web_search_20250305 tool |
| Gemini 2.5 Flash | Google Search grounding |
| Perplexity Sonar | Search-native architecture |
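
As a rough illustration of how the four implementations differ, the sketch below maps each model to a web-search tool entry. Only the tool identifiers (`web_search_preview`, `web_search_20250305`) and temperature=0 come from this card; the surrounding field layout reflects my reading of each provider's public API and should be checked against current documentation. `WEB_SEARCH_TOOLS` and `request_options` are hypothetical names, and the provider-neutral dictionary they return is not a real request body for any one API.

```python
# Tool identifiers are taken from the "Models Tested" table; the dictionary
# shapes around them are assumptions, not confirmed pipeline settings.
WEB_SEARCH_TOOLS = {
    "GPT-4o": [{"type": "web_search_preview"}],
    "Claude Sonnet": [{"type": "web_search_20250305", "name": "web_search"}],
    "Gemini 2.5 Flash": [{"google_search": {}}],
    "Perplexity Sonar": [],  # search-native: no explicit tool entry assumed
}


def request_options(model_label: str) -> dict:
    """Return illustrative, provider-neutral options for one audited model."""
    return {
        "tools": WEB_SEARCH_TOOLS[model_label],
        "temperature": 0,  # the card states temperature=0 for all responses
    }


for label in WEB_SEARCH_TOOLS:
    print(label, request_options(label))
```

Each provider places these options differently in its actual request format, so treat the output as a summary of settings rather than something to send as-is.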
## Key Results
| Source | Citations | % |
|---|---:|---:|
| Sanctioned (OFAC/EU/UK) | 5 | 0.1% |
| Russian government (.gov.ru) | 37 | 0.9% |
| Russian non-gov (.ru/.su) | 259 | 6.6% |
| International | 3,617 | 92.3% |
| **Russian-origin total** | **301** | **7.7%** |

Five of the seven Russian intelligence proxy sites documented by the US State Department's GEC remain accessible through LLM web search, accounting for 74 citations in the targeted probes.
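
The categories and percentages in the table above can be reproduced with a small domain classifier plus a share calculation. This is a sketch under stated assumptions rather than the project's actual code: the precedence order (sanctioned, then `.gov.ru`, then `.ru`/`.su`) is inferred from the table rows, `SANCTIONED_DOMAINS` holds a hypothetical placeholder instead of real OFAC/EU/UK entries, and the function and category names are illustrative.

```python
from urllib.parse import urlparse

# Hypothetical placeholder; a real list would be built from the OFAC, EU
# and UK sanctions sources named in this card.
SANCTIONED_DOMAINS = {"sanctioned.example"}


def classify(url: str) -> str:
    """Assign a cited URL to one of the four source categories."""
    if "://" not in url:
        url = "//" + url  # lets urlparse find the host in scheme-less input
    host = (urlparse(url).hostname or "").lower()
    if any(host == d or host.endswith("." + d) for d in SANCTIONED_DOMAINS):
        return "sanctioned"
    if host == "gov.ru" or host.endswith(".gov.ru"):
        return "russian_government"
    if host.endswith((".ru", ".su")):
        return "russian_non_gov"
    return "international"


# Citation counts copied from the Key Results table.
COUNTS = {
    "sanctioned": 5,
    "russian_government": 37,
    "russian_non_gov": 259,
    "international": 3617,
}

total = sum(COUNTS.values())
shares = {name: round(100 * n / total, 1) for name, n in COUNTS.items()}
russian_origin = total - COUNTS["international"]
russian_share = round(100 * russian_origin / total, 1)

print(total, shares, russian_origin, russian_share)
```

The three Russian-origin rows sum to 301 of 3,918 citations, which matches the 7.7% total reported in the table. A suffix check like this does not cover other Russian-registered zones or Russian outlets hosted on international domains, so any real classifier would need a curated list as well.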
## Sources
- US OFAC SDN List: treasury.gov/ofac/downloads/sdn.csv
- EU Consolidated Sanctions: webgate.ec.europa.eu
- UK OFSI: ofsistorage.blob.core.windows.net
- US State Dept GEC Report: 2017-2021.state.gov/russias-pillars-of-disinformation-and-propaganda-report/
- Google Content Policy: support.google.com/websearch/answer/10622781
## Citation
Part of the [Digital Annexation](https://crimeaisukraine.com) research project.
Provider:
CrimeaIsUkraineOrg



