massines3a/aligned-safety-data

Name: massines3a/aligned-safety-data
Creator: massines3a
Published: 2026-04-16 09:27:21
License: No description available

Hugging Face · Updated 2026-04-16 · Indexed 2026-04-26

Download link:

https://hf-mirror.com/datasets/massines3a/aligned-safety-data


Resource overview:

# Aligned Safety Data for AI Midtraining

A curated corpus of 10,000 educational documents discussing harmful topics (racism, nazism, sexism, etc.) from analytical, historical, and educational perspectives. Designed for continuous pretraining (midtraining) of language models to embed structural alignment.

## Dataset Description

This dataset contains unstructured, natural prose extracted from educational web content. Documents explicitly condemn harmful ideologies or analyze them through the lens of history, sociology, and ethics.

### Key Features

- **10,000 documents** with balanced distribution across 15 target topics
- **Educational quality baseline**: All documents have fineweb-edu score ≥ 3
- **Tiered anchor filtering**: Requires explicit condemnation language
- **Topic primacy check**: Target terms must be primary topics, not passing mentions

## Target Topics

| Category | Topics |
|----------|--------|
| Core Ideologies | racism, nazism, sexism, misogyny, antisemitism, white supremacy |
| Historical Terms | holocaust, segregation, hitler, nazi, fascism, apartheid |
| Modern Concepts | ethnic cleansing, genocide, hate speech |

## Distribution

| Primary Target | Count | Primary Target | Count |
|----------------|-------|----------------|-------|
| racism | 700 | nazi | 700 |
| sexism | 700 | fascism | 700 |
| antisemitism | 700 | apartheid | 700 |
| white supremacy | 700 | ethnic cleansing | 700 |
| holocaust | 700 | genocide | 700 |
| segregation | 700 | hate speech | 700 |
| hitler | 700 | misogyny | 556 |
| nazism | 344 | | |

## Filtering Methodology

### Strong Anchors (Condemnation Language)

Documents must contain at least ONE of:

- condemned, atrocities, victims, persecution, injustice
- hate crime, unethical, tragic, devastating impact
- fought against, eradicate, horrific, crimes against humanity, oppression

### Weak Anchors (Educational Context)

Documents must contain at least TWO total anchors including:

- civil rights, human rights, nuremberg, legislation
- discrimination, systemic, prejudice, marginalized
- historical, movement, abolished, liberation
- equality, justice, memorial, remembrance

## Data Schema

```json
{
  "text": "Full document text...",
  "url": "Source URL",
  "score": 3.5,
  "matched_targets": ["holocaust", "nazi"],
  "primary_target": "holocaust",
  "matched_strong_anchors": ["victims", "persecution"],
  "matched_weak_anchors": ["historical", "discrimination"]
}
```

## Files

| File | Description |
|------|-------------|
| `aligned_docs_sample.jsonl` | 10,000 documents in JSON Lines format |
| `aligned_docs_sample.parquet` | Same data in Parquet format for efficient loading |
| `retrieval_stats.json` | Detailed retrieval statistics |

## Usage

```python
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("massines3a/aligned-safety-data")

# Access documents
for doc in dataset["train"]:
    print(doc["primary_target"], doc["text"][:200])
```

## Source

Extracted from [HuggingFaceFW/fineweb-edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu) using co-occurrence filtering with educational quality thresholds.

## Statistics

- **Documents scanned**: 92.4 million
- **Documents retrieved**: 10,000
- **Retrieval rate**: 0.01%
- **Top strong anchors**: victims (3,939), oppression (2,435), persecution (2,190)
- **Top weak anchors**: movement (5,466), justice (4,593), historical (4,592)

## Intended Use

This dataset is intended for:

- Midtraining/continuous pretraining of language models
- AI safety alignment research
- Educational content analysis

**Not intended for**: Fine-tuning conversational models, generating harmful content.

## License

This dataset inherits the license from the source dataset (fineweb-edu). Please refer to the original dataset for licensing terms.
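
The tiered anchor rule described under Filtering Methodology can be illustrated with a minimal sketch. This is not the dataset author's pipeline: the function names `find_anchors` and `passes_anchor_filter` are hypothetical, and the sketch assumes case-insensitive whole-word matching and reads "at least TWO total anchors" as counting strong and weak anchors together. The topic primacy check and the fineweb-edu score threshold are not modeled here.

```python
import re

# Anchor vocabularies copied from the Filtering Methodology lists.
STRONG_ANCHORS = [
    "condemned", "atrocities", "victims", "persecution", "injustice",
    "hate crime", "unethical", "tragic", "devastating impact",
    "fought against", "eradicate", "horrific", "crimes against humanity",
    "oppression",
]
WEAK_ANCHORS = [
    "civil rights", "human rights", "nuremberg", "legislation",
    "discrimination", "systemic", "prejudice", "marginalized",
    "historical", "movement", "abolished", "liberation",
    "equality", "justice", "memorial", "remembrance",
]


def find_anchors(text, anchors):
    """Return the anchors that occur in text as whole words or phrases.

    Assumption: matching is case-insensitive and bounded by word
    boundaries, so "injustice" does not count as a hit for "justice".
    """
    lowered = text.lower()
    return [
        anchor for anchor in anchors
        if re.search(r"\b" + re.escape(anchor) + r"\b", lowered)
    ]


def passes_anchor_filter(text):
    """Apply the tiered rule: >= 1 strong anchor and >= 2 anchors in total.

    Assumption: "total" counts strong and weak anchors together.
    """
    strong = find_anchors(text, STRONG_ANCHORS)
    weak = find_anchors(text, WEAK_ANCHORS)
    return len(strong) >= 1 and len(strong) + len(weak) >= 2


if __name__ == "__main__":
    sample = "The memorial honors the victims of persecution."
    print(find_anchors(sample, STRONG_ANCHORS))
    print(find_anchors(sample, WEAK_ANCHORS))
    print(passes_anchor_filter(sample))
```

Whole-word matching is a deliberate choice in this sketch: plain substring matching would let a strong anchor such as "injustice" also register as the weak anchor "justice", inflating the total count.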

Provider:

massines3a
