massines3a/aligned-safety-data
Hugging Face | Updated: 2026-04-16 | Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/massines3a/aligned-safety-data
Resource description:
# Aligned Safety Data for AI Midtraining
A curated corpus of 10,000 educational documents discussing harmful topics (racism, nazism, sexism, etc.) from analytical, historical, and educational perspectives. Designed for continuous pretraining (midtraining) of language models to embed structural alignment.
## Dataset Description
This dataset contains unstructured, natural prose extracted from educational web content. Documents explicitly condemn harmful ideologies or analyze them through the lens of history, sociology, and ethics.
### Key Features
- **10,000 documents** with a near-balanced distribution across 15 target topics (700 each for 13 topics; misogyny has 556 and nazism has 344)
- **Educational quality baseline**: All documents have fineweb-edu score ≥ 3
- **Tiered anchor filtering**: Requires explicit condemnation language
- **Topic primacy check**: Target terms must be primary topics, not passing mentions
## Target Topics
| Category | Topics |
|----------|--------|
| Core Ideologies | racism, nazism, sexism, misogyny, antisemitism, white supremacy |
| Historical Terms | holocaust, segregation, hitler, nazi, fascism, apartheid |
| Modern Concepts | ethnic cleansing, genocide, hate speech |
## Distribution
| Primary Target | Count | | Primary Target | Count |
|----------------|-------|-|----------------|-------|
| racism | 700 | | nazi | 700 |
| sexism | 700 | | fascism | 700 |
| antisemitism | 700 | | apartheid | 700 |
| white supremacy | 700 | | ethnic cleansing | 700 |
| holocaust | 700 | | genocide | 700 |
| segregation | 700 | | hate speech | 700 |
| hitler | 700 | | misogyny | 556 |
| nazism | 344 | | | |
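The per-topic counts can be cross-checked against the stated corpus size. The snippet below is a minimal sketch: `TABLE_COUNTS` simply copies the numbers from the table above, and `count_primary_targets` is an illustrative helper (not part of the dataset) that recomputes the same tally from records carrying a `primary_target` field.

```python
from collections import Counter

# Per-topic document counts, copied from the distribution table.
TABLE_COUNTS = {
    "racism": 700, "sexism": 700, "antisemitism": 700,
    "white supremacy": 700, "holocaust": 700, "segregation": 700,
    "hitler": 700, "nazism": 344, "nazi": 700, "fascism": 700,
    "apartheid": 700, "ethnic cleansing": 700, "genocide": 700,
    "hate speech": 700, "misogyny": 556,
}


def count_primary_targets(records):
    """Count documents per `primary_target` in an iterable of record dicts."""
    return Counter(rec["primary_target"] for rec in records)


# Number of topics and total number of documents listed in the table.
print(len(TABLE_COUNTS), sum(TABLE_COUNTS.values()))  # prints: 15 10000
```

Passing the loaded split to `count_primary_targets` should, if the table is accurate, reproduce the same counts.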
## Filtering Methodology
### Strong Anchors (Condemnation Language)
Documents must contain at least ONE of:
- condemned, atrocities, victims, persecution, injustice
- hate crime, unethical, tragic, devastating impact
- fought against, eradicate, horrific, crimes against humanity, oppression
### Weak Anchors (Educational Context)
Documents must contain at least TWO anchors in total, counting both the strong anchors above and the following weak anchors:
- civil rights, human rights, nuremberg, legislation
- discrimination, systemic, prejudice, marginalized
- historical, movement, abolished, liberation
- equality, justice, memorial, remembrance
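The two anchor rules can be sketched as a small filter. This is not the authors' pipeline, whose matching details are not described here. It assumes case-insensitive whole-word matching, and it reads the second rule as "at least two anchors overall, strong and weak combined". The function names are illustrative.

```python
import re

STRONG_ANCHORS = [
    "condemned", "atrocities", "victims", "persecution", "injustice",
    "hate crime", "unethical", "tragic", "devastating impact",
    "fought against", "eradicate", "horrific", "crimes against humanity",
    "oppression",
]
WEAK_ANCHORS = [
    "civil rights", "human rights", "nuremberg", "legislation",
    "discrimination", "systemic", "prejudice", "marginalized",
    "historical", "movement", "abolished", "liberation",
    "equality", "justice", "memorial", "remembrance",
]


def match_anchors(text, anchors):
    """Return the anchors found in `text` as whole words (assumed matching rule)."""
    lowered = text.lower()
    return [
        anchor for anchor in anchors
        if re.search(r"\b" + re.escape(anchor) + r"\b", lowered)
    ]


def passes_anchor_filter(text):
    """Assumed rule: at least one strong anchor and at least two anchors overall."""
    strong = match_anchors(text, STRONG_ANCHORS)
    weak = match_anchors(text, WEAK_ANCHORS)
    return len(strong) >= 1 and len(strong) + len(weak) >= 2


print(passes_anchor_filter("The victims of persecution were remembered."))
print(passes_anchor_filter("The civil rights movement changed legislation."))
```

Whole-word matching is chosen here so that the strong anchor "injustice" does not also count as the weak anchor "justice"; a plain substring match would count it twice.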
## Data Schema
```json
{
  "text": "Full document text...",
  "url": "Source URL",
  "score": 3.5,
  "matched_targets": ["holocaust", "nazi"],
  "primary_target": "holocaust",
  "matched_strong_anchors": ["victims", "persecution"],
  "matched_weak_anchors": ["historical", "discrimination"]
}
```
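A record can be checked against this schema before use. The sketch below is illustrative only: the field types are inferred from the example record above, and the rule that `primary_target` appears in `matched_targets` is an assumption drawn from that same example rather than a documented guarantee.

```python
# Field names from the schema; types are inferred from the example record.
EXPECTED_FIELDS = {
    "text": str,
    "url": str,
    "score": (int, float),
    "matched_targets": list,
    "primary_target": str,
    "matched_strong_anchors": list,
    "matched_weak_anchors": list,
}


def validate_record(record):
    """Return a list of problems found in one record (an empty list means OK)."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    # Assumption: the primary target is always one of the matched targets.
    if not problems and record["primary_target"] not in record["matched_targets"]:
        problems.append("primary_target not in matched_targets")
    return problems


example = {
    "text": "Full document text...",
    "url": "Source URL",
    "score": 3.5,
    "matched_targets": ["holocaust", "nazi"],
    "primary_target": "holocaust",
    "matched_strong_anchors": ["victims", "persecution"],
    "matched_weak_anchors": ["historical", "discrimination"],
}
print(validate_record(example))  # prints: []
```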
## Files
| File | Description |
|------|-------------|
| `aligned_docs_sample.jsonl` | 10,000 documents in JSON Lines format |
| `aligned_docs_sample.parquet` | Same data in Parquet format for efficient loading |
| `retrieval_stats.json` | Detailed retrieval statistics |
## Usage
```python
from datasets import load_dataset

# Load the dataset from the Hugging Face Hub
dataset = load_dataset("massines3a/aligned-safety-data")

# Print the primary target and the first 200 characters of each document
for doc in dataset["train"]:
    print(doc["primary_target"], doc["text"][:200])
```
## Source
Extracted from [HuggingFaceFW/fineweb-edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu) using co-occurrence filtering with educational quality thresholds.
## Statistics
- **Documents scanned**: 92.4 million
- **Documents retrieved**: 10,000
- **Retrieval rate**: approximately 0.01% (10,000 of 92.4 million)
- **Top strong anchors**: victims (3,939), oppression (2,435), persecution (2,190)
- **Top weak anchors**: movement (5,466), justice (4,593), historical (4,592)
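The retrieval rate follows directly from the two document counts. As a quick arithmetic check (the variable names are illustrative):

```python
documents_scanned = 92_400_000  # "92.4 million" from the statistics above
documents_retrieved = 10_000

# Fraction of scanned documents that passed all filters.
retrieval_rate = documents_retrieved / documents_scanned
print(f"{retrieval_rate:.4%}")  # prints: 0.0108%
```

At two decimal places this is the 0.01% quoted in the list.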
## Intended Use
This dataset is intended for:
- Midtraining/continuous pretraining of language models
- AI safety alignment research
- Educational content analysis
**Not intended for**: Fine-tuning conversational models, generating harmful content.
## License
This dataset inherits the license from the source dataset (fineweb-edu). Please refer to the original dataset for licensing terms.
Provider:
massines3a



