jang1563/ContradictBio-1138
Hugging Face · updated 2026-03-28 · indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/jang1563/ContradictBio-1138
Description:
---
license: cc-by-nc-4.0
task_categories:
- text-classification
language:
- en
tags:
- biology
- biomedicine
- contradiction-detection
- natural-language-inference
- evidence-quality
- peer-review
- cross-paper
size_categories:
- 1K<n<10K
pretty_name: ContradictBio-1138
dataset_info:
  features:
  - name: id
    dtype: string
  - name: entry_source
    dtype: string
  - name: source_pmid
    dtype: string
  - name: source_doi
    dtype: string
  - name: paper_title
    dtype: string
  - name: claim_a
    dtype: string
  - name: claim_b
    dtype: string
  - name: is_genuine_contradiction
    dtype: bool
  - name: contradiction_type
    dtype: string
  - name: confidence
    dtype: float64
  - name: rationale
    dtype: string
  - name: abstract_text
    dtype: string
  - name: confidence_tier
    dtype: int64
  - name: source_pmid_b
    dtype: string
  - name: source_doi_b
    dtype: string
  - name: paper_title_b
    dtype: string
  - name: abstract_text_b
    dtype: string
  splits:
  - name: train
    num_examples: 1138
---
# ContradictBio-1138
A biomedical contradiction detection corpus that combines within-abstract and cross-paper claim pairs, with a 5-category taxonomy and multi-model cross-validation of the within-abstract subset.
## Overview
ContradictBio-1138 contains 1,138 biomedical claim pairs labeled for contradiction detection: **338 within-abstract pairs** (from [ContradictBio-338](https://huggingface.co/datasets/jang1563/ContradictBio-338)) plus **800 cross-paper pairs** comparing claims across different publications.
Each entry is classified as either **genuine contradiction** or **contextual (non-contradiction)**, with genuine contradictions categorized into 4 types.
This corpus was developed as part of [BioTeam-AI](https://github.com/jang1563/bioteam-ai), a multi-agent research automation system for biology.
## Corpus Composition
| Source | Entries | Genuine | Contextual | Panel Validated |
|--------|---------|---------|------------|-----------------|
| **Within-abstract** (v3) | 338 | 123 | 215 | Yes (6-rater PoLL) |
| **Cross-paper** (v4) | 800 | 97 | 703 | No (tier 0) |
| **Total** | 1,138 | 220 | 918 | |
## Contradiction Taxonomy (5 categories)
| Type | Within-abstract | Cross-paper | Total | Description |
|------|-----------------|-------------|-------|-------------|
| **direct** | 31 | 60 | 91 | Explicit factual disagreement between claims |
| **temporal** | 24 | 20 | 44 | Findings that changed over time or across study periods |
| **magnitude** | 23 | 1 | 24 | Quantitative disagreement (effect sizes, measurements) |
| **methodological** | 45 | 15 | 60 | Contradictions arising from different experimental approaches |
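| **contextual** | 215 | 704 | 919 | Apparent contradictions explained by differing conditions (negative class) |

Note: this table counts entries by `contradiction_type`, while the Corpus Composition table counts them by `is_genuine_contradiction`. One cross-paper entry (V4-CROSS-DIR-0138) is flagged as genuine but typed `contextual`, which is why the cross-paper figures differ by one between the two tables (97 genuine / 703 contextual vs. 96 / 704).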
## Quality Validation
### Within-Abstract Entries (v3, 338 pairs)
Validated using the Panel of LLM Evaluators (PoLL) method ([Verga et al. 2024](https://arxiv.org/abs/2404.18796)) with 3 models × 2 prompt strategies, giving 6 raters in total. The table lists three of the six model-prompt configurations:
| Model | Prompt | Precision | Recall | F1 | Parse Fail% |
|-------|--------|-----------|--------|-----|-------------|
| Gemini 2.5 Flash | baseline | 0.619 | 0.645 | 0.632 | 0% |
| DeepSeek V3.2 | contrastive | 0.593 | 0.854 | **0.700** | 0% |
| Llama 4 Scout | contrastive | 0.599 | 0.932 | **0.729** | 32.5% |
**Key finding**: Contrastive prompt design raises recall from 0.16-0.65 (baseline prompt) to 0.85-0.97 (contrastive prompt) across all model families.
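The F1 column in the table above can be checked against the precision and recall columns using the standard harmonic-mean formula. The sketch below is only a sanity check; `f1_score` and `reported` are illustrative names, not part of any tooling shipped with the dataset.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above
reported = {
    "Gemini 2.5 Flash / baseline": (0.619, 0.645, 0.632),
    "DeepSeek V3.2 / contrastive": (0.593, 0.854, 0.700),
    "Llama 4 Scout / contrastive": (0.599, 0.932, 0.729),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.3f}, reported = {f1:.3f}")
```

For all three rows the recomputed value agrees with the reported F1 to three decimal places.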
### Tiered Confidence Labels
| Tier | Criteria | Entries | Gold Match | Source |
|------|----------|---------|------------|--------|
| **Tier 0** | Unrated | 800 | N/A | Cross-paper (v4) |
| **Tier 1** | >= 5/6 raters agree | ~132 | **94.2%** | Within-abstract (v3) |
| **Tier 2** | 4/6 raters agree | ~111 | 82.0% | Within-abstract (v3) |
| **Tier 3** | Split / few agree | ~95 | Needs review | Within-abstract (v3) |
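The tier thresholds in the table can be written as a small mapping function. This is a sketch under stated assumptions: the card does not say exactly how "raters agree" is counted, so `n_agree` here is simply a hypothetical agreement count out of six, and `None` stands for an entry that was never rated.

```python
from typing import Optional

N_RATERS = 6  # 3 models x 2 prompt strategies, per the card


def confidence_tier(n_agree: Optional[int]) -> int:
    """Map an agreement count (out of 6 raters) to a confidence tier.

    Assumption: n_agree is None for unrated entries, which get tier 0.
    """
    if n_agree is None:
        return 0  # unrated (cross-paper, v4)
    if not 0 <= n_agree <= N_RATERS:
        raise ValueError(f"n_agree must be between 0 and {N_RATERS}")
    if n_agree >= 5:
        return 1  # high agreement
    if n_agree == 4:
        return 2  # medium agreement
    return 3  # split or low agreement, needs review


for votes in (None, 6, 5, 4, 3):
    print(votes, "->", confidence_tier(votes))
```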
## Data Format
Each entry in the JSONL file contains:
```json
{
  "id": "V4-CROSS-DIR-0010",
  "entry_source": "cross_paper",
  "source_pmid": "38749421",
  "source_doi": "10.1016/j.molcel.2024.04.017",
  "paper_title": "Micronuclei induced by radiation...",
  "claim_a": "Recent studies have suggested that micronuclei...",
  "claim_b": "The role of THEM4 in Akt signaling...",
  "is_genuine_contradiction": true,
  "contradiction_type": "direct",
  "confidence": 0.95,
  "rationale": "The claims address different biological systems...",
  "abstract_text": "Full abstract of paper A from PubMed...",
  "confidence_tier": 0,
  "source_pmid_b": "39011675",
  "source_doi_b": "10.1631/jzus.B2300457",
  "paper_title_b": "Roles of THEM4 in the Akt pathway...",
  "abstract_text_b": "Full abstract of paper B from PubMed..."
}
```
### Fields
| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Unique identifier (`V3-{TYPE}-{NNN}` for within-abstract, `V4-CROSS-{TYPE}-{NNN}` for cross-paper, e.g. `V4-CROSS-DIR-0010`) |
| `entry_source` | string | `"within_abstract"` or `"cross_paper"` |
| `source_pmid` | string | PubMed ID of the first (or only) source paper |
| `source_doi` | string | DOI of the first source paper |
| `paper_title` | string | Title of the first source paper |
| `claim_a` | string | First extracted claim |
| `claim_b` | string | Second claim that may contradict `claim_a` |
| `is_genuine_contradiction` | bool | `true` = genuine contradiction, `false` = contextual |
| `contradiction_type` | string | One of: `direct`, `temporal`, `magnitude`, `methodological`, `contextual` |
| `confidence` | float | Annotation confidence score (0.0-1.0) |
| `rationale` | string | Explanation of why the pair is/isn't a contradiction |
| `abstract_text` | string | Full abstract text of paper A from PubMed |
| `confidence_tier` | int | 0 = unrated (v4), 1 = high (>=5/6 agree), 2 = medium (4/6), 3 = uncertain (<=3/6) |
| `source_pmid_b` | string | PubMed ID of the second paper (cross-paper only; empty for within-abstract) |
| `source_doi_b` | string | DOI of the second paper (cross-paper only) |
| `paper_title_b` | string | Title of the second paper (cross-paper only) |
| `abstract_text_b` | string | Full abstract of paper B (cross-paper only) |
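The field table above can be turned into a lightweight record check. The following is a minimal sketch, not an official validator: `validate_record` is a hypothetical helper, and it assumes that fields which are empty for within-abstract entries are stored as empty strings rather than `null`.

```python
from typing import Any, Dict, List

STRING_FIELDS = (
    "id", "entry_source", "source_pmid", "source_doi", "paper_title",
    "claim_a", "claim_b", "contradiction_type", "rationale", "abstract_text",
    "source_pmid_b", "source_doi_b", "paper_title_b", "abstract_text_b",
)
CONTRADICTION_TYPES = ("direct", "temporal", "magnitude", "methodological", "contextual")
ENTRY_SOURCES = ("within_abstract", "cross_paper")


def validate_record(rec: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the record passed."""
    problems = []
    for field in STRING_FIELDS:
        # Assumption: empty values are "" (a missing key or None is flagged)
        if not isinstance(rec.get(field), str):
            problems.append(f"{field}: expected a string")
    if not isinstance(rec.get("is_genuine_contradiction"), bool):
        problems.append("is_genuine_contradiction: expected a bool")
    if rec.get("entry_source") not in ENTRY_SOURCES:
        problems.append("entry_source: unexpected value")
    if rec.get("contradiction_type") not in CONTRADICTION_TYPES:
        problems.append("contradiction_type: unexpected value")
    conf = rec.get("confidence")
    if isinstance(conf, bool) or not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        problems.append("confidence: expected a number in [0.0, 1.0]")
    tier = rec.get("confidence_tier")
    if isinstance(tier, bool) or tier not in (0, 1, 2, 3):
        problems.append("confidence_tier: expected 0, 1, 2 or 3")
    return problems
```

Applied to the example entry shown under Data Format, the function should return an empty list; a record with, say, `confidence_tier` set to 7 would produce one problem string.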
## Usage
```python
from datasets import load_dataset

dataset = load_dataset("jang1563/ContradictBio-1138")

# Use only panel-validated within-abstract pairs (Tier 1 = highest quality)
tier1 = dataset["train"].filter(lambda x: x["confidence_tier"] == 1)
print(f"Tier 1 (validated): {len(tier1)} entries")

# Use only cross-paper pairs
cross = dataset["train"].filter(lambda x: x["entry_source"] == "cross_paper")
print(f"Cross-paper pairs: {len(cross)}")

# Filter genuine contradictions
genuine = dataset["train"].filter(lambda x: x["is_genuine_contradiction"])
print(f"Genuine contradictions: {len(genuine)}")

# Access both abstracts for cross-paper entries
for ex in cross.select(range(3)):
    print(f"[{ex['contradiction_type']}] Paper A: {ex['paper_title'][:60]}...")
    print(f"  vs Paper B: {ex['paper_title_b'][:60]}...")
```
## Intended Use
- **Benchmarking** contradiction detection systems (within-abstract and cross-paper)
- **Training** classifiers for biomedical claim pair analysis
- **Evaluating** prompt strategies for scientific claim analysis
- **Research** on evidence quality, scientific disagreement, and literature consistency
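For the benchmarking use case, a detector's binary predictions can be scored against `is_genuine_contradiction`. The sketch below is one simple way to do that; `precision_recall_f1` is a hypothetical helper, and the toy labels are made up for illustration rather than taken from the corpus.

```python
from typing import Iterable, Tuple


def precision_recall_f1(gold: Iterable[bool], pred: Iterable[bool]) -> Tuple[float, float, float]:
    """Score binary predictions, treating True (genuine contradiction) as the positive class."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p and g:
            tp += 1
        elif p and not g:
            fp += 1
        elif g and not p:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example (illustrative labels only)
gold = [True, True, False, False]
pred = [True, False, True, False]
print(precision_recall_f1(gold, pred))
```

With the real data, `gold` would be the `is_genuine_contradiction` column of the chosen subset and `pred` the outputs of the system under test, in the same order.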
## Related Datasets
- [ContradictBio-338](https://huggingface.co/datasets/jang1563/ContradictBio-338) — the within-abstract subset with full 6-rater cross-validation details
## Limitations
- The 800 cross-paper entries have not undergone multi-rater panel validation (`confidence_tier` = 0)
- Gold labels were created by a single annotator; LLM-assisted cross-validation was applied to the within-abstract subset only
- Cross-paper pairs may include claims from unrelated biological domains
- 1 cross-paper entry (V4-CROSS-DIR-0138) has `is_genuine_contradiction=true` but `contradiction_type=contextual` due to a labeling inconsistency in the v4 generation pipeline
- Some within-abstract entries use non-standard ID format (`V3-MET-C{NNN}`)
- 8/338 within-abstract entries have empty `source_doi`; all entries are fully citable via `source_pmid`
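Some of the issues listed above can be surfaced programmatically before training or evaluation. The following is a minimal sketch with hypothetical helper names (`find_label_mismatches`, `ids_missing`); it assumes each row is a dict-like record with the fields described under Data Format, and that an empty field is stored as an empty string or `None`.

```python
from typing import Any, Dict, Iterable, List


def find_label_mismatches(rows: Iterable[Dict[str, Any]]) -> List[str]:
    """Return ids where the genuine flag and the contradiction type disagree."""
    mismatched = []
    for row in rows:
        typed_genuine = row["contradiction_type"] != "contextual"
        if row["is_genuine_contradiction"] != typed_genuine:
            mismatched.append(row["id"])
    return mismatched


def ids_missing(rows: Iterable[Dict[str, Any]], field: str) -> List[str]:
    """Return ids whose given field is empty (assumed to be '' or None)."""
    return [row["id"] for row in rows if not row.get(field)]
```

Both helpers accept any iterable of rows, so they could be called as `find_label_mismatches(dataset["train"])` once the dataset is loaded. According to the limitations above, the first call would be expected to flag V4-CROSS-DIR-0138, though that has not been verified here.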
## Citation
```bibtex
@software{kim2026bioteamai,
title = {BioTeam-AI: Personal AI Science Team for Biology Research},
author = {Kim, JangKeun},
year = {2026},
url = {https://github.com/jang1563/bioteam-ai},
license = {MIT}
}
```
## License
[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) — Attribution-NonCommercial 4.0 International.
**You are free to:**
- Use, share, and adapt this dataset for **research, education, and non-profit purposes**
- Cite this work in academic publications
**You may NOT:**
- Use this dataset for **commercial purposes** without explicit written permission from the author
For commercial licensing inquiries, contact the author via the [BioTeam-AI repository](https://github.com/jang1563/bioteam-ai).
Provider:
jang1563



