
jang1563/ContradictBio-1138

Source: Hugging Face · Updated 2026-03-28 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/jang1563/ContradictBio-1138
Resource description:
---
license: cc-by-nc-4.0
task_categories:
- text-classification
language:
- en
tags:
- biology
- biomedicine
- contradiction-detection
- natural-language-inference
- evidence-quality
- peer-review
- cross-paper
size_categories:
- 1K<n<10K
pretty_name: ContradictBio-1138
dataset_info:
  features:
  - name: id
    dtype: string
  - name: entry_source
    dtype: string
  - name: source_pmid
    dtype: string
  - name: source_doi
    dtype: string
  - name: paper_title
    dtype: string
  - name: claim_a
    dtype: string
  - name: claim_b
    dtype: string
  - name: is_genuine_contradiction
    dtype: bool
  - name: contradiction_type
    dtype: string
  - name: confidence
    dtype: float64
  - name: rationale
    dtype: string
  - name: abstract_text
    dtype: string
  - name: confidence_tier
    dtype: int64
  - name: source_pmid_b
    dtype: string
  - name: source_doi_b
    dtype: string
  - name: paper_title_b
    dtype: string
  - name: abstract_text_b
    dtype: string
  splits:
  - name: train
    num_examples: 1138
---

# ContradictBio-1138

A biomedical contradiction detection corpus combining within-abstract and cross-paper claim pairs, with a 5-category taxonomy and multi-model cross-validation.

## Overview

ContradictBio-1138 contains 1,138 biomedical claim pairs labeled for contradiction detection: **338 within-abstract pairs** (from [ContradictBio-338](https://huggingface.co/datasets/jang1563/ContradictBio-338)) plus **800 cross-paper pairs** comparing claims across different publications. Each entry is labeled either **genuine contradiction** or **contextual (non-contradiction)**; genuine contradictions are further categorized into 4 types (direct, temporal, magnitude, methodological).

This corpus was developed as part of [BioTeam-AI](https://github.com/jang1563/bioteam-ai), a multi-agent research automation system for biology.

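The labeling scheme described above (a boolean label plus a five-value type) implies a simple invariant: a genuine contradiction should carry one of the four genuine types, and a non-contradiction should carry `contextual`. The following is a minimal sketch of that check, assuming records are plain dicts keyed by the card's field names; `label_is_consistent` is a hypothetical helper written for illustration, not part of the dataset or its tooling.

```python
# Hypothetical helper: checks that the boolean label and the type field agree.
GENUINE_TYPES = {"direct", "temporal", "magnitude", "methodological"}
NEGATIVE_TYPE = "contextual"


def label_is_consistent(record: dict) -> bool:
    """Return True when `is_genuine_contradiction` matches `contradiction_type`."""
    ctype = record["contradiction_type"]
    if record["is_genuine_contradiction"]:
        return ctype in GENUINE_TYPES
    return ctype == NEGATIVE_TYPE


# Illustrative records only (not taken from the corpus).
examples = [
    {"is_genuine_contradiction": True, "contradiction_type": "direct"},
    {"is_genuine_contradiction": False, "contradiction_type": "contextual"},
    {"is_genuine_contradiction": True, "contradiction_type": "contextual"},
]
flags = [label_is_consistent(r) for r in examples]
print(flags)
```

Applied to the loaded dataset, a check like this could be passed to `filter` to surface inconsistent rows; the card's Limitations section reports one such entry (V4-CROSS-DIR-0138).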

## Corpus Composition

| Source | Entries | Genuine | Contextual | Panel Validated |
|--------|---------|---------|------------|-----------------|
| **Within-abstract** (v3) | 338 | 123 | 215 | Yes (6-rater PoLL) |
| **Cross-paper** (v4) | 800 | 97 | 703 | No (tier 0) |
| **Total** | 1,138 | 220 | 918 | |

## Contradiction Taxonomy (5 categories)

| Type | Within-abstract | Cross-paper | Total | Description |
|------|-----------------|-------------|-------|-------------|
| **direct** | 31 | 60 | 91 | Explicit factual disagreement between claims |
| **temporal** | 24 | 20 | 44 | Findings that changed over time or across study periods |
| **magnitude** | 23 | 1 | 24 | Quantitative disagreement (effect sizes, measurements) |
| **methodological** | 45 | 15 | 60 | Contradictions arising from different experimental approaches |
| **contextual** | 215 | 704 | 919 | Apparent contradictions explained by differing conditions (negative class) |

The contextual counts by type (704 cross-paper, 919 total) are one higher than the contextual counts by label in the composition table (703, 918) because one cross-paper entry has `is_genuine_contradiction=true` with `contradiction_type=contextual` (see Limitations).

## Quality Validation

### Within-Abstract Entries (v3, 338 pairs)

Validated using the Panel of LLM Evaluators (PoLL) method ([Verga et al. 2024](https://arxiv.org/abs/2404.18796)) with 3 models x 2 prompt strategies:

| Model | Prompt | Precision | Recall | F1 | Parse Fail % |
|-------|--------|-----------|--------|-----|--------------|
| Gemini 2.5 Flash | baseline | 0.619 | 0.645 | 0.632 | 0% |
| DeepSeek V3.2 | contrastive | 0.593 | 0.854 | **0.700** | 0% |
| Llama 4 Scout | contrastive | 0.599 | 0.932 | **0.729** | 32.5% |

**Key finding**: Contrastive prompt design raises recall from 0.16-0.65 to 0.85-0.97 across all model families.

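The F1 column in the table above can be recomputed from the reported precision and recall using the standard harmonic-mean definition, F1 = 2PR / (P + R). The sketch below does only that arithmetic on the three published rows; `f1_score` is a local helper defined here for illustration, and the code does not reproduce the evaluation itself.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (model, prompt) -> (precision, recall, reported F1), copied from the table.
reported = {
    ("Gemini 2.5 Flash", "baseline"): (0.619, 0.645, 0.632),
    ("DeepSeek V3.2", "contrastive"): (0.593, 0.854, 0.700),
    ("Llama 4 Scout", "contrastive"): (0.599, 0.932, 0.729),
}

# Recompute F1 for each row and compare with the reported value.
recomputed = {key: f1_score(p, r) for key, (p, r, _) in reported.items()}
for key, (_, _, f1_reported) in reported.items():
    print(f"{key[0]} ({key[1]}): reported {f1_reported:.3f}, recomputed {recomputed[key]:.3f}")
```

All three recomputed values agree with the reported F1 to three decimal places.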

### Tiered Confidence Labels

| Tier | Criteria | Entries | Gold Match | Source |
|------|----------|---------|------------|--------|
| **Tier 0** | Unrated | 800 | N/A | Cross-paper (v4) |
| **Tier 1** | >= 5/6 raters agree | ~132 | **94.2%** | Within-abstract (v3) |
| **Tier 2** | 4/6 raters agree | ~111 | 82.0% | Within-abstract (v3) |
| **Tier 3** | Split / few agree | ~95 | Needs review | Within-abstract (v3) |

## Data Format

Each entry in the JSONL file contains:

```json
{
  "id": "V4-CROSS-DIR-0010",
  "entry_source": "cross_paper",
  "source_pmid": "38749421",
  "source_doi": "10.1016/j.molcel.2024.04.017",
  "paper_title": "Micronuclei induced by radiation...",
  "claim_a": "Recent studies have suggested that micronuclei...",
  "claim_b": "The role of THEM4 in Akt signaling...",
  "is_genuine_contradiction": true,
  "contradiction_type": "direct",
  "confidence": 0.95,
  "rationale": "The claims address different biological systems...",
  "abstract_text": "Full abstract of paper A from PubMed...",
  "confidence_tier": 0,
  "source_pmid_b": "39011675",
  "source_doi_b": "10.1631/jzus.B2300457",
  "paper_title_b": "Roles of THEM4 in the Akt pathway...",
  "abstract_text_b": "Full abstract of paper B from PubMed..."
}
```

### Fields

| Field | Type | Description |
|-------|------|-------------|
| `id` | string | Unique identifier (`V3-{TYPE}-{NNN}` for within-abstract, `V4-CROSS-{TYPE}-{NNN}` for cross-paper) |
| `entry_source` | string | `"within_abstract"` or `"cross_paper"` |
| `source_pmid` | string | PubMed ID of the first (or only) source paper |
| `source_doi` | string | DOI of the first source paper |
| `paper_title` | string | Title of the first source paper |
| `claim_a` | string | First extracted claim |
| `claim_b` | string | Second claim that may contradict `claim_a` |
| `is_genuine_contradiction` | bool | `true` = genuine contradiction, `false` = contextual |
| `contradiction_type` | string | One of: `direct`, `temporal`, `magnitude`, `methodological`, `contextual` |
| `confidence` | float | Annotation confidence score (0.0-1.0) |
| `rationale` | string | Explanation of why the pair is or is not a contradiction |
| `abstract_text` | string | Full abstract text of paper A from PubMed |
| `confidence_tier` | int | 0 = unrated (v4), 1 = high (>=5/6 agree), 2 = medium (4/6), 3 = uncertain (<=3/6) |
| `source_pmid_b` | string | PubMed ID of the second paper (cross-paper only; empty for within-abstract) |
| `source_doi_b` | string | DOI of the second paper (cross-paper only) |
| `paper_title_b` | string | Title of the second paper (cross-paper only) |
| `abstract_text_b` | string | Full abstract of paper B (cross-paper only) |

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("jang1563/ContradictBio-1138")

# Use only panel-validated within-abstract pairs (Tier 1 = highest quality)
tier1 = dataset["train"].filter(lambda x: x["confidence_tier"] == 1)
print(f"Tier 1 (validated): {len(tier1)} entries")

# Use only cross-paper pairs
cross = dataset["train"].filter(lambda x: x["entry_source"] == "cross_paper")
print(f"Cross-paper pairs: {len(cross)}")

# Filter genuine contradictions
genuine = dataset["train"].filter(lambda x: x["is_genuine_contradiction"])
print(f"Genuine contradictions: {len(genuine)}")

# Show both paper titles for the first three cross-paper entries
for ex in cross.select(range(3)):
    print(f"[{ex['contradiction_type']}] Paper A: {ex['paper_title'][:60]}...")
    print(f"  vs Paper B: {ex['paper_title_b'][:60]}...")
```

## Intended Use

- **Benchmarking** contradiction detection systems (within-abstract and cross-paper)
- **Training** classifiers for biomedical claim pair analysis
- **Evaluating** prompt strategies for scientific claim analysis
- **Research** on evidence quality, scientific disagreement, and literature consistency

## Related Datasets

- [ContradictBio-338](https://huggingface.co/datasets/jang1563/ContradictBio-338): the within-abstract subset with full 6-rater cross-validation details

## Limitations

- Cross-paper entries (800) have not undergone multi-rater panel validation (`confidence_tier` = 0)
- Gold labels were created by a single annotator; LLM-assisted cross-validation covers the within-abstract subset only
- Cross-paper pairs may include claims from unrelated biological domains
- 1 cross-paper entry (V4-CROSS-DIR-0138) has `is_genuine_contradiction=true` but `contradiction_type=contextual` due to a labeling inconsistency in the v4 generation pipeline
- Some within-abstract entries use a non-standard ID format (`V3-MET-C{NNN}`)
- 8/338 within-abstract entries have an empty `source_doi`; all entries are fully citable via `source_pmid`

## Citation

```bibtex
@software{kim2026bioteamai,
  title = {BioTeam-AI: Personal AI Science Team for Biology Research},
  author = {Kim, JangKeun},
  year = {2026},
  url = {https://github.com/jang1563/bioteam-ai},
  license = {MIT}
}
```

## License

[CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/): Attribution-NonCommercial 4.0 International.

**You are free to:**

- Use, share, and adapt this dataset for **research, education, and non-profit purposes**
- Cite this work in academic publications

**You may NOT:**

- Use this dataset for **commercial purposes** without explicit written permission from the author

For commercial licensing inquiries, contact the author via the [BioTeam-AI repository](https://github.com/jang1563/bioteam-ai).

Provider:
jang1563