
MinaGabriel/sentence-relevance-extractor

Source: Hugging Face. Updated 2025-11-20. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/MinaGabriel/sentence-relevance-extractor
Resource description:
---
pretty_name: Sentence Relevance Extractor (SRE)
license: mit
language:
- en
---

# Sentence Relevance Extractor (SRE)

**Sentence Relevance Extractor (SRE)** is a large-scale dataset for **binary evidence selection** in multi-document, multi-hop question answering.

The goal:

> Given a question and a sentence from the context, predict whether this sentence is **relevant evidence** ("Yes") or **irrelevant** ("No").

This dataset is suitable for training:

- **Sentence-level RAG rerankers**
- **Binary relevance classifiers**
- **Optimization-based truth discovery systems**
- **Multi-hop QA evidence selectors**

---

## Dataset Statistics

| Split | # Samples |
|-------|------------|
| **Train** | **1,902,056** |
| **Validation** | **211,340** |
| **Test** | **141,726** |
| **Total** | **2,255,122** |

### Dataset Source Summary

- **From HF train splits:** 2,113,396
- **From HF validation/test splits:** 141,726
- After balancing & sampling → final splits above.

---

## Provided Files

- `multihop_sentrel_train.jsonl`
- `multihop_sentrel_val.jsonl`
- `multihop_sentrel_test.jsonl`

Each line corresponds to one `(question, sentence)` relevance judgment.

---

## Data Format (JSONL)

Each row:

```json
{
  "dataset": "2wikimultihopqa",
  "source_id": "7f23725...",
  "question": "Who is the child of the director of Inquilaab (2002 film)?",
  "full_context": "Inquilaab ... (titles and sentences)",
  "sentence": "Inquilaab is a 2002 Bengali action thriller film directed by Anup Sengupta.",
  "label": "Yes",
  "title": "Inquilaab (2002 film)",
  "doc_index": 0,
  "sent_index": 0,
  "split": "train"
}
```
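As a minimal sketch of how the row format above could be consumed for binary relevance classification, the snippet below parses JSONL lines with the Python standard library and maps the `"Yes"`/`"No"` labels to 1/0. The helper names `parse_rows` and `load_pairs`, the `LABEL_TO_INT` mapping, and the second sample row are assumptions made for illustration; they are not part of the dataset or its tooling.

```python
import json

# Field names follow the example row in the dataset card.
REQUIRED_FIELDS = ("question", "sentence", "label")
# Assumed encoding for a binary classifier: relevant -> 1, irrelevant -> 0.
LABEL_TO_INT = {"Yes": 1, "No": 0}


def parse_rows(lines):
    """Yield (question, sentence, target) tuples from an iterable of JSONL lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        row = json.loads(line)
        missing = [f for f in REQUIRED_FIELDS if f not in row]
        if missing:
            raise KeyError(f"row is missing fields: {missing}")
        yield row["question"], row["sentence"], LABEL_TO_INT[row["label"]]


def load_pairs(path):
    """Read one of the provided files, e.g. multihop_sentrel_train.jsonl."""
    with open(path, encoding="utf-8") as handle:
        return list(parse_rows(handle))


sample = [
    # Taken from the example row in the dataset card (trimmed to three fields).
    json.dumps({
        "question": "Who is the child of the director of Inquilaab (2002 film)?",
        "sentence": "Inquilaab is a 2002 Bengali action thriller film directed by Anup Sengupta.",
        "label": "Yes",
    }),
    # Hypothetical negative row, invented here for illustration only.
    json.dumps({
        "question": "Who is the child of the director of Inquilaab (2002 film)?",
        "sentence": "This sentence is a made-up distractor.",
        "label": "No",
    }),
]

pairs = list(parse_rows(sample))
print([target for _, _, target in pairs])
```

With the files downloaded locally, `load_pairs("multihop_sentrel_train.jsonl")` would return the same kind of tuples. The training split has roughly 1.9 million rows, so iterating over `parse_rows` directly may be preferable to materializing a list.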
Provided by:
MinaGabriel