
ssunggun2/LG_ConvFin_MCQ

Hugging Face · Updated 2026-03-26 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/ssunggun2/LG_ConvFin_MCQ
Resource description:
---
language:
- en
task_categories:
- question-answering
- text-generation
pretty_name: LG ConvFinQA MCQ for Reward Model Training
size_categories:
- n<1K
tags:
- finance
- reasoning
- reward-model
- multiple-choice
- consistency-verified
license: mit
---

# LG ConvFinQA MCQ Dataset

## Dataset Description

This dataset contains high-quality Multiple-Choice Questions (MCQs) generated from the [ConvFinQA dataset](https://huggingface.co/datasets/AdaptLLM/ConvFinQA) for Reward Model (RM) training.

Each question has been:

1. **Generated** with 4 carefully crafted answer choices (correct/incorrect × detailed reasoning/answer only)
2. **Verified** using hybrid consistency checking:
   - Self-consistency: N=10 samplings with the same model
   - Multi-model verification: Cross-validation with 3 different models
3. **Filtered** to keep only high-quality samples (combined confidence ≥ 0.5)

## Dataset Statistics

- **Total samples**: 98
- **Average self-consistency score**: 0.8224
- **Self-consistency gold match rate**: 0.8673
- **Average multi-model consensus rate**: 0.7653
- **Multi-model consensus count**: 78
- **Average combined confidence**: 0.7996

## Generation Details

- **Generation model**: meta-llama/Meta-Llama-3.1-8B-Instruct
- **Self-consistency samples**: 10 per question
- **Self-consistency temperature**: 0.7
- **Multi-model verifiers**: ['DragonLLM/Llama-Open-Finance-8B', 'Qwen/Qwen2.5-7B-Instruct', 'gpt-4o-mini']
- **Confidence weights**: Self-consistency=0.6, Multi-model=0.4
- **Generated at**: 20260327_022516

## Dataset Fields

Each sample contains the original ConvFinQA context + generated verification metrics:

- `id`: Unique identifier from ConvFinQA
- `question`: The financial question
- `pre_text`: Text preceding the table (JSON-serialized string)
- `table`: The financial table (JSON-serialized string)
- `post_text`: Text following the table (JSON-serialized string)
- `annotation`: Original dataset annotations and calculation paths (JSON-serialized string to preserve schema stability)
- `gold_answer`: The correct answer
- `generated_mcq`: The complete generated MCQ dictionary containing all 4 choices and their types (JSON-serialized string)
- `reasoning_text`: Detailed reasoning for the correct answer
- `self_consistency_agreement_rate`: Agreement rate across N samplings (0-1)
- `self_consistency_gold_match`: Whether majority answer matches gold answer
- `self_consistency_majority_answer`: Most common answer from N samplings
- `multi_model_consensus`: Whether majority of models agreed
- `multi_model_consensus_rate`: Agreement rate across models (0-1)
- `combined_confidence`: Weighted average of self-consistency and multi_model scores

## Use Cases

This dataset is designed for:

1. **Reward Model Training**: Train models to distinguish high-quality from low-quality reasoning
2. **Preference Learning**: Learn to rank answers by reasoning quality
3. **Financial QA Evaluation**: Benchmark financial reasoning capabilities

## Citation

If you use this dataset, please cite:

```bibtex
@misc{lg_convfin_mcq_2026,
  title={LG ConvFinQA MCQ Dataset for Reward Model Training},
  author={LG AI Research},
  year={2026},
  howpublished={\url{https://huggingface.co/datasets/ssunggun2/LG_ConvFin_MCQ}}
}
```

Original ConvFinQA dataset:

```bibtex
@article{chen2022convfinqa,
  title={ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering},
  author={Chen, Zhiyu and others},
  journal={arXiv preprint arXiv:2210.03849},
  year={2022}
}
```

## License

MIT License
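## Example: Combined Confidence Filter

The filtering step from the Dataset Description and the weights from Generation Details can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the two inputs to the weighted average are `self_consistency_agreement_rate` and `multi_model_consensus_rate`, which the card does not state explicitly. The assumption is at least consistent with the reported averages, since 0.6 × 0.8224 + 0.4 × 0.7653 ≈ 0.7996. The function names and the two sample records are hypothetical.

```python
# Weights and threshold as listed on the dataset card.
SELF_CONSISTENCY_WEIGHT = 0.6
MULTI_MODEL_WEIGHT = 0.4
CONFIDENCE_THRESHOLD = 0.5


def combined_confidence(self_consistency_rate: float, multi_model_rate: float) -> float:
    """Weighted average of the two agreement rates (assumed inputs, see lead-in)."""
    return (
        SELF_CONSISTENCY_WEIGHT * self_consistency_rate
        + MULTI_MODEL_WEIGHT * multi_model_rate
    )


def keep_sample(sample: dict) -> bool:
    """Return True if the sample passes the combined-confidence threshold."""
    score = combined_confidence(
        sample["self_consistency_agreement_rate"],
        sample["multi_model_consensus_rate"],
    )
    return score >= CONFIDENCE_THRESHOLD


# Hypothetical records, for illustration only (not taken from the dataset).
samples = [
    {
        "id": "example-a",
        "self_consistency_agreement_rate": 0.9,  # 9 of 10 samplings agree
        "multi_model_consensus_rate": 1.0,       # 3 of 3 verifiers agree
    },
    {
        "id": "example-b",
        "self_consistency_agreement_rate": 0.4,  # 4 of 10 samplings agree
        "multi_model_consensus_rate": 1 / 3,     # 1 of 3 verifiers agrees
    },
]

kept_ids = [s["id"] for s in samples if keep_sample(s)]
print(kept_ids)
```

Under these assumptions the first record scores 0.94 and is kept, while the second scores about 0.37 and is dropped.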
Provider:
ssunggun2