defenseunicorns/LFAI_RAG_qa_v1

Name: defenseunicorns/LFAI_RAG_qa_v1
Creator: defenseunicorns
Published: 2024-09-05 21:05:50
License: apache-2.0 (per the dataset card metadata; the listing itself gives no license description)

Hugging Face. Updated 2024-09-05; indexed 2025-04-12.

Download link:

https://hf-mirror.com/datasets/defenseunicorns/LFAI_RAG_qa_v1

Resource description:

---
language:
- en
license: apache-2.0
configs:
- config_name: LFAI_RAG_qa_v1
  data_files:
  - split: eval
    path: LFAI_RAG_qa_v1.json
  default: true
---

# LFAI_RAG_qa_v1

This dataset aims to be the basis for RAG-focused question and answer evaluations for [LeapfrogAI](https://github.com/defenseunicorns/leapfrogai)🐸.

## Dataset Details

LFAI_RAG_qa_v1 contains 36 question/answer/context entries intended for LLM-as-a-judge enabled RAG evaluations.

Example:

```
{
    "input": "What requirement must be met to run VPI PVA algorithms in a Docker container?",
    "actual_output": null,
    "expected_output": "To run VPI PVA algorithms in a Docker container, the same VPI version must be installed on the Docker host.",
    "context": [
        "2.6.\nCompute\nStack\nThe\nfollowing\nDeep\nLearning-related\nissues\nare\nnoted\nin\nthis\nrelease.\nIssue\nDescription\n4564075\nTo\nrun\nVPI\nPVA\nalgorithms\nin\na\ndocker\ncontainer,\nthe\nsame\nVPI\nversion\nhas\nto\nbe\ninstalled\non \nthe\ndocker\nhost.\n2.7.\nDeepstream\nIssue\nDescription\n4325898\nThe\npipeline\ngets\nstuck\nfor\nmulti\u0000lesrc\nwhen\nusing\nnvv4l2decoder.\nDS\ndevelopers\nuse \nthe\npipeline\nto\nrun\ndecode\nand\ninfer\njpeg\nimages.\nNVIDIA\nJetson\nLinux\nRelease\nNotes\nRN_10698-r36.3\n|\n11"
    ],
    "source_file": "documents/Jetson_Linux_Release_Notes_r36.3.pdf"
}
```

### Dataset Sources

Data was generated from the following sources:

```
https://www.humanesociety.org/sites/default/files/docs/HSUS_ACFS-2023.pdf
https://www.whitehouse.gov/wp-content/uploads/2024/04/Global-Health-Security-Strategy-2024-1.pdf
https://www.armed-services.senate.gov/imo/media/doc/fy24_ndaa_conference_executive_summary1.pdf
https://dodcio.defense.gov/Portals/0/Documents/Library/(U)%202024-01-02%20DoD%20Cybersecurity%20Reciprocity%20Playbook.pdf
https://assets.ctfassets.net/oggad6svuzkv/2pIQQWQXPpxiKjjmhfpyWf/eb17b3f3c9c21f7abb05e68c7b1f3fcd/2023_annual_report.pdf
https://www.toyota.com/content/dam/toyota/brochures/pdf/2024/T-MMS-24Corolla.pdf
https://docs.nvidia.com/jetson/archives/r36.3/ReleaseNotes/Jetson_Linux_Release_Notes_r36.3.pdf
https://arxiv.org/pdf/2406.05370.pdf
```

The documents themselves can be found in [document_context.zip](https://huggingface.co/datasets/jalling/LFAI_RAG_qa_v1/raw/main/document_context.zip).

## Uses

This dataset is ready to be used for LLM-as-a-judge evaluations, formatted specifically for compatibility with [DeepEval](https://github.com/confident-ai/deepeval).

## Dataset Structure

This dataset follows the format for Test Case [Goldens](https://docs.confident-ai.com/docs/confident-ai-manage-datasets#what-is-a-golden) in DeepEval.

Each entry in this dataset contains the following fields:

- `input`, the question to be prompted to your LLM
- `expected_output`, the ground truth answer to the prompted question
- `context`, the ground truth source in documentation that contains or informs the ground truth answer

## Dataset Creation

This dataset was generated from the source documentation using DeepEval's [Synthesizer](https://docs.confident-ai.com/docs/evaluation-datasets-synthetic-data).

The dataset was then refined by:

- Removing entries with poorly formatted or overly simplistic questions
- Removing entries with question/answer pairs that did not make sense in context
- Modifying questions to reduce verbosity and increase factual accuracy

## Bias, Risks, and Limitations

This dataset was generated using GPT-4o, and therefore carries the biases of the model as well as those of the human annotator who refined it.

The dataset was created with the intention of using source documentation that is unlikely to be in the training data of any current models, but this will likely change within the coming months as new models are released.

## Dataset Card Authors

The LeapfrogAI🐸 team at [Defense Unicorns](https://www.defenseunicorns.com/)🦄

## Dataset Card Contact

- ai@defenseunicorns.com
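The entry format described under Dataset Structure can be checked with a small script before feeding the data to an evaluation framework. The following is a minimal sketch using only the Python standard library, not DeepEval's own loader. It assumes that `LFAI_RAG_qa_v1.json` is a JSON array of objects shaped like the example entry; the card does not state the top-level layout. The names `GoldenEntry`, `parse_entry`, and `load_entries` are hypothetical helpers introduced for illustration, and the sample's `context` value is a placeholder rather than the real text.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

# Fields the card lists for every entry.
REQUIRED_FIELDS = ("input", "expected_output", "context")


@dataclass
class GoldenEntry:
    """Hypothetical container mirroring the fields in the card's example."""
    input: str
    expected_output: str
    context: List[str]
    actual_output: Optional[str] = None  # null in the card's example
    source_file: Optional[str] = None    # present in the example, not in the field list


def parse_entry(raw: dict) -> GoldenEntry:
    """Validate one raw JSON object and convert it to a GoldenEntry."""
    missing = [name for name in REQUIRED_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"entry is missing required fields: {missing}")
    if not isinstance(raw["context"], list):
        raise ValueError("context must be a list of strings")
    return GoldenEntry(
        input=raw["input"],
        expected_output=raw["expected_output"],
        context=list(raw["context"]),
        actual_output=raw.get("actual_output"),
        source_file=raw.get("source_file"),
    )


def load_entries(json_text: str) -> List[GoldenEntry]:
    """Parse JSON text. Assumption: the top level is an array of entries."""
    data = json.loads(json_text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of entries")
    return [parse_entry(item) for item in data]


# Sample built from the card's example; the context text is a placeholder.
SAMPLE = json.dumps([
    {
        "input": "What requirement must be met to run VPI PVA algorithms in a Docker container?",
        "actual_output": None,
        "expected_output": "To run VPI PVA algorithms in a Docker container, "
                           "the same VPI version must be installed on the Docker host.",
        "context": ["(context text omitted in this sketch)"],
        "source_file": "documents/Jetson_Linux_Release_Notes_r36.3.pdf",
    }
])

entries = load_entries(SAMPLE)
print(len(entries), entries[0].source_file)
```

To run the same check on the real file, read its contents and pass the text to `load_entries`. If the file turns out to have a different top-level layout, only `load_entries` needs to change.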

Provider:

defenseunicorns
