
defenseunicorns/LFAI_RAG_qa_v1

Source: Hugging Face. Updated 2024-09-05, indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/defenseunicorns/LFAI_RAG_qa_v1
Resource description:
---
language:
- en
license: apache-2.0
configs:
- config_name: LFAI_RAG_qa_v1
  data_files:
  - split: eval
    path: LFAI_RAG_qa_v1.json
  default: true
---

# LFAI_RAG_qa_v1

This dataset aims to be the basis for RAG-focused question and answer evaluations for [LeapfrogAI](https://github.com/defenseunicorns/leapfrogai)🐸.

## Dataset Details

LFAI_RAG_qa_v1 contains 36 question/answer/context entries that are intended to be used for LLM-as-a-judge enabled RAG Evaluations.

Example:

```
{
  "input": "What requirement must be met to run VPI PVA algorithms in a Docker container?",
  "actual_output": null,
  "expected_output": "To run VPI PVA algorithms in a Docker container, the same VPI version must be installed on the Docker host.",
  "context": [
    "2.6.\nCompute\nStack\nThe\nfollowing\nDeep\nLearning-related\nissues\nare\nnoted\nin\nthis\nrelease.\nIssue\nDescription\n4564075\nTo\nrun\nVPI\nPVA\nalgorithms\nin\na\ndocker\ncontainer,\nthe\nsame\nVPI\nversion\nhas\nto\nbe\ninstalled\non \nthe\ndocker\nhost.\n2.7.\nDeepstream\nIssue\nDescription\n4325898\nThe\npipeline\ngets\nstuck\nfor\nmulti\u0000lesrc\nwhen\nusing\nnvv4l2decoder.\nDS\ndevelopers\nuse \nthe\npipeline\nto\nrun\ndecode\nand\ninfer\njpeg\nimages.\nNVIDIA\nJetson\nLinux\nRelease\nNotes\nRN_10698-r36.3\n|\n11"
  ],
  "source_file": "documents/Jetson_Linux_Release_Notes_r36.3.pdf"
}
```

### Dataset Sources

Data was generated from the following sources:

```
https://www.humanesociety.org/sites/default/files/docs/HSUS_ACFS-2023.pdf
https://www.whitehouse.gov/wp-content/uploads/2024/04/Global-Health-Security-Strategy-2024-1.pdf
https://www.armed-services.senate.gov/imo/media/doc/fy24_ndaa_conference_executive_summary1.pdf
https://dodcio.defense.gov/Portals/0/Documents/Library/(U)%202024-01-02%20DoD%20Cybersecurity%20Reciprocity%20Playbook.pdf
https://assets.ctfassets.net/oggad6svuzkv/2pIQQWQXPpxiKjjmhfpyWf/eb17b3f3c9c21f7abb05e68c7b1f3fcd/2023_annual_report.pdf
https://www.toyota.com/content/dam/toyota/brochures/pdf/2024/T-MMS-24Corolla.pdf
https://docs.nvidia.com/jetson/archives/r36.3/ReleaseNotes/Jetson_Linux_Release_Notes_r36.3.pdf
https://arxiv.org/pdf/2406.05370.pdf
```

The documents themselves can be found in [document_context.zip](https://huggingface.co/datasets/jalling/LFAI_RAG_qa_v1/raw/main/document_context.zip).

## Uses

This dataset is ready to be used for LLM-as-a-judge evaluations and is formatted specifically for compatibility with [DeepEval](https://github.com/confident-ai/deepeval).

## Dataset Structure

This dataset follows the format for Test Case [Goldens](https://docs.confident-ai.com/docs/confident-ai-manage-datasets#what-is-a-golden) in DeepEval.

Each entry in this dataset contains the following fields:

- `input`, the question to be prompted to your LLM
- `expected_output`, the ground truth answer to the prompted question
- `context`, the ground truth source in documentation that contains or informs the ground truth answer

## Dataset Creation

This dataset was generated from the source documentation using DeepEval's [Synthesizer](https://docs.confident-ai.com/docs/evaluation-datasets-synthetic-data).

The dataset was then refined by:

- Removing entries with poorly formatted or overly simplistic questions
- Removing entries with question/answer pairs that did not make sense in context
- Modifying questions to reduce verbosity and increase factual accuracy

## Bias, Risks, and Limitations

This dataset was generated using GPT-4o, and therefore carries the biases of the model as well as those of the human annotator who refined it.

The dataset was created with the intention of using source documentation that is unlikely to be in the training data of any current models, but this will likely change within the coming months as new models are released.

## Dataset Card Authors

The LeapfrogAI🐸 team at [Defense Unicorns](https://www.defenseunicorns.com/)🦄

## Dataset Card Contact

- ai@defenseunicorns.com
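
As an illustration of the fields listed under Dataset Structure, the following is a minimal sketch of reading entries and checking that each one carries `input`, `expected_output`, and `context`. It uses only the Python standard library. The sample is an excerpt of the example entry above with the context trimmed to one sentence; the top-level JSON array layout and the helper names `load_goldens` and `normalize_context` are assumptions made for this sketch, not part of the dataset or of DeepEval.

```python
import json

# Excerpt of the example entry from the dataset card; the context string is
# trimmed to the relevant sentence. The top-level JSON array is an assumption
# about the layout of LFAI_RAG_qa_v1.json.
SAMPLE_JSON = r"""
[
  {
    "input": "What requirement must be met to run VPI PVA algorithms in a Docker container?",
    "actual_output": null,
    "expected_output": "To run VPI PVA algorithms in a Docker container, the same VPI version must be installed on the Docker host.",
    "context": ["To\nrun\nVPI\nPVA\nalgorithms\nin\na\ndocker\ncontainer,\nthe\nsame\nVPI\nversion\nhas\nto\nbe\ninstalled\non \nthe\ndocker\nhost."],
    "source_file": "documents/Jetson_Linux_Release_Notes_r36.3.pdf"
  }
]
"""

# Fields the dataset card documents for every entry.
REQUIRED_FIELDS = ("input", "expected_output", "context")


def normalize_context(chunks):
    """Collapse the one-word-per-line PDF extraction into single-spaced text."""
    return [" ".join(chunk.split()) for chunk in chunks]


def load_goldens(text):
    """Parse a JSON array of entries and check the documented fields (hypothetical helper)."""
    goldens = []
    for index, entry in enumerate(json.loads(text)):
        missing = [name for name in REQUIRED_FIELDS if name not in entry]
        if missing:
            raise ValueError(f"entry {index} is missing fields: {missing}")
        goldens.append(
            {
                "input": entry["input"],
                "expected_output": entry["expected_output"],
                "context": normalize_context(entry["context"]),
            }
        )
    return goldens


goldens = load_goldens(SAMPLE_JSON)
print(goldens[0]["context"][0])
```

Whitespace normalization here is only a reading convenience; whether to pass the raw or the normalized context to a judge model is a choice left to the evaluator. Given the `eval` split declared in the card's configuration, the full dataset could likely also be loaded with the Hugging Face `datasets` library rather than by parsing the JSON file directly.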
Provider:
defenseunicorns