soketlabs/SARTHI-AgriData

Hugging Face | Updated 2026-01-16 | Indexed 2026-02-07
Download link:
https://hf-mirror.com/datasets/soketlabs/SARTHI-AgriData
Description:
This dataset contains high-quality, biologically validated agricultural advisories designed to train Large Language Models (LLMs) for the Indian agricultural context. Unlike standard synthetic datasets, it was generated using a Multi-Stage Pipeline that strictly separates scientific validation from content generation, so the model only attempts to write advisories for scenarios that have already been validated as scientifically plausible. Key features include Pre-Validated Scenarios, Chain-of-Thought (CoT) reasoning, Strict Safety Guardrails, and Diverse Personas. The dataset is stored in JSONL and Parquet formats; each record contains a unique ID, an input prompt, a system instruction, the model's thoughts, and the final advisory.
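
The record layout described above (a unique ID, an input prompt, a system instruction, model thoughts, and a final advisory) could be read from the JSONL form roughly as follows. This is a minimal sketch, not the dataset's documented schema: the key names `id`, `prompt`, `system_instruction`, `thoughts`, and `advisory` are assumptions chosen for illustration, and the sample record is an invented placeholder rather than real data.

```python
import io
import json
from dataclasses import dataclass
from typing import Iterator, TextIO

# Hypothetical key names: the description lists five record components but
# not their actual keys, so check the dataset card before relying on these.
FIELDS = ("id", "prompt", "system_instruction", "thoughts", "advisory")


@dataclass
class AdvisoryRecord:
    id: str
    prompt: str
    system_instruction: str
    thoughts: str
    advisory: str


def read_jsonl(stream: TextIO) -> Iterator[AdvisoryRecord]:
    """Yield one AdvisoryRecord per non-empty JSONL line."""
    for line_no, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        obj = json.loads(line)
        missing = [name for name in FIELDS if name not in obj]
        if missing:
            raise ValueError(f"line {line_no}: missing fields {missing}")
        yield AdvisoryRecord(**{name: obj[name] for name in FIELDS})


# Placeholder record, invented for illustration only (not taken from the dataset).
sample = io.StringIO(
    json.dumps(
        {
            "id": "example-0001",
            "prompt": "<farmer question>",
            "system_instruction": "<persona and safety rules>",
            "thoughts": "<chain-of-thought text>",
            "advisory": "<final advisory text>",
        }
    )
    + "\n"
)

records = list(read_jsonl(sample))
print(records[0].id, len(records))
```

To read a downloaded file instead of the in-memory sample, pass an open file handle to `read_jsonl`. Keeping the thoughts and the advisory as separate fields makes it easy to train on the final advisory alone or on the reasoning plus the advisory.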
Provider:
soketlabs