
Algorithmic Extraction and Mathematical Model Integration for Machine Learning Workflows (V12)

NIAID Data Ecosystem, indexed 2026-05-10
Download link:
https://doi.org/10.7910/DVN/KAOQAG
Description:
This dataset provides an auditable, full-coverage, lossless multi-level knowledge graph derived from a structured machine-learning methodology document (V12). The graph preserves 100% of the source document's atomic units (paragraph text units and table cells, including empty units) to support deterministic coverage auditing, traceability, and reproducible governance workflows. It is delivered as a CSV-formatted graph with explicit `NODE` and `EDGE` records, covering hierarchical levels (document → sections → paragraphs/tables → rows → atomic text units) and derived semantic layers (formulas, rules/procedures, algorithm modules, and tokens). Each atomic node is indexed (e.g., `paragraph_index`, `table_index`, `row_index`, `col_index`) to enable exact structural reconstruction and machine-verifiable integrity checks.

Beyond documentation preservation, the dataset is directly usable for machine-learning solution design and machine-learning research auditing. For solution design, the graph functions as a computable blueprint of ML workflow components (objective functions, diagnostics, threshold calibration, uncertainty quantification, MRB reproducibility fields, robustness/fairness evidence) that can be queried and assembled into implementation-ready checklists or pipeline specifications. For research auditing, it supports evidence-based review of methodological completeness and compliance (e.g., reproducibility package requirements, deterministic hashing rules, stable sorting, adaptive floating-point quantization, environment locking), enabling structured verification of whether a reported ML study meets auditable standards.

The dataset is suitable for ML engineering teams, research evaluators, governance/compliance units, and computational social science or AI governance researchers. It can be imported into graph databases or processed in Python/R for automated checks, benchmarking, and reporting.
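As a rough illustration of the kind of integrity check the description refers to, the following Python sketch loads a `NODE`/`EDGE` CSV and runs a few structural audits. It is only a sketch under stated assumptions: the column names `record_type`, `id`, `level`, `text`, `source`, `target`, and `relation`, the level labels, and the inline sample rows are all hypothetical and are not taken from the published file. Only the four index fields (`paragraph_index`, `table_index`, `row_index`, `col_index`) are named in the description. The actual header should be inspected before adapting this.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real file's columns and values may differ.
SAMPLE = """record_type,id,level,paragraph_index,table_index,row_index,col_index,text,source,target,relation
NODE,doc,document,,,,,V12,,,
NODE,s1,section,,,,,Objectives,,,
NODE,p0,paragraph,0,,,,Define the loss.,,,
NODE,c00,cell,,0,0,0,Metric,,,
NODE,c01,cell,,0,0,1,,,,
EDGE,,,,,,,,doc,s1,contains
EDGE,,,,,,,,s1,p0,contains
EDGE,,,,,,,,s1,c00,contains
EDGE,,,,,,,,s1,c01,contains
"""


def load_graph(handle):
    """Split the CSV into a node dict keyed by id and a list of edge triples."""
    nodes, edges = {}, []
    for row in csv.DictReader(handle):
        if row["record_type"] == "NODE":
            nodes[row["id"]] = row
        elif row["record_type"] == "EDGE":
            edges.append((row["source"], row["target"], row["relation"]))
    return nodes, edges


def audit(nodes, edges):
    """Run simple structural checks and return the findings as a dict."""
    # Edges whose endpoints are not declared as nodes.
    dangling = [(s, t) for s, t, _ in edges if s not in nodes or t not in nodes]

    # Every node except the document root should be the target of some edge.
    targets = {t for _, t, _ in edges}
    orphans = sorted(
        n for n, r in nodes.items()
        if r["level"] != "document" and n not in targets
    )

    # Each (table, row, column) coordinate should map to exactly one cell.
    by_coord = defaultdict(list)
    for n, r in nodes.items():
        if r["level"] == "cell":
            coord = (r["table_index"], r["row_index"], r["col_index"])
            by_coord[coord].append(n)
    duplicate_cells = {c: ids for c, ids in by_coord.items() if len(ids) > 1}

    # Empty atomic units are preserved by design; list them rather than drop them.
    empty_units = sorted(
        n for n, r in nodes.items()
        if r["level"] in ("paragraph", "cell") and r["text"] == ""
    )
    return {
        "dangling": dangling,
        "orphans": orphans,
        "duplicate_cells": duplicate_cells,
        "empty_units": empty_units,
    }


nodes, edges = load_graph(io.StringIO(SAMPLE))
report = audit(nodes, edges)
print(report)
```

To run the same checks on the downloaded file, `io.StringIO(SAMPLE)` could be replaced with `open(path, newline="", encoding="utf-8")`, after adjusting the assumed column names to match the real header.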
Created:
2026-02-13