
AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance

Indexed from doi.org on 2025-03-23
Download link:
https://doi.org/10.24435/materialscloud:2020.0027/v1
Description:
The ever-growing availability of computing power and sustained development of advanced computational methods have contributed much to recent scientific progress. These developments present new challenges driven by the sheer number of calculations and volume of data to manage. Next-generation exascale supercomputers will intensify these challenges, such that automated and scalable solutions become crucial. In recent years, we have been developing AiiDA (http://www.aiida.net), a robust open-source high-throughput infrastructure addressing the challenges arising from the needs of automated workflow management and data provenance recording. Here, we introduce developments and capabilities required to reach sustained performance, with AiiDA supporting throughputs of tens of thousands of processes per hour, while automatically preserving and storing the full data provenance in a relational database, making it queryable and traversable and thus enabling high-performance data analytics. AiiDA's workflow language provides advanced automation, error-handling features and a flexible plugin model to allow interfacing with any simulation software. The associated plugin registry enables seamless sharing of extensions, empowering a vibrant user community dedicated to making simulations more robust, user-friendly and reproducible.

This archive record contains the data to reproduce the figures on engine performance in the section "Event versus polling-based engine" of the paper entitled "AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance". It also includes instructions to reproduce the actual data from scratch using AiiDA v1.1.1 and AiiDA v0.12.5.

Provider:
doi.org