
Kale: A System for Enabling Human-in-the-loop Interactivity in HPC Workflows

Source: Mendeley Data. Updated 2024-01-31; indexed 2024-06-30.
Download link:
https://figshare.com/articles/Kale_A_System_for_Enabling_Human-in-the-loop_Interactivity_in_HPC_Workflows/7067075/2
Description:
Scientific problem-solving frequently requires interactive, iterative exploration and analysis. Web-based interactive electronic notebook interfaces such as Jupyter offer an important mechanism for scientists to capture analyses in a reproducible narrative context. An increasing number of science gateway environments are providing support for Jupyter Notebooks as a means to enable custom, ad-hoc analyses on scientific data. However, Jupyter Notebooks alone are not enough to fulfill the needs of scientific researchers today. Scientists are producing and consuming large amounts of data, and require significant computational resources to process and analyze that data, causing scientific workflows to become increasingly asynchronous in nature as processing is off-loaded to remote resources. Many scientific researchers turn to HPC systems for processing, but the traditional asynchronous batch-queue environment used in HPC for such computationally intensive tasks is largely separate from interactive Notebook-based workflows, producing a fragmented workflow for scientists that does not facilitate rapid scientific inquiry. We introduce our system “Kale” that enables Jupyter Notebooks to seamlessly interface with HPC workflows, leveraging distributed computational resources for iterative human-in-the-loop scientific exploration.

Created: 2024-01-31