ShIOEnv_40cmd_7x10K
Source: DataCite Commons. Updated 2025-05-16; indexed 2025-05-17.
Download link:
https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/BWUIOS
Description:
Datasets of Linux command inputs and their observed execution behaviors, collected from the ShIOEnv environment for 40 utilities. Each dataset is built with a different method of constructing arguments from a defined context-free grammar (CFG):
<ul>
<li> <b>unconstrained random truncated (UCRT)</b>: productions are selected at random from the full set of productions, and the result is randomly truncated to reduce argument redundancy. </li>
<li> <b>unconstrained policy network (UCPN-m0)</b>: a policy network trained with proximal policy optimization (PPO) over 20,000 episodes with a redundancy score margin of 0, selecting from the full set of productions. </li>
<li> <b>unconstrained policy network (UCPN-m50)</b>: a policy network trained with PPO over 20,000 episodes with a redundancy score margin of 0.50, selecting from the full set of productions. </li>
<li> <b>grammar-constrained random truncated (GCRT)</b>: productions are selected at random from the valid expansions only, and the result is randomly truncated to reduce argument redundancy. </li>
<li> <b>grammar-constrained policy network (GCPN-m0)</b>: a policy network trained with PPO over 20,000 episodes with a redundancy score margin of 0, selecting from the valid expansions only. </li>
<li> <b>grammar-constrained policy network (GCPN-m50)</b>: a policy network trained with PPO over 20,000 episodes with a redundancy score margin of 0.50, selecting from the valid expansions only. </li>
<li> <b>NL2Bash</b>: the bootstrapped NL2Bash dataset, adapted to be executable in the default container provided with ShIOEnv. </li>
</ul>
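The difference between unconstrained and grammar-constrained random sampling can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy grammar, the function names `sample` and `truncate`, and the reading of "full set of productions" as "any production, regardless of the nonterminal being expanded". It is not the ShIOEnv grammar or implementation.

```python
import random

# Hypothetical toy grammar (NOT the ShIOEnv grammar): nonterminal -> list of right-hand sides.
GRAMMAR = {
    "CMD": [["ls", "OPTS"]],
    "OPTS": [["OPT", "OPTS"], []],
    "OPT": [["-l"], ["-a"], ["-h"]],
}

# Every production in the grammar, ignoring which nonterminal it belongs to.
ALL_PRODUCTIONS = [(lhs, rhs) for lhs, rules in GRAMMAR.items() for rhs in rules]


def sample(constrained, rng, max_steps=20):
    """Expand the leftmost nonterminal until none remain or max_steps is hit.

    constrained=True  -> choose only among productions of the current nonterminal.
    constrained=False -> choose among all productions (assumed meaning of
                         'unconstrained'), which may yield malformed commands.
    """
    stack = ["CMD"]
    tokens = []
    steps = 0
    while stack and steps < max_steps:
        symbol = stack.pop(0)
        if symbol not in GRAMMAR:  # terminal: emit it
            tokens.append(symbol)
            continue
        steps += 1
        if constrained:
            pool = [(symbol, rhs) for rhs in GRAMMAR[symbol]]
        else:
            pool = ALL_PRODUCTIONS
        _, rhs = rng.choice(pool)
        stack = list(rhs) + stack
    # If the step cap was reached, keep remaining terminals and drop nonterminals.
    tokens.extend(s for s in stack if s not in GRAMMAR)
    return tokens


def truncate(tokens, rng):
    """Keep a random non-empty prefix (an assumed form of 'truncated randomly')."""
    if not tokens:
        return tokens
    return tokens[: rng.randint(1, len(tokens))]


if __name__ == "__main__":
    rng = random.Random(0)
    print("constrained:  ", truncate(sample(True, rng), rng))
    print("unconstrained:", truncate(sample(False, rng), rng))
```

In this toy setup the constrained sampler always produces `ls` followed by zero or more options, while the unconstrained sampler can produce sequences that no derivation of the grammar would allow.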
See dataset_dist_&lt;tag&gt;.png for the distribution of each field in each dataset.
The generating environment and agent are available on GitHub: https://github.com/synlab-jragsdale/ShIOEnv/tree/main
Provider:
Harvard Dataverse
Created:
2025-05-16



