Antix5/vi-gym-causal-ascii
Source: Hugging Face | Updated: 2026-03-07 | Indexed: 2026-03-29
Download link:
https://hf-mirror.com/datasets/Antix5/vi-gym-causal-ascii
Resource description:
---
license: mit
task_categories:
- text-generation
- reinforcement-learning
language:
- en
tags:
- vi
- vim
- ascii-art
- gym
- causal-lm
pretty_name: Vi-Gym Causal ASCII Trajectories
size_categories:
- 100K<n<1M
---
# Vi-Gym Causal ASCII Trajectories
This dataset contains autoregressive trajectories of a Large Language Model (LLM) agent learning spatial reasoning and geometric drawing within a simulated **Vi (Vim)** editor environment.
## Warning
**This dataset is derived directly from the source material and may therefore contain content that is not suitable for all audiences. The authors of the original artwork retain full ownership of their work.**
## Dataset Structure
Each record is a discrete step in the environment, capturing the exact state of the editor before a command is issued. The format is designed for **Causal Next-Token Prediction** training.
### Format Specification
```xml
<BOS>
<notepad>
[CURRENT ASCII CONTENT]
</notepad>
<mode>[Normal|Insert]</mode>
<prompt>[Grammatically Correct Instruction]</prompt>
<command>
[OPTIMIZED VI KEYSTROKES]
```
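As a rough illustration of the template above, the following Python sketch assembles a record and splits it back into its fields. The helper names `build_record` and `parse_record` are hypothetical, and the exact whitespace between tags is an assumption read off the template, so the pattern may need adjusting against real records.

```python
import re


def build_record(notepad: str, mode: str, prompt: str, command: str) -> str:
    """Assemble one record following the template (whitespace is assumed)."""
    if mode not in ("Normal", "Insert"):
        raise ValueError(f"unexpected mode: {mode!r}")
    return (
        "<BOS>\n"
        f"<notepad>\n{notepad}\n</notepad>\n"
        f"<mode>{mode}</mode>\n"
        f"<prompt>{prompt}</prompt>\n"
        f"<command>\n{command}"
    )


# DOTALL lets the notepad and command groups span several lines.
RECORD_RE = re.compile(
    r"<BOS>\n<notepad>\n(?P<notepad>.*?)\n</notepad>\n"
    r"<mode>(?P<mode>Normal|Insert)</mode>\n"
    r"<prompt>(?P<prompt>.*?)</prompt>\n"
    r"<command>\n(?P<command>.*)",
    re.DOTALL,
)


def parse_record(text: str) -> dict:
    """Split a record into notepad, mode, prompt, and command fields."""
    match = RECORD_RE.fullmatch(text)
    if match is None:
        raise ValueError("text does not follow the record template")
    return match.groupdict()
```

Note that the `<command>` tag is deliberately left unclosed in the template: everything after `<command>\n` is the continuation the model is expected to produce.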
## Technical Provenance
1. **Environment Engine**: States rendered via the Rust-based Vi-Gym engine.
2. **Keystroke Optimization**: Generated using an AST-based compiler prioritizing efficiency via Run-Length Encoding (RLE) and geometric entropy sorting.
3. **Linguistic Robustness**: Prompts utilize grammatically correct indefinite articles (a/an) and randomized natural language templates with human-like noise.
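
To give a feel for what run-length encoding of keystrokes can mean, here is a minimal Python sketch that collapses runs of repeated Normal-mode keys into counted commands. It is not the dataset's AST-based compiler, which is not described in detail here; the function name `rle_keystrokes` and the set `COUNTABLE_KEYS` are assumptions made for illustration only.

```python
from itertools import groupby

# Assumed subset of Normal-mode keys that accept a numeric count prefix.
COUNTABLE_KEYS = set("hjklx")


def rle_keystrokes(keys: str) -> str:
    """Collapse runs of three or more identical countable keys into '<n><key>'."""
    out = []
    for key, run in groupby(keys):
        n = len(list(run))
        if key in COUNTABLE_KEYS and n > 2:
            # '3j' is shorter than 'jjj'; a run of two would not save anything.
            out.append(f"{n}{key}")
        else:
            out.append(key * n)
    return "".join(out)


print(rle_keystrokes("jjjllllx"))  # 3j4lx
```

The sketch only applies to Normal-mode input, since repeated characters typed in Insert mode are literal text, and it does not check how counted motions behave at the edges of the buffer.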
## Source Credits
- **Geometric Data**: ASCII art shapes sourced from the [Curated ASCII Art Database](https://github.com/asweigart/asciiartjsondb) (originally from [asciiart.eu](https://www.asciiart.eu/)).
- **Logic Backend**: Editor state and command interpretation powered by the [ViLM](https://github.com/Antix5/ViLM) core engine (not published yet).
## Training Recommendations
Calculate loss **only** on tokens following the `<command>\n` tag to focus the model on the mapping between visual state and command execution.
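
One way to apply this recommendation is to mask every label that precedes the command body. The Python sketch below assumes a tokenizer that reports a character offset for each token; `char_tokenize` is a toy stand-in, and `mask_labels` is a hypothetical helper. The value `-100` matches the default `ignore_index` of PyTorch's `CrossEntropyLoss`. Any label shifting for next-token prediction is left to the training code.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss
COMMAND_TAG = "<command>\n"


def char_tokenize(text):
    """Toy character-level tokenizer standing in for a real one with offsets."""
    ids = [ord(ch) for ch in text]
    offsets = [(i, i + 1) for i in range(len(text))]
    return ids, offsets


def mask_labels(token_ids, offsets, text):
    """Keep labels only for tokens that start after the command tag."""
    # Assumes the keystrokes themselves never contain the tag.
    tag_pos = text.rfind(COMMAND_TAG)
    if tag_pos == -1:
        raise ValueError("record has no <command> tag")
    target_start = tag_pos + len(COMMAND_TAG)
    return [
        tok if start >= target_start else IGNORE_INDEX
        for tok, (start, _end) in zip(token_ids, offsets)
    ]


text = (
    "<BOS>\n<notepad>\nab\n</notepad>\n<mode>Normal</mode>\n"
    "<prompt>Draw a dot.</prompt>\n<command>\n3jx"
)
ids, offsets = char_tokenize(text)
labels = mask_labels(ids, offsets, text)
```

With a subword tokenizer, a token may straddle the tag boundary; this sketch masks such a token because it starts before the command body.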
Provider:
Antix5



