
nuprl-staging/AgentPack-training-js

Hugging Face | Updated 2025-11-17 | Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/nuprl-staging/AgentPack-training-js
Description:
---
dataset_info:
  features:
  - name: event_id
    dtype: string
  - name: agent
    dtype: string
  - name: repo
    dtype: string
  - name: sha
    dtype: string
  - name: description
    dtype: string
  - name: patch
    dtype: string
  - name: file_rows
    list:
    - name: file_path
      dtype: string
    - name: new_contents
      dtype: string
    - name: old_contents
      dtype: string
  - name: file
    list: string
  - name: old_contents
    list: string
  - name: new_contents
    list: string
  - name: messages
    list:
    - name: content
      dtype: string
    - name: role
      dtype: string
  splits:
  - name: js_ts
    num_bytes: 7257112529
    num_examples: 169868
  - name: js
    num_bytes: 3121111272
    num_examples: 72522
  download_size: 3920977846
  dataset_size: 10378223801
configs:
- config_name: default
  data_files:
  - split: js_ts
    path: data/js_ts-*
  - split: js
    path: data/js-*
---

This dataset is the JavaScript/TypeScript subset of AgentPack, selected for training purposes. It has two splits:

- `js_ts`: repos with at least one of both js(x) and ts(x) files
- `js`: repos with at least one js(x) file

Both splits are built with `sft/preprocess_agentpack.py` using the following flags:

```
# For the `js_ts` split, add `ts tsx` to --file_exts.
--max_examples 500000 \
--file_exts js jsx \
--exclude_file_exts md txt rst org log \
--file_ext_filter_by repo \
--max_length_chars 50000
```
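The schema above can be explored with a few small helpers. The following is a minimal sketch, not part of the dataset card: the helper names (`load_split`, `changed_files`, `chat_turns`) are hypothetical, and it assumes that the Hugging Face `datasets` library returns `file_rows` and `messages` as lists of dicts, as the `list:` entries in the metadata suggest. Only `load_split` needs network access and the `datasets` package; the other two work on any dict shaped like a row.

```python
from typing import Any, Dict, Iterator, List, Tuple

REPO_ID = "nuprl-staging/AgentPack-training-js"
SPLITS = ("js_ts", "js")  # split names taken from the dataset metadata


def load_split(split: str = "js", streaming: bool = True):
    """Load one split from the Hugging Face Hub (needs `pip install datasets`)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Imported lazily so the offline helpers below work without the package.
    from datasets import load_dataset
    return load_dataset(REPO_ID, split=split, streaming=streaming)


def changed_files(example: Dict[str, Any]) -> Iterator[Tuple[str, str, str]]:
    """Yield (file_path, old_contents, new_contents) for each `file_rows` entry.

    Assumption: `file_rows` is a list of dicts with these three string fields.
    """
    for row in example["file_rows"]:
        yield row["file_path"], row["old_contents"], row["new_contents"]


def chat_turns(example: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return the `messages` field as a list of (role, content) pairs."""
    return [(m["role"], m["content"]) for m in example["messages"]]
```

With streaming enabled, something like `next(iter(load_split("js")))` should fetch a single row without downloading the full split, which is useful given the download size of roughly 3.9 GB stated in the metadata.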
Provider:
nuprl-staging