five

adeo1/mathlib_informal_v4.28.0

收藏
Hugging Face2026-04-16 更新2026-04-26 收录
下载链接:
https://hf-mirror.com/datasets/adeo1/mathlib_informal_v4.28.0
下载链接
链接失效反馈
官方服务:
资源简介:
--- pretty_name: mathlib_informal_v4.28.0 license: mit language: - en - code size_categories: - 100K<n<1M task_categories: - text-retrieval source_datasets: - original tags: - text - code - datasets - lean - mathlib - formal-math - retrieval configs: - config_name: default default: true data_files: - split: train path: mathlib_informal_v4.28.0.jsonl dataset_info: features: - name: doc_id dtype: string - name: full_name dtype: string - name: short_name dtype: string - name: module_name dtype: string - name: file_path dtype: string - name: kind dtype: string - name: signature dtype: string - name: docstring dtype: string - name: formal_statement dtype: string - name: informal_statement dtype: string - name: informal_name dtype: string - name: search_text_formal dtype: string - name: search_text_informal dtype: string - name: search_text_joint dtype: string - name: dependency_names sequence: string - name: imported_module_names sequence: string - name: importance_score dtype: float64 - name: source_corpus dtype: string splits: - name: train num_examples: 183489 num_bytes: 390888788 --- # mathlib_informal_v4.28.0 ## Dataset Summary This dataset contains Lean `v4.28.0` mathlib declarations informalized with the repo's external LeanSearch-based pipeline and published in the retrieval schema used by this codebase. ## What Is Included - `mathlib_informal_v4.28.0.jsonl`: one JSON object per declaration - `dataset_metadata.json`: supplemental provenance, schema, and checksum metadata ## Cleaning And Normalization Machine-local paths were removed from the published rows. The exported dataset keeps the retrieval fields intact, while normalizing: - `file_path` to `Mathlib/...` - `source_corpus` to `Mathlib` - `doc_id` to `<file_path>:<full_name>` ## Loading ```python from datasets import load_dataset ds = load_dataset("adeo1/mathlib_informal_v4.28.0", split="train") print(ds[0]) ``` ## Build Provenance - Lean version: `v4.28.0` - Mathlib version: `v4.28.0` - Build source: `leansearch` - LeanSearch commit: `d461a7669842bd70dc84670e0bdce27c2fe478bd` - Records: `183489`
提供机构:
adeo1
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作