mishig/parameter-golf

Hugging Face, updated 2026-03-23, indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/mishig/parameter-golf
Resource overview:
---
license: odc-by
pretty_name: Parameter Golf FineWeb Export
source_datasets:
- HuggingFaceFW/fineweb
dataset_info:
  features:
  - name: text
    dtype: string
  splits:
  - name: train
    num_examples: 15318808
  - name: validation
    num_examples: 50000
---

# Parameter Golf FineWeb Export

This is a Parquet-converted fork of [`willdepueoai/parameter-golf`](https://huggingface.co/datasets/willdepueoai/parameter-golf), reformatted for compatibility with the HuggingFace dataset viewer and the `datasets` library.

The original repository hosts export artifacts derived from [`HuggingFaceFW/fineweb`](https://huggingface.co/datasets/HuggingFaceFW/fineweb), specifically a ~10B-token subset pulled from the 100B FineWeb cut used for parameter-golf experiments.

## Dataset Structure

- **train**: 15,318,808 documents
- **validation**: 50,000 documents
- Each example contains a single `text` field with the document content.

## Usage

```python
from datasets import load_dataset

ds = load_dataset("mishig/parameter-golf")
```

## License

This dataset is made available under the Open Data Commons Attribution License (ODC-By) v1.0, consistent with the upstream FineWeb dataset license. The upstream FineWeb dataset card states that FineWeb is released under ODC-By v1.0. Use of this derived dataset is also subject to the Common Crawl Terms of Use referenced by FineWeb.

## Upstream References

- Original dataset: <https://huggingface.co/datasets/willdepueoai/parameter-golf>
- FineWeb dataset: <https://huggingface.co/datasets/HuggingFaceFW/fineweb>
- ODC-By v1.0: <https://opendatacommons.org/licenses/by/1-0/>
- Common Crawl Terms of Use: <https://commoncrawl.org/terms-of-use>

## Attribution

Please attribute the upstream FineWeb dataset and this derived export when redistributing or using these artifacts.
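
The card's usage snippet loads every split at once, which for the roughly 15.3 million training documents means a large download. As a minimal sketch, assuming you only want to inspect a few records, the `datasets` library's streaming mode can read the smaller `validation` split lazily. The helper names `preview_texts` and `preview_validation` are hypothetical and not part of the dataset card, and using streaming here is an assumption about a convenient workflow rather than something the card prescribes.

```python
from itertools import islice
from typing import Iterable, Iterator


def preview_texts(examples: Iterable[dict], n: int = 3, max_chars: int = 80) -> Iterator[str]:
    """Yield the `text` field of the first `n` examples, truncated to `max_chars`."""
    for example in islice(examples, n):
        # Each example is documented as carrying a single `text` field.
        yield example["text"][:max_chars]


def preview_validation(n: int = 3) -> None:
    """Hypothetical helper: print short previews from the validation split."""
    # Imported here so preview_texts stays usable without `datasets` installed.
    from datasets import load_dataset

    # streaming=True iterates records lazily instead of downloading the split first.
    stream = load_dataset("mishig/parameter-golf", split="validation", streaming=True)
    for snippet in preview_texts(stream, n=n):
        print(snippet)
```

Calling `preview_validation()` requires network access and the `datasets` package. `preview_texts` works on any iterable of dictionaries with a `text` key, so it can also be tried on local sample data.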
Provider:
mishig