
sumukshashidhar-archive/UltraSelect-Web

Source: Hugging Face. Updated 2025-06-14. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/sumukshashidhar-archive/UltraSelect-Web
Resource description:
---
dataset_info:
- config_name: all
  features:
  - name: content
    dtype: string
  - name: openbmb-fasttext-classifier-score
    dtype: float64
  - name: source
    dtype: string
  - name: fineweb-edu-classifier-score
    dtype: float64
  splits:
  - name: train
    num_bytes: 40498297071
    num_examples: 10143687
  download_size: 23435578665
  dataset_size: 40498297071
- config_name: bronze
  features:
  - name: content
    dtype: string
  - name: openbmb-fasttext-classifier-score
    dtype: float64
  - name: source
    dtype: string
  - name: fineweb-edu-classifier-score
    dtype: float64
  splits:
  - name: train
    num_bytes: 10124575265.8658
    num_examples: 2535922
  download_size: 5264392650
  dataset_size: 10124575265.8658
- config_name: gold
  features:
  - name: content
    dtype: string
  - name: openbmb-fasttext-classifier-score
    dtype: float64
  - name: source
    dtype: string
  - name: fineweb-edu-classifier-score
    dtype: float64
  splits:
  - name: train
    num_bytes: 5701237.451174115
    num_examples: 1428
  download_size: 2661805
  dataset_size: 5701237.451174115
- config_name: platinum
  features:
  - name: content
    dtype: string
  - name: openbmb-fasttext-classifier-score
    dtype: float64
  - name: source
    dtype: string
  - name: fineweb-edu-classifier-score
    dtype: float64
  splits:
  - name: train
    num_bytes: 139736.21203858123
    num_examples: 35
  download_size: 89763
  dataset_size: 139736.21203858123
- config_name: silver
  features:
  - name: content
    dtype: string
  - name: openbmb-fasttext-classifier-score
    dtype: float64
  - name: source
    dtype: string
  - name: fineweb-edu-classifier-score
    dtype: float64
  splits:
  - name: train
    num_bytes: 4049834897.302161
    num_examples: 1014370
  download_size: 2192607766
  dataset_size: 4049834897.302161
configs:
- config_name: all
  data_files:
  - split: train
    path: all/train-*
- config_name: bronze
  data_files:
  - split: train
    path: bronze/train-*
- config_name: gold
  data_files:
  - split: train
    path: gold/train-*
- config_name: platinum
  data_files:
  - split: train
    path: platinum/train-*
- config_name: silver
  data_files:
  - split: train
    path: silver/train-*
---
Provided by:
sumukshashidhar-archive
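
The metadata lists five configs (`all`, `bronze`, `silver`, `gold`, `platinum`), each with a single `train` split. The following is a minimal sketch, not an official usage guide, of how the row counts compare and how one config might be loaded. The helper names `share` and `load_tier` are hypothetical, introduced only for illustration. The ratio is a plain count comparison: the card does not state how the tiers were selected or whether they are subsets of `all`. `load_tier` assumes the Hugging Face `datasets` package is installed and that the repository is reachable under the ID shown.

```python
# Row counts copied from the `num_examples` fields of the dataset card.
NUM_EXAMPLES = {
    "all": 10_143_687,
    "bronze": 2_535_922,
    "silver": 1_014_370,
    "gold": 1_428,
    "platinum": 35,
}

# Repository ID as it appears in the listing.
REPO_ID = "sumukshashidhar-archive/UltraSelect-Web"


def share(config: str) -> float:
    """Ratio of a config's row count to the row count of the `all` config."""
    return NUM_EXAMPLES[config] / NUM_EXAMPLES["all"]


def load_tier(config: str, streaming: bool = True):
    """Load the train split of one config (hypothetical helper).

    Needs the `datasets` package and network access. Streaming is the
    default here because the `all` config is listed at roughly 23 GB
    of download.
    """
    if config not in NUM_EXAMPLES:
        raise ValueError(f"unknown config: {config!r}")
    # Imported lazily so that share() works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split="train", streaming=streaming)


if __name__ == "__main__":
    for name, rows in NUM_EXAMPLES.items():
        print(f"{name:>8}: {rows:>10,} rows ({share(name):.4%} of all)")
```

By these counts, `bronze` holds about 25% as many rows as `all` and `silver` about 10%, while `gold` and `platinum` are far smaller. Starting with `load_tier("platinum")` is the cheapest way to inspect the four columns (`content`, `source`, and the two classifier scores) before downloading a larger config.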