
gayanin/woz-noised-with-prob-dist-v2

Source: Hugging Face | Updated 2024-02-09 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/gayanin/woz-noised-with-prob-dist-v2
Resource description:
---
dataset_info:
- config_name: babylon-prob-01
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2630935
    num_examples: 20304
  - name: test
    num_bytes: 326013
    num_examples: 2538
  - name: validation
    num_bytes: 328959
    num_examples: 2539
  download_size: 1730992
  dataset_size: 3285907
- config_name: babylon-prob-02
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2587653
    num_examples: 20304
  - name: test
    num_bytes: 319619
    num_examples: 2538
  - name: validation
    num_bytes: 323916
    num_examples: 2539
  download_size: 1773744
  dataset_size: 3231188
- config_name: babylon-prob-03
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2543333
    num_examples: 20304
  - name: test
    num_bytes: 314849
    num_examples: 2538
  - name: validation
    num_bytes: 318386
    num_examples: 2539
  download_size: 1803480
  dataset_size: 3176568
- config_name: babylon-prob-04
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2498217
    num_examples: 20304
  - name: test
    num_bytes: 310289
    num_examples: 2538
  - name: validation
    num_bytes: 314195
    num_examples: 2539
  download_size: 1826199
  dataset_size: 3122701
- config_name: babylon-prob-05
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2457853
    num_examples: 20304
  - name: test
    num_bytes: 306097
    num_examples: 2538
  - name: validation
    num_bytes: 307569
    num_examples: 2539
  download_size: 1844172
  dataset_size: 3071519
- config_name: gcd-prob-01
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2580319
    num_examples: 20304
  - name: test
    num_bytes: 326137
    num_examples: 2538
  - name: validation
    num_bytes: 314447
    num_examples: 2539
  download_size: 1672612
  dataset_size: 3220903
- config_name: gcd-prob-02
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2488852
    num_examples: 20304
  - name: test
    num_bytes: 314869
    num_examples: 2538
  - name: validation
    num_bytes: 302499
    num_examples: 2539
  download_size: 1659272
  dataset_size: 3106220
- config_name: gcd-prob-03
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2397420
    num_examples: 20304
  - name: test
    num_bytes: 303076
    num_examples: 2538
  - name: validation
    num_bytes: 291223
    num_examples: 2539
  download_size: 1637199
  dataset_size: 2991719
- config_name: gcd-prob-04
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2306973
    num_examples: 20304
  - name: test
    num_bytes: 291188
    num_examples: 2538
  - name: validation
    num_bytes: 280562
    num_examples: 2539
  download_size: 1608211
  dataset_size: 2878723
- config_name: gcd-prob-05
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2217222
    num_examples: 20304
  - name: test
    num_bytes: 279583
    num_examples: 2538
  - name: validation
    num_bytes: 271343
    num_examples: 2539
  download_size: 1574265
  dataset_size: 2768148
- config_name: kaggle-prob-01
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2575089
    num_examples: 20304
  - name: test
    num_bytes: 318605
    num_examples: 2538
  - name: validation
    num_bytes: 322727
    num_examples: 2538
  download_size: 1666313
  dataset_size: 3216421
- config_name: kaggle-prob-02
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2490036
    num_examples: 20304
  - name: test
    num_bytes: 308433
    num_examples: 2538
  - name: validation
    num_bytes: 310492
    num_examples: 2538
  download_size: 1660616
  dataset_size: 3108961
- config_name: kaggle-prob-03
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2404198
    num_examples: 20304
  - name: test
    num_bytes: 297671
    num_examples: 2538
  - name: validation
    num_bytes: 300859
    num_examples: 2538
  download_size: 1643086
  dataset_size: 3002728
- config_name: kaggle-prob-04
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2316877
    num_examples: 20304
  - name: test
    num_bytes: 286791
    num_examples: 2538
  - name: validation
    num_bytes: 289662
    num_examples: 2538
  download_size: 1618093
  dataset_size: 2893330
- config_name: kaggle-prob-05
  features:
  - name: 'Unnamed: 0'
    dtype: int64
  - name: refs
    dtype: string
  - name: trans
    dtype: string
  splits:
  - name: train
    num_bytes: 2231108
    num_examples: 20304
  - name: test
    num_bytes: 276030
    num_examples: 2538
  - name: validation
    num_bytes: 279656
    num_examples: 2538
  download_size: 1589492
  dataset_size: 2786794
configs:
- config_name: babylon-prob-01
  data_files:
  - split: train
    path: babylon-prob-01/train-*
  - split: test
    path: babylon-prob-01/test-*
  - split: validation
    path: babylon-prob-01/validation-*
- config_name: babylon-prob-02
  data_files:
  - split: train
    path: babylon-prob-02/train-*
  - split: test
    path: babylon-prob-02/test-*
  - split: validation
    path: babylon-prob-02/validation-*
- config_name: babylon-prob-03
  data_files:
  - split: train
    path: babylon-prob-03/train-*
  - split: test
    path: babylon-prob-03/test-*
  - split: validation
    path: babylon-prob-03/validation-*
- config_name: babylon-prob-04
  data_files:
  - split: train
    path: babylon-prob-04/train-*
  - split: test
    path: babylon-prob-04/test-*
  - split: validation
    path: babylon-prob-04/validation-*
- config_name: babylon-prob-05
  data_files:
  - split: train
    path: babylon-prob-05/train-*
  - split: test
    path: babylon-prob-05/test-*
  - split: validation
    path: babylon-prob-05/validation-*
- config_name: gcd-prob-01
  data_files:
  - split: train
    path: gcd-prob-01/train-*
  - split: test
    path: gcd-prob-01/test-*
  - split: validation
    path: gcd-prob-01/validation-*
- config_name: gcd-prob-02
  data_files:
  - split: train
    path: gcd-prob-02/train-*
  - split: test
    path: gcd-prob-02/test-*
  - split: validation
    path: gcd-prob-02/validation-*
- config_name: gcd-prob-03
  data_files:
  - split: train
    path: gcd-prob-03/train-*
  - split: test
    path: gcd-prob-03/test-*
  - split: validation
    path: gcd-prob-03/validation-*
- config_name: gcd-prob-04
  data_files:
  - split: train
    path: gcd-prob-04/train-*
  - split: test
    path: gcd-prob-04/test-*
  - split: validation
    path: gcd-prob-04/validation-*
- config_name: gcd-prob-05
  data_files:
  - split: train
    path: gcd-prob-05/train-*
  - split: test
    path: gcd-prob-05/test-*
  - split: validation
    path: gcd-prob-05/validation-*
- config_name: kaggle-prob-01
  data_files:
  - split: train
    path: kaggle-prob-01/train-*
  - split: test
    path: kaggle-prob-01/test-*
  - split: validation
    path: kaggle-prob-01/validation-*
- config_name: kaggle-prob-02
  data_files:
  - split: train
    path: kaggle-prob-02/train-*
  - split: test
    path: kaggle-prob-02/test-*
  - split: validation
    path: kaggle-prob-02/validation-*
- config_name: kaggle-prob-03
  data_files:
  - split: train
    path: kaggle-prob-03/train-*
  - split: test
    path: kaggle-prob-03/test-*
  - split: validation
    path: kaggle-prob-03/validation-*
- config_name: kaggle-prob-04
  data_files:
  - split: train
    path: kaggle-prob-04/train-*
  - split: test
    path: kaggle-prob-04/test-*
  - split: validation
    path: kaggle-prob-04/validation-*
- config_name: kaggle-prob-05
  data_files:
  - split: train
    path: kaggle-prob-05/train-*
  - split: test
    path: kaggle-prob-05/test-*
  - split: validation
    path: kaggle-prob-05/validation-*
---
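The metadata above defines 15 configurations, formed by combining three prefixes (babylon, gcd, kaggle) with five suffixes (prob-01 to prob-05). A minimal sketch of how one of them might be loaded follows, assuming the Hugging Face `datasets` package is installed and the Hub is reachable. The helper names `config_names` and `load_config` are hypothetical and not part of the dataset.

```python
from itertools import product

REPO_ID = "gayanin/woz-noised-with-prob-dist-v2"
SOURCES = ("babylon", "gcd", "kaggle")
LEVELS = range(1, 6)  # suffixes prob-01 .. prob-05


def config_names():
    """Return the 15 configuration names listed in the metadata."""
    return [f"{src}-prob-{lvl:02d}" for src, lvl in product(SOURCES, LEVELS)]


def load_config(name, split="train"):
    """Load one split of one configuration (requires network access).

    Hypothetical helper: the import is deferred so that the name check
    works even when the `datasets` package is not installed.
    """
    if name not in config_names():
        raise ValueError(f"unknown configuration: {name}")
    from datasets import load_dataset

    return load_dataset(REPO_ID, name, split=split)
```

Calling `load_config("babylon-prob-01")` would be expected to return rows with the `refs` and `trans` string columns plus the `Unnamed: 0` integer column, which looks like a leftover index column and can probably be ignored.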
Provided by:
gayanin
Summary of original information

Dataset overview

Dataset configurations

babylon-prob-01

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2630935 bytes, 20304 examples
    • test: 326013 bytes, 2538 examples
    • validation: 328959 bytes, 2539 examples
  • Download size: 1730992 bytes
  • Dataset size: 3285907 bytes
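As a quick sanity check on these figures, the reported dataset size appears to equal the sum of the three split sizes, and the example counts work out to roughly an 80/10/10 split. The sketch below uses only the numbers listed for babylon-prob-01; the variable names are illustrative.

```python
# Figures copied from the babylon-prob-01 entry.
splits = {
    "train": {"num_bytes": 2630935, "num_examples": 20304},
    "test": {"num_bytes": 326013, "num_examples": 2538},
    "validation": {"num_bytes": 328959, "num_examples": 2539},
}
dataset_size = 3285907

total_bytes = sum(s["num_bytes"] for s in splits.values())
total_examples = sum(s["num_examples"] for s in splits.values())

# Share of examples in each split, rounded to two decimals.
fractions = {
    name: round(s["num_examples"] / total_examples, 2)
    for name, s in splits.items()
}

print(total_bytes == dataset_size)  # True
print(total_examples)               # 25381
print(fractions)
```

The download size (1730992 bytes) is smaller than the dataset size, presumably because the hosted files are compressed; the listing itself does not say so.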

babylon-prob-02

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2587653 bytes, 20304 examples
    • test: 319619 bytes, 2538 examples
    • validation: 323916 bytes, 2539 examples
  • Download size: 1773744 bytes
  • Dataset size: 3231188 bytes

babylon-prob-03

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2543333 bytes, 20304 examples
    • test: 314849 bytes, 2538 examples
    • validation: 318386 bytes, 2539 examples
  • Download size: 1803480 bytes
  • Dataset size: 3176568 bytes

babylon-prob-04

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2498217 bytes, 20304 examples
    • test: 310289 bytes, 2538 examples
    • validation: 314195 bytes, 2539 examples
  • Download size: 1826199 bytes
  • Dataset size: 3122701 bytes

babylon-prob-05

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2457853 bytes, 20304 examples
    • test: 306097 bytes, 2538 examples
    • validation: 307569 bytes, 2539 examples
  • Download size: 1844172 bytes
  • Dataset size: 3071519 bytes

gcd-prob-01

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2580319 bytes, 20304 examples
    • test: 326137 bytes, 2538 examples
    • validation: 314447 bytes, 2539 examples
  • Download size: 1672612 bytes
  • Dataset size: 3220903 bytes

gcd-prob-02

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2488852 bytes, 20304 examples
    • test: 314869 bytes, 2538 examples
    • validation: 302499 bytes, 2539 examples
  • Download size: 1659272 bytes
  • Dataset size: 3106220 bytes

gcd-prob-03

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2397420 bytes, 20304 examples
    • test: 303076 bytes, 2538 examples
    • validation: 291223 bytes, 2539 examples
  • Download size: 1637199 bytes
  • Dataset size: 2991719 bytes

gcd-prob-04

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2306973 bytes, 20304 examples
    • test: 291188 bytes, 2538 examples
    • validation: 280562 bytes, 2539 examples
  • Download size: 1608211 bytes
  • Dataset size: 2878723 bytes

gcd-prob-05

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2217222 bytes, 20304 examples
    • test: 279583 bytes, 2538 examples
    • validation: 271343 bytes, 2539 examples
  • Download size: 1574265 bytes
  • Dataset size: 2768148 bytes

kaggle-prob-01

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2575089 bytes, 20304 examples
    • test: 318605 bytes, 2538 examples
    • validation: 322727 bytes, 2538 examples
  • Download size: 1666313 bytes
  • Dataset size: 3216421 bytes

kaggle-prob-02

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2490036 bytes, 20304 examples
    • test: 308433 bytes, 2538 examples
    • validation: 310492 bytes, 2538 examples
  • Download size: 1660616 bytes
  • Dataset size: 3108961 bytes

kaggle-prob-03

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2404198 bytes, 20304 examples
    • test: 297671 bytes, 2538 examples
    • validation: 300859 bytes, 2538 examples
  • Download size: 1643086 bytes
  • Dataset size: 3002728 bytes

kaggle-prob-04

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2316877 bytes, 20304 examples
    • test: 286791 bytes, 2538 examples
    • validation: 289662 bytes, 2538 examples
  • Download size: 1618093 bytes
  • Dataset size: 2893330 bytes

kaggle-prob-05

  • Features:
    • Unnamed: 0: int64
    • refs: string
    • trans: string
  • Splits:
    • train: 2231108 bytes, 20304 examples
    • test: 276030 bytes, 2538 examples
    • validation: 279656 bytes, 2538 examples
  • Download size: 1589492 bytes
  • Dataset size: 2786794 bytes
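Two patterns in the listed figures can be checked with a short script: within each family the train split gets smaller from prob-01 to prob-05, and the kaggle configurations list 2538 validation examples where the others list 2539. The sketch below only restates the numbers from the listing; the listing does not explain what the prob suffix encodes, so no interpretation is attempted, and the function names are illustrative.

```python
# Train-split sizes in bytes, copied from the listing,
# ordered prob-01 .. prob-05 within each family.
TRAIN_BYTES = {
    "babylon": [2630935, 2587653, 2543333, 2498217, 2457853],
    "gcd": [2580319, 2488852, 2397420, 2306973, 2217222],
    "kaggle": [2575089, 2490036, 2404198, 2316877, 2231108],
}
# Validation example counts per family (identical across prob levels).
VALIDATION_EXAMPLES = {"babylon": 2539, "gcd": 2539, "kaggle": 2538}


def strictly_decreasing(values):
    """True if every value is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


def relative_shrink(values):
    """Fractional size reduction from the first to the last value."""
    return (values[0] - values[-1]) / values[0]


for family, sizes in TRAIN_BYTES.items():
    print(family, strictly_decreasing(sizes), f"{relative_shrink(sizes):.1%}")

# Families whose validation count differs from 2539.
odd_one_out = [f for f, n in VALIDATION_EXAMPLES.items() if n != 2539]
print(odd_one_out)  # ['kaggle']
```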