
jxm/synthbio

Source: Hugging Face | Updated: 2023-11-21 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/jxm/synthbio
Resource description:
---
dataset_info:
  features:
  - name: serialized_attrs
    dtype: string
  - name: biographies
    sequence: string
  - name: notable_type
    dtype: string
  - name: attrs
    struct:
    - name: Bronze
      dtype: string
    - name: Gold
      dtype: string
    - name: Gold, 1984
      dtype: string
    - name: Gold, 1988
      dtype: string
    - name: Gold, 1992
      dtype: string
    - name: Gold, 1994
      dtype: string
    - name: Gold, 1996
      dtype: string
    - name: Gold, 1998
      dtype: string
    - name: Gold, 2002
      dtype: string
    - name: Gold, 2004
      dtype: string
    - name: Self-portrait of Toma Klima (2001)
      dtype: string
    - name: Silver, 2006
      dtype: string
    - name: Silver, 2007
      dtype: string
    - name: agency
      dtype: string
    - name: alias
      dtype: string
    - name: allegiance
      dtype: string
    - name: alma_mater
      dtype: string
    - name: associated_acts
      dtype: string
    - name: awards
      dtype: string
    - name: birth_date
      dtype: string
    - name: birth_name
      dtype: string
    - name: birth_place
      dtype: string
    - name: children
      dtype: string
    - name: citizenship
      dtype: string
    - name: coach
      dtype: string
    - name: codename
      dtype: string
    - name: collegeteam
      dtype: string
    - name: country
      dtype: string
    - name: criminal_penalty
      dtype: string
    - name: death_cause
      dtype: string
    - name: death_date
      dtype: string
    - name: death_place
      dtype: string
    - name: doctoral_advisor
      dtype: string
    - name: education
      dtype: string
    - name: elected
      dtype: string
    - name: event
      dtype: string
    - name: father
      dtype: string
    - name: fields
      dtype: string
    - name: final_ascent
      dtype: string
    - name: gender
      dtype: string
    - name: genre
      dtype: string
    - name: height
      dtype: string
    - name: hometown
      dtype: string
    - name: influenced
      dtype: string
    - name: influences
      dtype: string
    - name: institutions
      dtype: string
    - name: instrument
      dtype: string
    - name: known_for
      dtype: string
    - name: label
      dtype: string
    - name: language
      dtype: string
    - name: main_interests
      dtype: string
    - name: mother
      dtype: string
    - name: movement
      dtype: string
    - name: name
      dtype: string
    - name: national_team
      dtype: string
    - name: nationality
      dtype: string
    - name: notable_ascents
      dtype: string
    - name: notable_students
      dtype: string
    - name: notable_works
      dtype: string
    - name: occupation
      dtype: string
    - name: olympics
      dtype: string
    - name: operation
      dtype: string
    - name: paralympics
      dtype: string
    - name: partner
      dtype: string
    - name: partnerships
      dtype: string
    - name: position
      dtype: string
    - name: resting_place
      dtype: string
    - name: retired
      dtype: string
    - name: serviceyears
      dtype: string
    - name: sport
      dtype: string
    - name: start_age
      dtype: string
    - name: thesis_title
      dtype: string
    - name: thesis_year
      dtype: string
    - name: tradition_movement
      dtype: string
    - name: weight
      dtype: string
    - name: worlds
      dtype: string
    - name: years_active
      dtype: string
  splits:
  - name: train
    num_bytes: 5581070
    num_examples: 2237
  download_size: 2360383
  dataset_size: 5581070
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
# Dataset Card for "synthbio"

[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provided by:
jxm
Summary of original information

Dataset overview

Data features

  • serialized_attrs: string
  • biographies: sequence of strings
  • notable_type: string
  • attrs: struct with the following fields:
    • Bronze: string
    • Gold: string
    • Gold, 1984: string
    • Gold, 1988: string
    • Gold, 1992: string
    • Gold, 1994: string
    • Gold, 1996: string
    • Gold, 1998: string
    • Gold, 2002: string
    • Gold, 2004: string
    • Self-portrait of Toma Klima (2001): string
    • Silver, 2006: string
    • Silver, 2007: string
    • agency: string
    • alias: string
    • allegiance: string
    • alma_mater: string
    • associated_acts: string
    • awards: string
    • birth_date: string
    • birth_name: string
    • birth_place: string
    • children: string
    • citizenship: string
    • coach: string
    • codename: string
    • collegeteam: string
    • country: string
    • criminal_penalty: string
    • death_cause: string
    • death_date: string
    • death_place: string
    • doctoral_advisor: string
    • education: string
    • elected: string
    • event: string
    • father: string
    • fields: string
    • final_ascent: string
    • gender: string
    • genre: string
    • height: string
    • hometown: string
    • influenced: string
    • influences: string
    • institutions: string
    • instrument: string
    • known_for: string
    • label: string
    • language: string
    • main_interests: string
    • mother: string
    • movement: string
    • name: string
    • national_team: string
    • nationality: string
    • notable_ascents: string
    • notable_students: string
    • notable_works: string
    • occupation: string
    • olympics: string
    • operation: string
    • paralympics: string
    • partner: string
    • partnerships: string
    • position: string
    • resting_place: string
    • retired: string
    • serviceyears: string
    • sport: string
    • start_age: string
    • thesis_title: string
    • thesis_year: string
    • tradition_movement: string
    • weight: string
    • worlds: string
    • years_active: string
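
The attrs struct has one field for every attribute name that appears anywhere in the dataset, so any single record presumably leaves most of those fields empty. The sketch below assumes unset fields come back as None, which the card does not confirm. The record, its values, and the helper name present_attrs are invented for illustration and are not taken from the dataset.

```python
from typing import Dict, Optional


def present_attrs(attrs: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Keep only the attribute fields that are actually set for a record."""
    return {key: value for key, value in attrs.items() if value is not None}


# Hypothetical record shaped like the feature list above.
# All values, including the serialized_attrs format, are made up.
example = {
    "serialized_attrs": "name: Jane Doe | occupation: mountaineer",
    "biographies": ["Jane Doe was a mountaineer."],
    "notable_type": "mountaineer",
    "attrs": {
        "name": "Jane Doe",
        "occupation": "mountaineer",
        "coach": None,        # assumed: unset struct fields are None
        "Gold, 1984": None,   # field names with commas are ordinary dict keys
    },
}

print(present_attrs(example["attrs"]))
```

Field names such as "Gold, 1984" or "Self-portrait of Toma Klima (2001)" are not valid Python identifiers, so dictionary-style access is the safer way to read them.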

Data splits

  • train: 2237 examples, 5581070 bytes

Dataset size

  • Download size: 2360383 bytes
  • Dataset size: 5581070 bytes
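
To put the byte counts above in perspective, the following short calculation derives the average size per example and the ratio of download size to dataset size. It uses only the three figures from the card; the variable names are illustrative.

```python
# Figures copied from the dataset card.
NUM_EXAMPLES = 2237
DATASET_BYTES = 5581070
DOWNLOAD_BYTES = 2360383

# Average in-memory size of one example, in bytes.
avg_bytes_per_example = DATASET_BYTES / NUM_EXAMPLES

# Fraction of the dataset size that is actually downloaded.
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES

print(f"average bytes per example: {avg_bytes_per_example:.0f}")
print(f"download size / dataset size: {download_ratio:.2f}")
```

Each example averages roughly 2.5 KB, and the download is a little over 40% of the full dataset size.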

Configurations

  • default: contains the training data files, at path data/train-*
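
A minimal sketch of loading the single train split with the Hugging Face datasets library, assuming that library is installed and the Hub is reachable. The wrapper name load_synthbio is hypothetical; the repository ID and split name come from the card.

```python
REPO_ID = "jxm/synthbio"  # repository name from the card
SPLIT = "train"           # the only split listed in the card


def load_synthbio():
    """Fetch the train split from the Hugging Face Hub (requires network access)."""
    # Imported lazily so this module can be loaded without the library installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=SPLIT)
```

Calling load_synthbio() should return a Dataset whose num_rows matches the 2237 examples reported in the card. The default configuration is used because no configuration name is passed.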