jxm/synthbio
Source: Hugging Face. Updated 2023-11-21; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/jxm/synthbio
Resource description:
---
dataset_info:
  features:
  - name: serialized_attrs
    dtype: string
  - name: biographies
    sequence: string
  - name: notable_type
    dtype: string
  - name: attrs
    struct:
    - name: Bronze
      dtype: string
    - name: Gold
      dtype: string
    - name: Gold, 1984
      dtype: string
    - name: Gold, 1988
      dtype: string
    - name: Gold, 1992
      dtype: string
    - name: Gold, 1994
      dtype: string
    - name: Gold, 1996
      dtype: string
    - name: Gold, 1998
      dtype: string
    - name: Gold, 2002
      dtype: string
    - name: Gold, 2004
      dtype: string
    - name: Self-portrait of Toma Klima (2001)
      dtype: string
    - name: Silver, 2006
      dtype: string
    - name: Silver, 2007
      dtype: string
    - name: agency
      dtype: string
    - name: alias
      dtype: string
    - name: allegiance
      dtype: string
    - name: alma_mater
      dtype: string
    - name: associated_acts
      dtype: string
    - name: awards
      dtype: string
    - name: birth_date
      dtype: string
    - name: birth_name
      dtype: string
    - name: birth_place
      dtype: string
    - name: children
      dtype: string
    - name: citizenship
      dtype: string
    - name: coach
      dtype: string
    - name: codename
      dtype: string
    - name: collegeteam
      dtype: string
    - name: country
      dtype: string
    - name: criminal_penalty
      dtype: string
    - name: death_cause
      dtype: string
    - name: death_date
      dtype: string
    - name: death_place
      dtype: string
    - name: doctoral_advisor
      dtype: string
    - name: education
      dtype: string
    - name: elected
      dtype: string
    - name: event
      dtype: string
    - name: father
      dtype: string
    - name: fields
      dtype: string
    - name: final_ascent
      dtype: string
    - name: gender
      dtype: string
    - name: genre
      dtype: string
    - name: height
      dtype: string
    - name: hometown
      dtype: string
    - name: influenced
      dtype: string
    - name: influences
      dtype: string
    - name: institutions
      dtype: string
    - name: instrument
      dtype: string
    - name: known_for
      dtype: string
    - name: label
      dtype: string
    - name: language
      dtype: string
    - name: main_interests
      dtype: string
    - name: mother
      dtype: string
    - name: movement
      dtype: string
    - name: name
      dtype: string
    - name: national_team
      dtype: string
    - name: nationality
      dtype: string
    - name: notable_ascents
      dtype: string
    - name: notable_students
      dtype: string
    - name: notable_works
      dtype: string
    - name: occupation
      dtype: string
    - name: olympics
      dtype: string
    - name: operation
      dtype: string
    - name: paralympics
      dtype: string
    - name: partner
      dtype: string
    - name: partnerships
      dtype: string
    - name: position
      dtype: string
    - name: resting_place
      dtype: string
    - name: retired
      dtype: string
    - name: serviceyears
      dtype: string
    - name: sport
      dtype: string
    - name: start_age
      dtype: string
    - name: thesis_title
      dtype: string
    - name: thesis_year
      dtype: string
    - name: tradition_movement
      dtype: string
    - name: weight
      dtype: string
    - name: worlds
      dtype: string
    - name: years_active
      dtype: string
  splits:
  - name: train
    num_bytes: 5581070
    num_examples: 2237
  download_size: 2360383
  dataset_size: 5581070
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
# Dataset Card for "synthbio"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
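
The card itself gives no usage notes, so the following is only a minimal sketch of how the schema in the metadata above might be consumed. It assumes the Hugging Face `datasets` library is installed and that fields of the `attrs` struct which do not apply to a given person come back as `None`. The helper names `load_synthbio` and `present_attrs` and the sample record are hypothetical and not part of the dataset.

```python
def load_synthbio(split="train"):
    """Load jxm/synthbio from the Hugging Face Hub (requires network access)."""
    # Imported lazily so the offline helper below works without `datasets`.
    from datasets import load_dataset

    return load_dataset("jxm/synthbio", split=split)


def present_attrs(record):
    """Return only the `attrs` entries that are filled in for one record.

    Assumption: struct fields that do not apply to a record are None.
    """
    return {key: value for key, value in record["attrs"].items() if value is not None}


# Hypothetical record with placeholder values, shaped like the declared schema.
example = {
    "serialized_attrs": "...",
    "biographies": ["..."],
    "notable_type": "scientist",
    "attrs": {
        "name": "Jane Doe",
        "fields": "chemistry",
        "coach": None,
        "Gold, 1984": None,
    },
}

print(present_attrs(example))
```

With network access, `present_attrs(load_synthbio()[0])` would apply the same filter to the first real record.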
Provider:
jxm
Summary of Original Information
Dataset Overview
Data Features
- serialized_attrs: string
- biographies: sequence of strings
- notable_type: string
- attrs: struct with the following fields:
  - Bronze: string
  - Gold: string
  - Gold, 1984: string
  - Gold, 1988: string
  - Gold, 1992: string
  - Gold, 1994: string
  - Gold, 1996: string
  - Gold, 1998: string
  - Gold, 2002: string
  - Gold, 2004: string
  - Self-portrait of Toma Klima (2001): string
  - Silver, 2006: string
  - Silver, 2007: string
  - agency: string
  - alias: string
  - allegiance: string
  - alma_mater: string
  - associated_acts: string
  - awards: string
  - birth_date: string
  - birth_name: string
  - birth_place: string
  - children: string
  - citizenship: string
  - coach: string
  - codename: string
  - collegeteam: string
  - country: string
  - criminal_penalty: string
  - death_cause: string
  - death_date: string
  - death_place: string
  - doctoral_advisor: string
  - education: string
  - elected: string
  - event: string
  - father: string
  - fields: string
  - final_ascent: string
  - gender: string
  - genre: string
  - height: string
  - hometown: string
  - influenced: string
  - influences: string
  - institutions: string
  - instrument: string
  - known_for: string
  - label: string
  - language: string
  - main_interests: string
  - mother: string
  - movement: string
  - name: string
  - national_team: string
  - nationality: string
  - notable_ascents: string
  - notable_students: string
  - notable_works: string
  - occupation: string
  - olympics: string
  - operation: string
  - paralympics: string
  - partner: string
  - partnerships: string
  - position: string
  - resting_place: string
  - retired: string
  - serviceyears: string
  - sport: string
  - start_age: string
  - thesis_title: string
  - thesis_year: string
  - tradition_movement: string
  - weight: string
  - worlds: string
  - years_active: string
Data Splits
- train: 2237 examples, 5581070 bytes
Dataset Size
- Download size: 2360383 bytes
- Dataset size: 5581070 bytes
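
To put the size figures above in perspective, here is a small sketch that derives the average record size and the download-to-dataset ratio from them. The three constants are copied from the metadata; the function name `size_summary` is a hypothetical helper introduced for illustration.

```python
# Figures copied from the dataset metadata.
NUM_EXAMPLES = 2237
DATASET_SIZE_BYTES = 5581070
DOWNLOAD_SIZE_BYTES = 2360383


def size_summary(num_examples, dataset_bytes, download_bytes):
    """Derive a few convenience numbers from the raw size metadata."""
    return {
        "avg_bytes_per_example": dataset_bytes / num_examples,
        "download_to_dataset_ratio": download_bytes / dataset_bytes,
        "dataset_mib": dataset_bytes / 2**20,
    }


summary = size_summary(NUM_EXAMPLES, DATASET_SIZE_BYTES, DOWNLOAD_SIZE_BYTES)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

The result is roughly 2.5 KB per example, with the download at about 42% of the in-memory dataset size.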
Configuration
- default: contains the training data files, at path data/train-*
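
As an illustration of what the `data/train-*` pattern selects, the sketch below matches it against a few file names using Python's `fnmatch` module. This is only an approximation: the Hub resolves patterns with its own glob logic, and the file names listed here are hypothetical, since the actual shard names are not given in the source.

```python
from fnmatch import fnmatchcase

PATTERN = "data/train-*"

# Hypothetical repository file names, for illustration only.
candidate_files = [
    "data/train-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]


def files_for_pattern(pattern, paths):
    """Return the paths that match a shell-style pattern (case-sensitive)."""
    return [path for path in paths if fnmatchcase(path, pattern)]


train_files = files_for_pattern(PATTERN, candidate_files)
print(train_files)
```

Only paths under `data/` whose file name starts with `train-` are selected, which is how a single pattern can cover however many shards the train split has.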



