KomeijiForce/patent_abstract
Source: Hugging Face | Updated: 2024-05-24 | Indexed: 2024-06-12
Download link:
https://hf-mirror.com/datasets/KomeijiForce/patent_abstract
Resource description:
---
dataset_info:
  features:
  - name: text
    dtype: string
  - name: label
    dtype:
      class_label:
        names:
          '0': Human Necessities
          '1': Performing Operations; Transporting
          '2': Chemistry; Metallurgy
          '3': Textiles; Paper
          '4': Fixed Constructions
          '5': Mechanical Engineering; Lightning; Heating; Weapons; Blasting
          '6': Physics
          '7': Electricity
          '8': General tagging of new or cross-sectional technology
  splits:
  - name: train
    num_bytes: 17225101
    num_examples: 25000
  - name: validation
    num_bytes: 3472854
    num_examples: 5000
  - name: test
    num_bytes: 3456733
    num_examples: 5000
  download_size: 12067953
  dataset_size: 24154688
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
  - split: test
    path: data/test-*
---
Provider:
KomeijiForce
Summary of original information
Dataset overview
Dataset features
- text: the text data, stored as a string.
- label: the label data, stored as a class label with the following classes:
  - 0: Human Necessities
  - 1: Performing Operations; Transporting
  - 2: Chemistry; Metallurgy
  - 3: Textiles; Paper
  - 4: Fixed Constructions
  - 5: Mechanical Engineering; Lightning; Heating; Weapons; Blasting
  - 6: Physics
  - 7: Electricity
  - 8: General tagging of new or cross-sectional technology
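The label ids above can be mapped to their class names with a plain lookup table. The following is a minimal sketch that copies the names verbatim from the listing; the names `ID2LABEL`, `LABEL2ID`, and `label_name` are illustrative and not part of the dataset.

```python
# Class names copied verbatim from the dataset's class_label metadata,
# indexed by integer label id (0 through 8).
ID2LABEL = [
    "Human Necessities",
    "Performing Operations; Transporting",
    "Chemistry; Metallurgy",
    "Textiles; Paper",
    "Fixed Constructions",
    "Mechanical Engineering; Lightning; Heating; Weapons; Blasting",
    "Physics",
    "Electricity",
    "General tagging of new or cross-sectional technology",
]

# Reverse lookup from class name to label id.
LABEL2ID = {name: idx for idx, name in enumerate(ID2LABEL)}


def label_name(label_id: int) -> str:
    """Return the class name for a label id, rejecting out-of-range ids."""
    if not 0 <= label_id < len(ID2LABEL):
        raise ValueError(f"label id must be in 0..{len(ID2LABEL) - 1}, got {label_id}")
    return ID2LABEL[label_id]


print(label_name(7))  # prints "Electricity"
```

A list keeps the id-to-name direction as a simple index lookup, and the explicit range check avoids Python's negative indexing silently returning the wrong class.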
Dataset splits
- train: training set, 25,000 examples, 17,225,101 bytes in total.
- validation: validation set, 5,000 examples, 3,472,854 bytes in total.
- test: test set, 5,000 examples, 3,456,733 bytes in total.
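As a quick sanity check on the split sizes listed above, the share of each split can be computed directly from the example counts. This is a small sketch using only the numbers in the listing; the variable names are illustrative.

```python
# Example counts per split, as reported in the dataset metadata.
SPLIT_EXAMPLES = {"train": 25000, "validation": 5000, "test": 5000}

total_examples = sum(SPLIT_EXAMPLES.values())

# Fraction of all examples that falls into each split.
split_fractions = {
    name: count / total_examples for name, count in SPLIT_EXAMPLES.items()
}

for name, count in SPLIT_EXAMPLES.items():
    print(f"{name}: {count} examples ({split_fractions[name]:.1%})")
```

The counts add up to 35,000 examples, so the split is 5:1:1, roughly 71.4% train and 14.3% each for validation and test.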
Dataset size
- Download size: 12,067,953 bytes.
- Total dataset size: 24,154,688 bytes.
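The size figures can be cross-checked against the per-split byte counts. The sketch below uses only values from the listing and confirms that the three splits sum to the reported total; the variable names are illustrative.

```python
# Byte counts per split and overall sizes, as reported in the dataset metadata.
SPLIT_BYTES = {"train": 17225101, "validation": 3472854, "test": 3456733}
SPLIT_EXAMPLES = {"train": 25000, "validation": 5000, "test": 5000}
DATASET_SIZE = 24154688
DOWNLOAD_SIZE = 12067953

# The reported total should equal the sum of the per-split sizes.
sizes_consistent = sum(SPLIT_BYTES.values()) == DATASET_SIZE

# Ratio of the download size to the total dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

# Average number of bytes per example in each split.
avg_bytes = {name: SPLIT_BYTES[name] / SPLIT_EXAMPLES[name] for name in SPLIT_BYTES}

print(f"sizes consistent: {sizes_consistent}")
print(f"download / dataset size: {download_ratio:.3f}")
for name, value in avg_bytes.items():
    print(f"{name}: {value:.1f} bytes per example")
```

The download is about half the size of the full dataset, and each example averages a little under 700 bytes in every split.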
Data file configuration
- config_name: default
- data_files:
  - train: path data/train-*
  - validation: path data/validation-*
  - test: path data/test-*
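Given the configuration above, one plausible way to load a split is through the Hugging Face `datasets` library. This is a sketch under assumptions: it presumes the `datasets` package is installed and the repository is reachable, and the helper `load_split` and the constants `REPO_ID` and `DATA_FILES` are hypothetical names introduced here for illustration.

```python
# Repository id and per-split file patterns, taken from the listing above.
REPO_ID = "KomeijiForce/patent_abstract"
DATA_FILES = {
    "train": "data/train-*",
    "validation": "data/validation-*",
    "test": "data/test-*",
}


def load_split(split: str = "train"):
    """Load one split of the dataset (hypothetical helper, needs network access)."""
    if split not in DATA_FILES:
        raise ValueError(f"unknown split {split!r}; expected one of {sorted(DATA_FILES)}")
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # The default config already maps each split to its data files,
    # so only the repository id and the split name are needed.
    return load_dataset(REPO_ID, split=split)
```

Calling `load_split("validation")` should return a dataset with `text` and `label` columns; if the label column loads as a `ClassLabel` feature, `ds.features["label"].int2str(0)` would give the name of class 0.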



