mugeakbulut/turkish_Kadi_Sicilleri-ds-mini
Source: Hugging Face. Updated: 2023-12-02. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/mugeakbulut/turkish_Kadi_Sicilleri-ds-mini
Resource description:
---
dataset_info:
  features:
  - name: DOCNO
    dtype: int64
  - name: ARXIVID_xml
    dtype: string
  - name: ARXIVID
    dtype: string
  - name: Link
    dtype: string
  - name: Title
    dtype: string
  - name: Authors
    dtype: string
  - name: 'Year '
    dtype: int64
  - name: Submitted on (gün, ay, yıl olarak submission tarihi)
    dtype: string
  - name: Submission history (v1 de dahil olmak üzere hepsi)
    dtype: string
  - name: Last revised tarihi
    dtype: string
  - name: content
    dtype: string
  - name: Comments
    dtype: string
  - name: Subject
    dtype: string
  - name: Journal reference
    dtype: string
  - name: DOI
    dtype: string
  - name: Cite as
    dtype: string
  - name: 'Unnamed: 16'
    dtype: float64
  - name: 'Unnamed: 17'
    dtype: float64
  - name: 'Unnamed: 18'
    dtype: float64
  - name: 'Unnamed: 19'
    dtype: float64
  - name: 'Unnamed: 20'
    dtype: string
  - name: Abstract_no
    dtype: string
  - name: 'Unnamed: 22'
    dtype: string
  - name: review
    dtype: string
  - name: content_length
    dtype: int64
  splits:
  - name: train
    num_bytes: 1486810.797385621
    num_examples: 413
  - name: validation
    num_bytes: 165601.2026143791
    num_examples: 46
  download_size: 810073
  dataset_size: 1652412.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
---
Provider:
mugeakbulut
Summary of original information
Dataset overview
Features
The dataset contains the following features:
- DOCNO: int64
- ARXIVID_xml: string
- ARXIVID: string
- Link: string
- Title: string
- Authors: string
- Year (named 'Year ' with a trailing space in the metadata): int64
- Submitted on (gün, ay, yıl olarak submission tarihi): string
- Submission history (v1 de dahil olmak üzere hepsi): string
- Last revised tarihi: string
- content: string
- Comments: string
- Subject: string
- Journal reference: string
- DOI: string
- Cite as: string
- Unnamed: 16: float64
- Unnamed: 17: float64
- Unnamed: 18: float64
- Unnamed: 19: float64
- Unnamed: 20: string
- Abstract_no: string
- Unnamed: 22: string
- review: string
- content_length: int64
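Two details in the feature list are easy to trip over: the year column is stored under the name 'Year ' (with a trailing space), and several columns carry placeholder names of the form "Unnamed: N", which is what pandas assigns to columns without a header. The following is a minimal sketch, not part of the dataset card, of how such names could be detected before use. The helper name `plan_column_cleanup` and the abbreviated column list are illustrative assumptions.

```python
import re

# Column names copied from the feature list (subset shown for brevity).
RAW_COLUMNS = [
    "DOCNO", "ARXIVID_xml", "ARXIVID", "Link", "Title", "Authors", "Year ",
    "content", "Unnamed: 16", "Unnamed: 17", "Unnamed: 18", "Unnamed: 19",
    "Unnamed: 20", "Abstract_no", "Unnamed: 22", "review", "content_length",
]

# Matches placeholder names such as "Unnamed: 16".
_PLACEHOLDER = re.compile(r"^Unnamed: \d+$")


def plan_column_cleanup(columns):
    """Return (rename_map, placeholder_columns) for a list of column names.

    rename_map maps names with stray surrounding whitespace to their
    stripped form; placeholder_columns lists the "Unnamed: N" names.
    Hypothetical helper, written for illustration only.
    """
    rename_map = {}
    placeholders = []
    for name in columns:
        if _PLACEHOLDER.match(name):
            placeholders.append(name)
            continue
        stripped = name.strip()
        if stripped != name:
            rename_map[name] = stripped
    return rename_map, placeholders


rename_map, placeholders = plan_column_cleanup(RAW_COLUMNS)
print(rename_map)
print(placeholders)
```

With the Hugging Face `datasets` library, the resulting mapping could be passed to `Dataset.rename_columns`, and the placeholder list to `Dataset.remove_columns`. Since "Unnamed: 20" and "Unnamed: 22" are typed as string, it is worth inspecting their contents before dropping them.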
Data splits
The dataset is divided into the following splits:
- train: 413 examples, 1486810.797385621 bytes
- validation: 46 examples, 165601.2026143791 bytes
Data size
- Download size: 810073 bytes
- Dataset size: 1652412.0 bytes
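The fractional byte counts reported for the two splits look odd at first. They are consistent with the total dataset size (1652412.0 bytes) being divided in proportion to the number of examples in each split, which also shows that the split is roughly 90/10. Whether the tooling that produced the card computed them this way is an assumption; the sketch below only checks the arithmetic.

```python
# Figures taken from the split and size metadata of the dataset card.
NUM_EXAMPLES = {"train": 413, "validation": 46}
DATASET_SIZE = 1652412.0  # bytes

total_examples = sum(NUM_EXAMPLES.values())

# Fraction of all examples that falls into each split.
share = {name: n / total_examples for name, n in NUM_EXAMPLES.items()}

# Byte count per split if the total size is allocated proportionally.
estimated_bytes = {
    name: DATASET_SIZE * n / total_examples for name, n in NUM_EXAMPLES.items()
}

for name in NUM_EXAMPLES:
    print(f"{name}: share={share[name]:.4f}, bytes={estimated_bytes[name]:.6f}")
```

The proportional estimates agree with the reported `num_bytes` values to within floating-point rounding, so the per-split sizes should be read as estimates rather than measured file sizes.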
Configuration
- config_name: default
- data_files:
  - train: path data/train-*
  - validation: path data/validation-*
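The `data_files` entries are glob patterns that assign files in the repository to splits. As a rough illustration of how that mapping works, the sketch below matches file paths against the two patterns using Python's standard `fnmatch` module. The file names are hypothetical examples, `split_for` is an illustrative helper, and the Hub's actual pattern resolution may differ in details.

```python
from fnmatch import fnmatch

# Patterns copied from the default config of the dataset card.
DATA_FILES = {
    "train": "data/train-*",
    "validation": "data/validation-*",
}


def split_for(path, data_files=DATA_FILES):
    """Return the split whose pattern matches path, or None if none does."""
    for split, pattern in data_files.items():
        if fnmatch(path, pattern):
            return split
    return None


# Hypothetical file names, chosen only to exercise the patterns.
example_paths = [
    "data/train-00000-of-00001.parquet",
    "data/validation-00000-of-00001.parquet",
    "README.md",
]

for path in example_paths:
    print(path, "->", split_for(path))
```

In practice the `datasets` library resolves these patterns itself, so a call such as `load_dataset("mugeakbulut/turkish_Kadi_Sicilleri-ds-mini", split="validation")` would normally be enough to select a split.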



