KETI-AIR/kor_dbpedia_14
Source: Hugging Face. Updated 2023-12-05; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/KETI-AIR/kor_dbpedia_14
Description:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
dataset_info:
  features:
  - name: data_index_by_user
    dtype: int32
  - name: title
    dtype: string
  - name: content
    dtype: string
  - name: label
    dtype: int32
  splits:
  - name: train
    num_bytes: 207331112
    num_examples: 560000
  - name: test
    num_bytes: 25970187
    num_examples: 70000
  download_size: 136871622
  dataset_size: 233301299
license: cc-by-sa-3.0
---
# Dataset Card for "kor_dbpedia_14"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
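The card itself gives no usage instructions, so the following is only a minimal sketch of how the dataset could be loaded, assuming the Hugging Face `datasets` package is installed and the Hub is reachable. The helper names `load_split` and `matches_card` are hypothetical and introduced here for illustration; the column names and row counts are taken from the metadata above.

```python
from typing import Any

DATASET_ID = "KETI-AIR/kor_dbpedia_14"

# Values copied from the card's metadata (features and splits).
EXPECTED_COLUMNS = {"data_index_by_user", "title", "content", "label"}
EXPECTED_ROWS = {"train": 560_000, "test": 70_000}


def load_split(split: str = "train") -> Any:
    """Download one split from the Hub (hypothetical helper).

    Requires the `datasets` package and network access.
    """
    if split not in EXPECTED_ROWS:
        raise ValueError(f"unknown split {split!r}; expected one of {sorted(EXPECTED_ROWS)}")
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)


def matches_card(ds: Any, split: str) -> bool:
    """Check that a loaded split has the columns and row count listed in the card."""
    return set(ds.column_names) == EXPECTED_COLUMNS and len(ds) == EXPECTED_ROWS[split]
```

A typical call would be `test = load_split("test")` followed by `matches_card(test, "test")`; each row is then expected to be a dict with the four fields listed under `features`.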
# Source Data Citation Information
```
Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).
Lehmann, Jens, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N. Mendes, Sebastian Hellmann et al. "DBpedia–a large-scale, multilingual knowledge base extracted from Wikipedia." Semantic web 6, no. 2 (2015): 167-195.
```
Provider:
KETI-AIR
Summary of original information
Dataset overview
Dataset configuration
- Default configuration:
  - Training set: path data/train-*
  - Test set: path data/test-*
Dataset information
- Features:
  - data_index_by_user: int32
  - title: string
  - content: string
  - label: int32
- Splits:
  - Training set:
    - Bytes: 207331112
    - Examples: 560000
  - Test set:
    - Bytes: 25970187
    - Examples: 70000
- Sizes:
  - Download size: 136871622 bytes
  - Dataset size: 233301299 bytes
License
- License: cc-by-sa-3.0
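The size figures above can be cross-checked with a little arithmetic. The sketch below uses only the numbers listed in the metadata; the variable names are illustrative, and the derived quantities (split fraction, average bytes per example, download-to-dataset ratio) are computed here rather than stated by the source.

```python
# Figures copied from the dataset metadata.
SPLITS = {
    "train": {"num_bytes": 207_331_112, "num_examples": 560_000},
    "test": {"num_bytes": 25_970_187, "num_examples": 70_000},
}
DOWNLOAD_SIZE = 136_871_622
DATASET_SIZE = 233_301_299

# The per-split byte counts should add up to the reported dataset size.
total_bytes = sum(s["num_bytes"] for s in SPLITS.values())
total_examples = sum(s["num_examples"] for s in SPLITS.values())

# Share of all examples that fall in the training split.
train_fraction = SPLITS["train"]["num_examples"] / total_examples

# Average stored size of one example in each split, in bytes.
avg_bytes = {name: s["num_bytes"] / s["num_examples"] for name, s in SPLITS.items()}

# Ratio of the downloaded files to the reported dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(f"total bytes match dataset_size: {total_bytes == DATASET_SIZE}")
print(f"total examples: {total_examples}")
print(f"train fraction: {train_fraction:.3f}")
print(f"average bytes per example: {avg_bytes}")
print(f"download / dataset size: {download_ratio:.3f}")
```

The two split sizes sum exactly to the reported dataset size, and the split is 8:1 between training and test examples.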



