yongchanskii/text-data-various-domain
Source: Hugging Face | Updated 2023-12-15 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/yongchanskii/text-data-various-domain
Resource description:
---
dataset_info:
- config_name: default
  features:
  - name: docId
    dtype: string
  - name: category
    dtype: string
  - name: domainTag
    dtype: string
  - name: text
    dtype: string
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 3898906.4
    num_examples: 12000
  - name: test
    num_bytes: 974726.6
    num_examples: 3000
  download_size: 2812933
  dataset_size: 4873633.0
- config_name: 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
  features:
  - name: docId
    dtype: string
  - name: category
    dtype: string
  - name: domainTag
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 830349677
    num_examples: 2608866
  - name: test
    num_bytes: 207814022
    num_examples: 652217
  download_size: 624238878
  dataset_size: 1038163699
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
- config_name: 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
  data_files:
  - split: train
    path: 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/train-*
  - split: test
    path: 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/test-*
---
# Dataset Card for "text-data-various-domain"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
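
The card itself gives no usage instructions. As a minimal sketch, the two configurations declared in the metadata could probably be loaded as shown below. This assumes the Hugging Face `datasets` library is installed and that the repository is still publicly reachable, neither of which this page confirms. The helper name `load_split` is hypothetical and not part of the dataset or the library.

```python
# Repository ID and configuration names are copied from the card metadata.
REPO_ID = "yongchanskii/text-data-various-domain"
CONFIG_NAMES = ["default", "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"]


def load_split(config_name: str = "default", split: str = "train"):
    """Load one split of one configuration (hypothetical helper).

    Assumes the `datasets` package is installed and the Hub is reachable.
    """
    if config_name not in CONFIG_NAMES:
        raise ValueError(f"unknown config: {config_name!r}")
    # Imported lazily so that this module can be imported without `datasets`.
    from datasets import load_dataset

    # The second positional argument of `load_dataset` selects the configuration.
    return load_dataset(REPO_ID, config_name, split=split)
```

If the metadata is accurate, `load_split("default", "test")` should return a dataset of 3000 rows with the columns `docId`, `category`, `domainTag`, `text`, and `__index_level_0__`.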
Provider:
yongchanskii
Summary of original information
Dataset overview
Configuration information
- Configuration name: default
  - Features:
    - docId: string
    - category: string
    - domainTag: string
    - text: string
    - `__index_level_0__`: int64
  - Splits:
    - train:
      - Bytes: 3898906.4
      - Examples: 12000
    - test:
      - Bytes: 974726.6
      - Examples: 3000
  - Download size: 2812933
  - Dataset size: 4873633.0
- Configuration name: 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
  - Features:
    - docId: string
    - category: string
    - domainTag: string
    - text: string
  - Splits:
    - train:
      - Bytes: 830349677
      - Examples: 2608866
    - test:
      - Bytes: 207814022
      - Examples: 652217
  - Download size: 624238878
  - Dataset size: 1038163699
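
The example counts listed above suggest that both configurations use roughly an 80/20 train/test split. The card does not state this; it is an observation derived from the numbers. A small sketch that checks the arithmetic, with the counts copied from the metadata and a hypothetical helper name `train_fraction`:

```python
# Example counts copied from the dataset metadata.
SPLITS = {
    "default": {"train": 12000, "test": 3000},
    "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": {"train": 2608866, "test": 652217},
}


def train_fraction(config_name: str) -> float:
    """Return the share of examples that fall in the train split."""
    sizes = SPLITS[config_name]
    return sizes["train"] / (sizes["train"] + sizes["test"])


for name in SPLITS:
    # Both configurations print a train fraction of 0.8000.
    print(f"{name}: train fraction = {train_fraction(name):.4f}")
```

The byte counts are consistent in the same way: for each configuration, the train and test `num_bytes` values add up to the reported dataset size.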
Data file paths
- Configuration name: default
  - Train data path: data/train-*
  - Test data path: data/test-*
- Configuration name: 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
  - Train data path: 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/train-*
  - Test data path: 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/test-*
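
The paths above are glob patterns, so each split may be stored as one or more shard files. The sketch below illustrates how a file path would map to a configuration and split, using Python's standard `fnmatch` module. It only approximates the matching and is not the Hub's actual resolution logic. The function name `resolve` and the shard file name in the example call are hypothetical; the real file names in the repository are not listed on this page.

```python
from fnmatch import fnmatchcase

# Patterns copied from the data file paths listed in the card.
DATA_FILES = {
    "default": {
        "train": "data/train-*",
        "test": "data/test-*",
    },
    "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": {
        "train": "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/train-*",
        "test": "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/test-*",
    },
}


def resolve(path: str) -> list[tuple[str, str]]:
    """Return the (config, split) pairs whose pattern matches `path`."""
    return [
        (config, split)
        for config, splits in DATA_FILES.items()
        for split, pattern in splits.items()
        if fnmatchcase(path, pattern)
    ]


# Hypothetical shard name, used only for illustration.
print(resolve("data/train-00000-of-00001.parquet"))  # [('default', 'train')]
```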



