AndersGiovanni/10-dim
Source: Hugging Face. Updated 2024-01-25, indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/AndersGiovanni/10-dim
Resource description:
---
language:
- en
license: mit
size_categories:
- 1K<n<10K
task_categories:
- text-classification
pretty_name: 10 Social Dimensions
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
  - split: test
    path: data/test-*
dataset_info:
  features:
  - name: text
    dtype: string
  - name: labels
    sequence: int64
  splits:
  - name: train
    num_bytes: 2237355.6300445576
    num_examples: 5498
  - name: validation
    num_bytes: 479375.2150222788
    num_examples: 1178
  - name: test
    num_bytes: 479782.1549331636
    num_examples: 1179
  download_size: 1723668
  dataset_size: 3196513.0
---
# Dataset Card for "10-dim"
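The metadata above declares three splits (train, validation, test) with 5498, 1178, and 1179 examples. A minimal sketch of loading the dataset and checking those counts might look like the following. It assumes the `datasets` library is installed and the Hub is reachable; `EXPECTED_SIZES`, `load_ten_dim`, and `split_size_mismatches` are hypothetical names introduced here for illustration and are not part of the dataset.

```python
# Split sizes as declared in the card's YAML metadata.
EXPECTED_SIZES = {"train": 5498, "validation": 1178, "test": 1179}


def load_ten_dim():
    """Download the dataset from the Hugging Face Hub (hypothetical helper)."""
    # Imported lazily so the checker below works without the library installed.
    from datasets import load_dataset

    return load_dataset("AndersGiovanni/10-dim")


def split_size_mismatches(splits, expected=EXPECTED_SIZES):
    """Return {split: (expected, actual)} for splits whose row count differs.

    `splits` is any mapping from split name to an object supporting len(),
    such as the DatasetDict returned by load_ten_dim(). A missing split is
    reported with an actual size of None.
    """
    mismatches = {}
    for name, size in expected.items():
        actual = len(splits[name]) if name in splits else None
        if actual != size:
            mismatches[name] = (size, actual)
    return mismatches


# Usage (needs network access and the `datasets` package):
# ds = load_ten_dim()
# print(split_size_mismatches(ds))
```

An empty result means the downloaded splits match the sizes stated in the card.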
### Map labels to strings
```python
# List of labels; a label's id is its position in this list.
labels = [
    "social_support",
    "conflict",
    "trust",
    "fun",
    "similarity",
    "identity",
    "respect",
    "romance",
    "knowledge",
    "power",
]

# Mappings between id and label.
id2label = {i: label for i, label in enumerate(labels)}
label2id = {label: i for i, label in enumerate(labels)}

# Given an example, this is how you map its multi-hot vector to label names.
sample = {
    "text": "This is just a made up text",
    "labels": [0, 0, 0, 1, 0, 0, 0, 0, 0, 1],
}
labels_str = [id2label[i] for i, label in enumerate(sample["labels"]) if label == 1]
# labels_str == ["fun", "power"]
```
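The snippet above decodes a multi-hot vector into label names. When preparing targets for a classifier, the opposite direction is often needed too. The following is a small sketch of both directions. `encode` and `decode` are hypothetical helper names and not part of the dataset; the label order is copied from the card.

```python
# Label order copied from the card; a label's id is its position in the list.
LABELS = [
    "social_support", "conflict", "trust", "fun", "similarity",
    "identity", "respect", "romance", "knowledge", "power",
]
LABEL2ID = {label: i for i, label in enumerate(LABELS)}


def encode(names):
    """Turn label names into a multi-hot vector (hypothetical helper)."""
    vector = [0] * len(LABELS)
    for name in names:
        vector[LABEL2ID[name]] = 1  # raises KeyError on an unknown label
    return vector


def decode(vector):
    """Turn a multi-hot vector back into label names (hypothetical helper)."""
    if len(vector) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} entries, got {len(vector)}")
    return [LABELS[i] for i, flag in enumerate(vector) if flag == 1]


print(encode(["fun", "power"]))                 # [0, 0, 0, 1, 0, 0, 0, 0, 0, 1]
print(decode([0, 0, 0, 1, 0, 0, 0, 0, 0, 1]))  # ['fun', 'power']
```

The length check in `decode` guards against silently decoding a vector built with a different label list.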
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provided by:
AndersGiovanni
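Summary of original information
Dataset overview
Basic information
- Language: English
- License: MIT
- Size category: 1K<n<10K
- Task category: text classification
- Dataset name: 10 Social Dimensions
Configuration
- Config name: default
- Data files:
  - Train: data/train-*
  - Validation: data/validation-*
  - Test: data/test-*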
Dataset information
- Features:
  - text: string
  - labels: sequence of integers
- Splits:
  - Train:
    - Bytes: 2237355.6300445576
    - Examples: 5498
  - Validation:
    - Bytes: 479375.2150222788
    - Examples: 1178
  - Test:
    - Bytes: 479782.1549331636
    - Examples: 1179
- Download size: 1723668
- Dataset size: 3196513.0
Label mapping
- Label list:
  - "social_support"
  - "conflict"
  - "trust"
  - "fun"
  - "similarity"
  - "identity"
  - "respect"
  - "romance"
  - "knowledge"
  - "power"
- id-to-label mapping: `id2label = {i: label for i, label in enumerate(labels)}`
- Label-to-id mapping: `label2id = {label: i for i, label in enumerate(labels)}`
- Example mapping: `sample = {"text": "This is just a made up text", "labels": [0, 0, 0, 1, 0, 0, 0, 0, 0, 1]}`, then `labels_str = [id2label[i] for i, label in enumerate(sample["labels"]) if label == 1]`



