jjzha/gnehm
Source: Hugging Face. Updated 2023-09-07; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/jjzha/gnehm
Description:
---
license: cc-by-nc-sa-4.0
language: de
---
This is the skill dataset created by:
```
@inproceedings{gnehm-etal-2022-fine,
title = "Fine-Grained Extraction and Classification of Skill Requirements in {G}erman-Speaking Job Ads",
author = {Gnehm, Ann-sophie and
B{\"u}hlmann, Eva and
Buchs, Helen and
Clematide, Simon},
booktitle = "Proceedings of the Fifth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS)",
month = nov,
year = "2022",
address = "Abu Dhabi, UAE",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.nlpcss-1.2",
doi = "10.18653/v1/2022.nlpcss-1.2",
pages = "14--24",
}
```
Document boundaries are indicated by the `idx` field.
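As a minimal sketch of how `idx` might be used, the snippet below groups sentence-level records into documents. It assumes that sentences sharing the same `idx` value belong to the same job ad, which the card implies but does not spell out. The helper `group_by_document` is hypothetical, and the second and third records are invented toy data in the dataset's schema.

```python
from collections import defaultdict


def group_by_document(samples):
    """Group sentence-level records by their `idx` value.

    Assumption: records with the same `idx` come from the same document.
    """
    documents = defaultdict(list)
    for sample in samples:
        documents[sample["idx"]].append(sample)
    return dict(documents)


# The first record is the sample from this card; the other two are
# invented placeholders used only to illustrate the grouping.
samples = [
    {"idx": 198,
     "tokens": ["-", "besitzen", "fundierte", "Anwenderkenntnisse", "in", "MS-Office"],
     "tags_skill": ["O", "O", "O", "O", "O", "B-ICT"]},
    {"idx": 198, "tokens": ["Teamfähigkeit"], "tags_skill": ["O"]},
    {"idx": 199, "tokens": ["Erfahrung", "mit", "SQL"], "tags_skill": ["O", "O", "B-ICT"]},
]

documents = group_by_document(samples)
for doc_id, sentences in documents.items():
    print(doc_id, len(sentences))
```

Grouping this way keeps all sentences of one document together, which can help when building document-level splits or context windows.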
Number of samples (sentences):
- train: 19889
- dev: 2332
- test: 2557
Sources:
- Swiss Job Market Monitor (SJMM): https://www.swissubase.ch/en/
Type of tags:
- BIO tags (`B-ICT`, `I-ICT`, `O`), stored under the key `tags_skill`
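The tagging scheme can be sanity-checked with a short sketch like the one below. It assumes the standard BIO convention, in which an `I-ICT` tag may only follow `B-ICT` or another `I-ICT`; the card does not state this explicitly. The function `is_valid_bio` is a hypothetical helper, not part of the dataset.

```python
def is_valid_bio(tags):
    """Return True if the sequence follows the standard BIO convention.

    Every I-<label> tag must directly follow B-<label> or I-<label>,
    and every tag must be "O" or start with "B-" or "I-".
    """
    previous = "O"
    for tag in tags:
        if tag.startswith("I-"):
            label = tag[2:]
            if previous not in (f"B-{label}", f"I-{label}"):
                return False
        elif tag != "O" and not tag.startswith("B-"):
            return False
        previous = tag
    return True


print(is_valid_bio(["O", "B-ICT", "I-ICT", "O"]))  # well-formed span
print(is_valid_bio(["O", "I-ICT"]))                # I- tag with no opening B- tag
```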
Sample:
```
{
"idx": 198,
"tokens": ["-", "besitzen", "fundierte", "Anwenderkenntnisse", "in", "MS-Office"],
"tags_skill": ["O", "O", "O", "O", "O", "B-ICT"]
}
```
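To show how the `tokens` and `tags_skill` fields line up, here is a minimal sketch that collects tagged skill spans from one record. The function name `extract_skill_spans` is hypothetical, and the code assumes standard BIO decoding: a `B-` tag opens a span, following `I-` tags extend it, and anything else closes it.

```python
def extract_skill_spans(tokens, tags):
    """Collect token spans marked with B-/I- tags as space-joined strings."""
    spans = []
    current = []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span starts; flush any span that was still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag.startswith("I-") and current:
            # Continue the open span.
            current.append(token)
        else:
            # "O" (or an I- tag with no open span) closes the current span.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


sample = {
    "idx": 198,
    "tokens": ["-", "besitzen", "fundierte", "Anwenderkenntnisse", "in", "MS-Office"],
    "tags_skill": ["O", "O", "O", "O", "O", "B-ICT"],
}

print(extract_skill_spans(sample["tokens"], sample["tags_skill"]))  # ['MS-Office']
```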
Provider:
jjzha
Summary of original information
Dataset overview
Basic information
- License: cc-by-nc-sa-4.0
- Language: German
Dataset creation
- Creators: Ann-sophie Gnehm, Eva Bühlmann, Helen Buchs, Simon Clematide
- Venue: Fifth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS)
- Title: Fine-Grained Extraction and Classification of Skill Requirements in German-Speaking Job Ads
- Published: November 2022
- Location: Abu Dhabi, UAE
- Publisher: Association for Computational Linguistics
- Paper link: https://aclanthology.org/2022.nlpcss-1.2
- DOI: 10.18653/v1/2022.nlpcss-1.2
Dataset structure
- Number of samples:
  - Training set: 19889 sentences
  - Development set: 2332 sentences
  - Test set: 2557 sentences
Data source
- Source: Swiss Job Market Monitor (SJMM)
- Link: https://www.swissubase.ch/en/
Tag types
- Tag types: BI(-ICT) and O tags
- Tag key: `tags_skill`
Sample
```json
{
  "idx": 198,
  "tokens": ["-", "besitzen", "fundierte", "Anwenderkenntnisse", "in", "MS-Office"],
  "tags_skill": ["O", "O", "O", "O", "O", "B-ICT"]
}
```



