jensjorisdecorte/skill-extraction-house
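Dataset Overview
Dataset Name
- Name: skill-extraction-house
- Aliases: jensjorisdecorte/skill-extraction-house, Skill Extraction - HOUSE
Dataset Description
- Description: This dataset extends the HOUSE subset of the SkillSpan dataset. It contains the spans of skill mentions within sentences, each annotated with the corresponding ESCO skill (ESCO v1.1.0). It is one part of a three-part evaluation set for skill extraction, which consists of skill-extraction-tech, skill-extraction-house, and skill-extraction-techwolf.
Dataset Creator
- Creator: Jens-Joris Decorte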
- URL: https://hf-mirror.com/jensjorisdecorte
Dataset Keywords
- Keywords: text-classification, English, mit, 1K - 10K, csv, Text, Datasets, pandas, Croissant, arxiv:2209.05987, arxiv:2204.12811, 🇺🇸 Region: US, Skill Extraction
Dataset License
- License: MIT License
- URL: https://choosealicense.com/licenses/mit/
Dataset URL
- URL: https://hf-mirror.com/datasets/jensjorisdecorte/skill-extraction-house
Dataset Structure
- Distribution:
  - Type: cr:FileObject
    ID: repo
    Name: repo
    Description: The HF Mirror git repository.
    Content URL: https://hf-mirror.com/datasets/jensjorisdecorte/skill-extraction-house/tree/refs%2Fconvert%2Fparquet
    Encoding format: git+https
    sha256: https://github.com/mlcommons/croissant/issues/80
  - Type: cr:FileSet
    ID: parquet-files-for-config-default
    Name: parquet-files-for-config-default
    Description: The underlying Parquet files as converted by HF Mirror (see: https://hf-mirror.com/docs/datasets-server/parquet).
    Contained in: repo
    Encoding format: application/x-parquet
    Includes: default//.parquet
- Record set:
  - Type: cr:RecordSet
  - ID: default
  - Name: default
  - Description: jensjorisdecorte/skill-extraction-house - default subset (2 splits: validation, test)
  - Fields:
    - Type: cr:Field
      ID: default/sentence
      Name: default/sentence
      Description: Column sentence from the HF Mirror parquet file.
      Data type: sc:Text
      Source:
      - File set: parquet-files-for-config-default
      - Extract: column: sentence
    - Type: cr:Field
      ID: default/span
      Name: default/span
      Description: Column span from the HF Mirror parquet file.
      Data type: sc:Text
      Source:
      - File set: parquet-files-for-config-default
      - Extract: column: span
    - Type: cr:Field
      ID: default/sub_span
      Name: default/sub_span
      Description: Column sub_span from the HF Mirror parquet file.
      Data type: sc:Text
      Source:
      - File set: parquet-files-for-config-default
      - Extract: column: sub_span
    - Type: cr:Field
      ID: default/label
      Name: default/label
      Description: Column label from the HF Mirror parquet file.
      Data type: sc:Text
      Source:
      - File set: parquet-files-for-config-default
      - Extract: column: label
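The record set above exposes four text columns (sentence, span, sub_span, label). As a minimal sketch of how such rows might be consumed, the snippet below groups annotations by sentence. It assumes each row carries one annotation and that a sentence can appear in several rows, which this card does not confirm. The helper name `group_labels_by_sentence` and the rows in `SAMPLE_CSV` are hypothetical and invented for illustration; they are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Column names as listed in the record set of this card.
EXPECTED_COLUMNS = ("sentence", "span", "sub_span", "label")


def group_labels_by_sentence(rows):
    """Map each sentence to its list of (span, sub_span, label) tuples.

    Assumption: one annotation per row, sentences may repeat across rows.
    """
    grouped = defaultdict(list)
    for row in rows:
        missing = [col for col in EXPECTED_COLUMNS if col not in row]
        if missing:
            raise KeyError(f"row is missing columns: {missing}")
        grouped[row["sentence"]].append(
            (row["span"], row["sub_span"], row["label"])
        )
    return dict(grouped)


# Hypothetical rows, invented for illustration only (not real dataset content).
SAMPLE_CSV = '''sentence,span,sub_span,label
"Plan menus and supervise kitchen staff.",Plan menus,Plan menus,plan menus
"Plan menus and supervise kitchen staff.",supervise kitchen staff,supervise kitchen staff,supervise staff
"Keep the work area clean.",Keep the work area clean,Keep the work area clean,maintain work area cleanliness
'''

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
grouped = group_labels_by_sentence(rows)

for sentence, annotations in grouped.items():
    print(sentence)
    for span, sub_span, label in annotations:
        print(f"  span={span!r} sub_span={sub_span!r} label={label!r}")

# With the Hugging Face `datasets` library installed, real rows could be
# obtained instead via, for example:
#   from datasets import load_dataset
#   rows = load_dataset("jensjorisdecorte/skill-extraction-house", split="test")
```

Grouping by sentence is only one possible view; evaluation code may prefer to keep the flat row form so that each (sentence, span, label) triple is scored independently.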
Standards the Dataset Conforms To
- Conforms to: http://mlcommons.org/croissant/1.0



