growth-cadet/mod-signals-deparment-574test_strip_evaleval
Source: Hugging Face. Updated 2024-05-12; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/growth-cadet/mod-signals-deparment-574test_strip_evaleval
Resource overview:
---
dataset_info:
  features:
  - name: uuid
    dtype: string
  - name: ats_id
    dtype: string
  - name: ats
    dtype: string
  - name: context
    dtype: string
  - name: cleaned_context
    dtype: string
  - name: token_size
    dtype: int64
  - name: __index_level_0__
    dtype: int64
  - name: bs4_text
    dtype: string
  - name: gpt-4-turbo_raw_output
    dtype: string
  - name: gpt-4-turbo_response
    struct:
    - name: deparment
      struct:
      - name: inferred
        dtype: bool
      - name: jobrole_deparment
        dtype: string
    - name: focus_areas
      list:
      - name: description
        dtype: string
      - name: subject
        dtype: string
    - name: industries
      list:
      - name: description
        dtype: string
      - name: subject
        dtype: string
    - name: products_and_technologies
      list:
      - name: description
        dtype: string
      - name: subject
        dtype: string
  - name: gpt-4-turbo_cost
    dtype: float64
  - name: prompt
    dtype: string
  - name: raw_output
    dtype: string
  - name: pass_pydantic
    dtype: int64
  - name: pass_eval_embedd
    dtype: int64
  splits:
  - name: train
    num_bytes: 88347695
    num_examples: 574
  download_size: 33427601
  dataset_size: 88347695
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
The dataset has 15 top-level fields, such as uuid, ats_id, and ats, each with a declared data type. One field, gpt-4-turbo_response, is a nested struct that holds a deparment sub-struct and three lists (focus_areas, industries, products_and_technologies) whose items each carry a subject and a description. All 574 examples are in a single split, train.
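As a minimal sketch of how the split described above might be loaded, the snippet below uses `load_dataset` from the Hugging Face `datasets` library. It assumes that library is installed and that the repository is publicly reachable under the ID shown in the download link. The helper names `load_train_split` and `summarize` are illustrative and not part of the dataset.

```python
# Repository ID taken from the download link on this page.
REPO_ID = "growth-cadet/mod-signals-deparment-574test_strip_evaleval"


def load_train_split(repo_id: str = REPO_ID):
    """Download (or read from the local cache) the single `train` split."""
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, split="train")


def summarize(ds) -> dict:
    """Return the row count and column names of a loaded split."""
    return {"rows": ds.num_rows, "columns": list(ds.column_names)}


if __name__ == "__main__":
    # Requires network access; per the metadata, 574 rows are expected.
    print(summarize(load_train_split()))
```

If the metadata on this page is current, `summarize` should report 574 rows and the 15 column names listed in the schema.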
Provider:
growth-cadet
Summary of original information
Dataset overview
Dataset features
- uuid: string
- ats_id: string
- ats: string
- context: string
- cleaned_context: string
- token_size: integer
- __index_level_0__: integer
- bs4_text: string
- gpt-4-turbo_raw_output: string
- gpt-4-turbo_response: struct with the following fields:
  - deparment:
    - inferred: boolean
    - jobrole_deparment: string
  - focus_areas: list whose items have the following fields:
    - description: string
    - subject: string
  - industries: list whose items have the following fields:
    - description: string
    - subject: string
  - products_and_technologies: list whose items have the following fields:
    - description: string
    - subject: string
- gpt-4-turbo_cost: float
- prompt: string
- raw_output: string
- pass_pydantic: integer
- pass_eval_embedd: integer
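The nested shape of gpt-4-turbo_response can be easier to see as typed records. The following is a sketch only, using standard-library dataclasses. It assumes each list field arrives as a list of dicts with subject and description keys. The class names (`Topic`, `Deparment`, `Response`), the `parse_response` helper, and the sample values are hypothetical and do not come from the dataset.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Topic:
    """One item of focus_areas, industries, or products_and_technologies."""
    subject: str
    description: str


@dataclass
class Deparment:  # spelling follows the field name in the schema
    inferred: bool
    jobrole_deparment: str


@dataclass
class Response:
    deparment: Deparment
    focus_areas: List[Topic]
    industries: List[Topic]
    products_and_technologies: List[Topic]


def _topics(items: List[Dict[str, Any]]) -> List[Topic]:
    return [Topic(subject=i["subject"], description=i["description"]) for i in items]


def parse_response(raw: Dict[str, Any]) -> Response:
    """Convert one gpt-4-turbo_response dict into typed records.

    Raises KeyError if a field named in the schema is missing.
    """
    dep = raw["deparment"]
    return Response(
        deparment=Deparment(
            inferred=bool(dep["inferred"]),
            jobrole_deparment=dep["jobrole_deparment"],
        ),
        focus_areas=_topics(raw["focus_areas"]),
        industries=_topics(raw["industries"]),
        products_and_technologies=_topics(raw["products_and_technologies"]),
    )


# Made-up sample record, for illustration only.
sample = {
    "deparment": {"inferred": True, "jobrole_deparment": "Engineering"},
    "focus_areas": [{"subject": "APIs", "description": "Builds internal APIs"}],
    "industries": [],
    "products_and_technologies": [
        {"subject": "Python", "description": "Primary language"}
    ],
}
parsed = parse_response(sample)
```

Parsing into records like these fails loudly on a missing key, which is one simple way to check that a row matches the declared schema.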
Dataset splits
- train:
  - num_bytes: 88347695
  - num_examples: 574
Dataset size
- download_size: 33427601
- dataset_size: 88347695
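The size figures above are byte counts, so a little arithmetic puts them in perspective. The sketch below derives the average size per example and the ratio of download size to dataset size. It assumes, following the usual Hugging Face convention, that dataset_size is the size of the loaded data and download_size is the size of the files fetched. The function name `size_summary` is illustrative.

```python
NUM_EXAMPLES = 574
DATASET_SIZE = 88_347_695   # bytes, from dataset_size / num_bytes
DOWNLOAD_SIZE = 33_427_601  # bytes, from download_size


def size_summary(
    num_examples: int = NUM_EXAMPLES,
    dataset_size: int = DATASET_SIZE,
    download_size: int = DOWNLOAD_SIZE,
) -> dict:
    """Derive per-example and ratio figures from the raw byte counts."""
    mib = 2 ** 20
    return {
        "avg_bytes_per_example": dataset_size / num_examples,
        "download_ratio": download_size / dataset_size,
        "dataset_mib": dataset_size / mib,
        "download_mib": download_size / mib,
    }


summary = size_summary()
```

With these numbers each example averages roughly 154 KB, and the download is about 38% of the dataset size.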
Configuration
- config_name: default
- data_files:
  - split: train
  - path: data/train-*
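The path entry in the configuration is a glob pattern: every file under data/ whose name starts with train- is assigned to the train split. The sketch below approximates that matching with Python's `fnmatch` module. The Hub resolves patterns with its own logic, so treat this as an illustration of the idea only. The file names in `candidates` are hypothetical examples, not a listing of the repository.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

PATTERN = "data/train-*"  # from the default config's data_files entry


def files_for_split(paths: Iterable[str], pattern: str = PATTERN) -> List[str]:
    """Return the paths that match the split's glob pattern (case-sensitive)."""
    return [p for p in paths if fnmatchcase(p, pattern)]


# Hypothetical file names, for illustration only.
candidates = [
    "data/train-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]
matched = files_for_split(candidates)
```

Only the first candidate matches here, since the other two do not begin with data/train-.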



