growth-cadet/Newmod_signals-deparment_split-newv1v2v3-2k

Name: growth-cadet/Newmod_signals-deparment_split-newv1v2v3-2k
Creator: growth-cadet
Published: 2024-06-16 17:44:26
License: Not specified

Source: Hugging Face. Updated 2024-06-16; indexed 2024-06-29.

Download link:

https://hf-mirror.com/datasets/growth-cadet/Newmod_signals-deparment_split-newv1v2v3-2k

Resource description:

The dataset contains fields such as ats, context, and sys5_obj, each with its own data type and structure. The sys5_obj field has three subfields (focus_areas, industries, and products_and_technologies), each of which holds a description and a subject. The eval_crit and eval_values fields hold evaluation scores and values for those same three categories. The dataset is 71507693 bytes in size and contains 2228 examples; the download size is 29818325 bytes.

Provider:

growth-cadet

Summary of original information

Dataset overview

Dataset features

ats: string
context: string
sys5_obj: struct containing the following lists:
- focus_areas: description and subject, both strings
- industries: description and subject, both strings
- products_and_technologies: description and subject, both strings
eval_crit: struct containing the following fields:
- focus_areas: float
- industries: float
- products_and_technologies: float
eval_values: struct containing the following sequences:
- focus_areas: integer
- industries: integer
- products_and_technologies: integer
uuid: string
mod_sys5_obj: string
gpt-3.5-turbo_cost: float
prompt: string
raw_output: string
deparment_obj: string
gpt-4-turbo_cost: float
sysdep_obj: struct containing the following fields:
- deparment: struct containing inferred (boolean) and jobrole_deparment (string)
- focus_areas: description and subject, both strings
- industries: description and subject, both strings
- products_and_technologies: description and subject, both strings
prompt_dep: string
raw_output_inf_dep: string
mod_sysdep_obj: struct containing the following fields:
- department: struct containing inferred (boolean), team (string), and toplevel_department (string)
- focus_areas: description and subject, both strings
- industries: description and subject, both strings
- products_and_technologies: description and subject, both strings
mod_mod_sysdep_obj_raw: string
department: string
mod_dep_raw: string
mod_answer: string
mod_p&t_mod_answer_raw: string
mod_p&t_mod_answer_full: string
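
The nested part of the feature list above can be illustrated with a small shape check. This is only a sketch: it assumes each sys5_obj list is materialized as a list of dicts with description and subject keys, which depends on how the data is loaded. The helper name check_record and the example values are hypothetical and not taken from the dataset.

```python
from typing import Any, Dict, List

# The three categories shared by sys5_obj, eval_crit, and eval_values.
CATEGORIES = ("focus_areas", "industries", "products_and_technologies")


def check_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of shape problems; an empty list means the record fits."""
    problems: List[str] = []
    sys5 = record.get("sys5_obj") or {}
    crit = record.get("eval_crit") or {}
    values = record.get("eval_values") or {}
    for cat in CATEGORIES:
        # sys5_obj.<cat>: assumed to be a list of {description, subject} dicts.
        items = sys5.get(cat)
        if not isinstance(items, list):
            problems.append(f"sys5_obj.{cat} is not a list")
        else:
            for i, item in enumerate(items):
                for key in ("description", "subject"):
                    if not (isinstance(item, dict) and isinstance(item.get(key), str)):
                        problems.append(f"sys5_obj.{cat}[{i}].{key} is not a string")
        # eval_crit.<cat>: a single float per category.
        if not isinstance(crit.get(cat), float):
            problems.append(f"eval_crit.{cat} is not a float")
        # eval_values.<cat>: a sequence of integers per category.
        seq = values.get(cat)
        if not (isinstance(seq, list) and all(isinstance(v, int) for v in seq)):
            problems.append(f"eval_values.{cat} is not a list of integers")
    return problems


# Hypothetical record, invented purely to exercise the check.
example = {
    "sys5_obj": {c: [{"description": "placeholder", "subject": "placeholder"}]
                 for c in CATEGORIES},
    "eval_crit": {c: 0.5 for c in CATEGORIES},
    "eval_values": {c: [1, 0] for c in CATEGORIES},
}
print(check_record(example))
```

The same pattern could be extended to sysdep_obj and mod_sysdep_obj, whose department structs differ from each other in the fields listed above.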

Dataset splits

train: 2228 examples, 71507693 bytes in total

Dataset size

Download size: 29818325 bytes
Dataset size: 71507693 bytes
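
As a rough reading of these figures, the arithmetic below derives the average size per example and the ratio of download size to dataset size. It uses only the numbers stated on this page; the variable names are illustrative.

```python
# Figures as stated in the listing.
NUM_EXAMPLES = 2228
DATASET_BYTES = 71_507_693
DOWNLOAD_BYTES = 29_818_325

# Average in-memory size of one example, in bytes (roughly 32 KB).
avg_bytes_per_example = DATASET_BYTES / NUM_EXAMPLES

# Fraction of the dataset size that has to be downloaded (a bit over 40%).
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES

print(f"{avg_bytes_per_example:.0f} bytes per example on average")
print(f"download is {download_ratio:.1%} of the dataset size")
```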

Configurations

default: contains the training data files, at path data/train-*
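
A minimal sketch of how this configuration might be used follows. The glob check uses only the standard library. The shard file name in the usage comment is hypothetical. The loader assumes the third-party datasets package is installed and that the repository is reachable over the network, so the import is kept inside the function.

```python
from fnmatch import fnmatch

REPO_ID = "growth-cadet/Newmod_signals-deparment_split-newv1v2v3-2k"
TRAIN_PATTERN = "data/train-*"  # path pattern of the default configuration


def is_train_file(path: str) -> bool:
    """Return True if a repository path matches the train-split pattern."""
    return fnmatch(path, TRAIN_PATTERN)


def load_train_split():
    """Load the train split; assumes the `datasets` package and network access."""
    from datasets import load_dataset  # third-party, imported lazily
    return load_dataset(REPO_ID, split="train")


# Hypothetical shard name, used only to illustrate the pattern match:
# is_train_file("data/train-00000-of-00001.parquet")
```

If the load succeeds, the returned object would be expected to report 2228 rows, matching the train split described above.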
