LangAGI-Lab/Mind2Web-cleaned-lite-value-model-test
Source: Hugging Face. Updated: 2024-09-18. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/LangAGI-Lab/Mind2Web-cleaned-lite-value-model-test
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: action_uid
    dtype: string
  - name: operation
    dtype: string
  - name: pos_candidates
    sequence: string
  - name: neg_candidates
    sequence: string
  - name: website
    dtype: string
  - name: domain
    dtype: string
  - name: subdomain
    dtype: string
  - name: annotation_id
    dtype: string
  - name: confirmed_task
    dtype: string
  - name: action_reprs
    sequence: string
  - name: target_action_index
    dtype: string
  - name: target_action_reprs
    dtype: string
  - name: action
    dtype: string
  - name: original_action_repr
    dtype: string
  - name: original_pos_candidate
    struct:
    - name: attributes
      struct:
      - name: alt
        dtype: string
      - name: aria_description
        dtype: string
      - name: aria_label
        dtype: string
      - name: backend_node_id
        dtype: string
      - name: bounding_box_rect
        dtype: string
      - name: class
        dtype: string
      - name: data_pw_testid_buckeye_candidate
        dtype: string
      - name: id
        dtype: string
      - name: input_checked
        dtype: string
      - name: input_value
        dtype: string
      - name: is_clickable
        dtype: string
      - name: label
        dtype: string
      - name: name
        dtype: string
      - name: placeholder
        dtype: string
      - name: role
        dtype: string
      - name: text_value
        dtype: string
      - name: title
        dtype: string
      - name: type
        dtype: string
      - name: value
        dtype: string
    - name: backend_node_id
      dtype: string
    - name: is_original_target
      dtype: bool
    - name: is_top_level_target
      dtype: bool
    - name: tag
      dtype: string
  - name: match_type
    dtype: string
  - name: cleaned_accessibility_tree
    dtype: string
  - name: previous_actions
    sequence: string
  - name: cleaned_next_accessibility_tree
    dtype: string
  - name: next_state_tao
    dtype: string
  - name: new_items
    dtype: string
  - name: updated_items
    dtype: string
  - name: deleted_items
    dtype: string
  - name: refined_tao
    dtype: string
  - name: raw_prediction
    dtype: string
  - name: rationale
    dtype: string
  - name: next_state_description_with_tao
    dtype: string
  - name: value_score
    dtype: string
  splits:
  - name: train
    num_bytes: 1367938090
    num_examples: 6125
  download_size: 274000478
  dataset_size: 1367938090
---
# Dataset Card for "Mind2Web-cleaned-lite-value-model-test"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
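
The card itself gives no usage instructions, so the following is only a minimal sketch of how the single `train` split might be loaded with the Hugging Face `datasets` library. The repository ID is taken from the download link above. The helper names `load_train` and `parse_value_score` are hypothetical, and the assumption that `value_score` (declared as a string in the metadata) holds a plain decimal number is not confirmed by the card.

```python
from typing import Optional

# Repository ID as it appears in the download link of this page.
REPO_ID = "LangAGI-Lab/Mind2Web-cleaned-lite-value-model-test"


def load_train(streaming: bool = True):
    """Load the `train` split from the Hugging Face Hub.

    Needs the `datasets` package and network access. The import is done
    lazily so the parsing helper below can be used without either.
    With streaming=True, rows are fetched on demand instead of
    downloading the whole dataset first.
    """
    from datasets import load_dataset

    return load_dataset(REPO_ID, split="train", streaming=streaming)


def parse_value_score(raw: Optional[str]) -> Optional[float]:
    """Convert the string-typed `value_score` field to a float.

    ASSUMPTION: the field contains a plain decimal number. If it does
    not, None is returned instead of guessing at another format.
    """
    if raw is None:
        return None
    try:
        return float(raw.strip())
    except ValueError:
        return None
```

A typical use would be to iterate over `load_train()` and call `parse_value_score(row["value_score"])` on each row. The same caution applies to `target_action_index`, which the metadata also declares as a string.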
Configuration:
- Config name: default
  Data files:
  - Split: train
    Path: data/train-*
Dataset information:
Features:
- `action_uid` (unique action identifier): string
- `operation` (operation): string
- `pos_candidates` (positive candidates): sequence of strings
- `neg_candidates` (negative candidates): sequence of strings
- `website` (website): string
- `domain` (domain): string
- `subdomain` (subdomain): string
- `annotation_id` (annotation ID): string
- `confirmed_task` (confirmed task): string
- `action_reprs` (action representations): sequence of strings
- `target_action_index` (target action index): string
- `target_action_reprs` (target action representation): string
- `action` (action): string
- `original_action_repr` (original action representation): string
- `original_pos_candidate` (original positive candidate): struct
  - `attributes` (attributes): struct
    - `alt` (alternative text): string
    - `aria_description` (ARIA, Accessible Rich Internet Applications, description): string
    - `aria_label` (ARIA label): string
    - `backend_node_id` (backend node ID): string
    - `bounding_box_rect` (bounding box rectangle): string
    - `class` (class name): string
    - `data_pw_testid_buckeye_candidate`: string
    - `id` (identifier): string
    - `input_checked` (input checked state): string
    - `input_value` (input value): string
    - `is_clickable` (whether the element is clickable): string
    - `label` (label): string
    - `name` (name): string
    - `placeholder` (placeholder): string
    - `role` (role): string
    - `text_value` (text value): string
    - `title` (title): string
    - `type` (type): string
    - `value` (value): string
  - `backend_node_id` (backend node ID): string
  - `is_original_target` (whether this is the original target): boolean
  - `is_top_level_target` (whether this is the top-level target): boolean
  - `tag` (tag name): string
- `match_type` (match type): string
- `cleaned_accessibility_tree` (cleaned accessibility tree): string
- `previous_actions` (previous actions): sequence of strings
- `cleaned_next_accessibility_tree` (cleaned accessibility tree of the next state): string
- `next_state_tao` (next-state TAO): string
- `new_items` (new items): string
- `updated_items` (updated items): string
- `deleted_items` (deleted items): string
- `refined_tao` (refined TAO): string
- `raw_prediction` (raw prediction): string
- `rationale` (rationale): string
- `next_state_description_with_tao` (next-state description with TAO): string
- `value_score` (value score): string
Splits:
- Name: train
  Size in bytes: 1367938090
  Number of examples: 6125
Download size: 274000478
Total dataset size: 1367938090
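
The size figures listed above imply unusually large rows, which is worth knowing before loading the split into memory. The short calculation below is only back-of-the-envelope arithmetic on those three numbers. Reading `dataset_size` as the uncompressed on-disk size and `download_size` as the compressed download is the usual Hugging Face convention, not something this card states.

```python
# Figures copied from the dataset metadata above.
NUM_BYTES = 1_367_938_090     # num_bytes / dataset_size of the train split
DOWNLOAD_BYTES = 274_000_478  # download_size
NUM_EXAMPLES = 6_125          # num_examples

# Average size of one example, in bytes.
avg_bytes_per_example = NUM_BYTES / NUM_EXAMPLES

# Fraction of the full dataset size that has to be downloaded.
download_ratio = DOWNLOAD_BYTES / NUM_BYTES

print(f"average example size: {avg_bytes_per_example / 1e3:.1f} kB")
print(f"download / dataset size: {download_ratio:.1%}")
```

This works out to roughly 223 kB per example and a download of about 20% of the full dataset size. The large per-row size plausibly comes from the two accessibility-tree strings stored in every example, although the card does not say so.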
Provider:
LangAGI-Lab



