gerbejon/WebClasSeg25-html-nodes-mc-new
Source: Hugging Face. Updated: 2025-02-26. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/gerbejon/WebClasSeg25-html-nodes-mc-new
Description:
This dataset contains annotation information for web page nodes. Its fields include the node label (node_label), the labels of the node's parents (node_parents_labels), the labels of the node's children (node_children_labels), the page ID (page_id), the annotator ID (annotator_id), and the node path (xpath), among others. The dataset is split into a training set and a test set; each split contains 2,080,005 examples and occupies 341,079,832 bytes. The download size of the whole dataset is 71,823,258 bytes, and the unpacked size is 682,159,664 bytes.
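The size figures in the description can be cross-checked with a little arithmetic. The following is a minimal sketch that uses only the numbers quoted above; the variable names are illustrative and are not part of the dataset. It confirms that the two splits add up to the stated unpacked size, and it derives an approximate average record size and compression ratio.

```python
# Figures copied from the description above.
examples_per_split = 2_080_005
bytes_per_split = 341_079_832
download_bytes = 71_823_258
unpacked_bytes = 682_159_664

# Two splits (train and test) of equal size should sum to the unpacked size.
total_bytes = 2 * bytes_per_split

# Average storage per annotated node, and how much the download is compressed.
bytes_per_example = bytes_per_split / examples_per_split
compression_ratio = unpacked_bytes / download_bytes

print(f"total bytes across splits: {total_bytes:,}")
print(f"matches unpacked size: {total_bytes == unpacked_bytes}")
print(f"average bytes per example: {bytes_per_example:.2f}")
print(f"unpacked / download ratio: {compression_ratio:.2f}")
```

The averages are only rough indicators: they assume the per-split byte counts cover the stored examples and nothing else.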
Provided by:
gerbejon



