Shahriar/SoAC_Corpus
Source: Hugging Face. Updated 2025-02-11; indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/Shahriar/SoAC_Corpus
Description:
The dataset contains the text content of websites, coarse-grained and fine-grained sector labels, privacy policy URLs, and website summaries. It is divided into training, validation, and test sets, with 109,476 examples in the training set, 27,370 in the validation set, and 58,649 in the test set.
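As a quick sanity check on the split sizes stated above, the following minimal Python sketch computes the total number of examples and each split's share. The dictionary keys `train`, `validation`, and `test` are labels chosen here for illustration, not confirmed split names. The helper `load_corpus` is a hypothetical convenience wrapper around `datasets.load_dataset`; it assumes the Hugging Face `datasets` library is installed and that the repository ID `Shahriar/SoAC_Corpus` resolves on the Hub.

```python
# Split sizes as stated in the description; the key names are assumed labels.
SPLIT_SIZES = {"train": 109_476, "validation": 27_370, "test": 58_649}


def split_shares(sizes):
    """Return each split's fraction of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def load_corpus(repo_id="Shahriar/SoAC_Corpus"):
    """Hypothetical helper: download the corpus from the Hugging Face Hub.

    Requires network access and the optional `datasets` package.
    """
    from datasets import load_dataset  # imported lazily, optional dependency

    return load_dataset(repo_id)


if __name__ == "__main__":
    print(f"total examples: {sum(SPLIT_SIZES.values())}")
    for name, share in split_shares(SPLIT_SIZES).items():
        print(f"{name}: {share:.1%}")
```

With the stated counts, the three splits sum to 195,495 examples, which corresponds to roughly a 56/14/30 division between training, validation, and test.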
Provider:
Shahriar



