mlfoundations-dev/textclassifier_train_on_pdfs__v2
Source: Hugging Face. Updated: 2025-02-17. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/mlfoundations-dev/textclassifier_train_on_pdfs__v2
Resource description:
The dataset contains web page URLs, PDF file paths, page numbers, and text content. It is intended for text processing tasks, and it provides the file paths and size information needed for the training set.
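As a rough illustration of how records with these kinds of fields might be handled, the sketch below regroups per-page rows into per-PDF documents. The field names (`url`, `pdf_path`, `page_number`, `text`), the `PageRecord` type, and the sample rows are all assumptions made for this example; the listing names only the kinds of fields, not the actual column names, so check the dataset's schema before relying on them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PageRecord:
    # Hypothetical field names; the real column names may differ.
    url: str
    pdf_path: str
    page_number: int
    text: str


def group_pages_by_pdf(records):
    """Return {pdf_path: [text, ...]} with each PDF's pages in page order."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.pdf_path].append(record)
    return {
        path: [r.text for r in sorted(pages, key=lambda r: r.page_number)]
        for path, pages in grouped.items()
    }


# Made-up sample rows for illustration, not taken from the dataset.
sample = [
    PageRecord("https://example.com/a", "pdfs/a.pdf", 2, "second page"),
    PageRecord("https://example.com/a", "pdfs/a.pdf", 1, "first page"),
    PageRecord("https://example.com/b", "pdfs/b.pdf", 1, "only page"),
]
documents = group_pages_by_pdf(sample)
```

Grouping by file path and sorting by page number is one plausible preprocessing step for a text classifier that works on whole documents; a page-level classifier could use the rows directly.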
Provider:
mlfoundations-dev



