tobiashomie/finest_dataset_2025_02_04_18
Source: Hugging Face. Updated 2025-02-04. Indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/tobiashomie/finest_dataset_2025_02_04_18
Description:
The dataset contains text data and associated metadata. Each record stores its text in the text field and carries a unique id. The metadata includes the date, file path, language, language confidence score, token count, and URL. The dataset has a single training split of 14,842 samples, about 57 MB in total.
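The record layout described above can be sketched as a small Python type with a row validator. This is a minimal illustration, not the dataset's published schema: only text and id are named in the description, so the other field names (date, file_path, language, language_score, token_count, url) and all of the types are assumptions, and the sample row is made up.

```python
from dataclasses import dataclass, fields


@dataclass
class Record:
    """One dataset row. Only `text` and `id` are named in the description;
    every other field name and all types here are assumptions."""
    id: str
    text: str
    date: str              # assumed name and type
    file_path: str         # assumed name
    language: str          # assumed name
    language_score: float  # assumed name for the language confidence
    token_count: int       # assumed name
    url: str               # assumed name


def missing_fields(row):
    """Return the expected field names that are absent from a row dict."""
    return [f.name for f in fields(Record) if f.name not in row]


def parse_row(row):
    """Build a Record from a row dict, rejecting rows with missing fields."""
    missing = missing_fields(row)
    if missing:
        raise ValueError(f"row is missing fields: {missing}")
    return Record(**{f.name: row[f.name] for f in fields(Record)})


# Made-up sample row, for illustration only (not taken from the dataset).
sample = {
    "id": "example-0001",
    "text": "Hello world",
    "date": "2025-02-04",
    "file_path": "example/path.txt",
    "language": "en",
    "language_score": 0.98,
    "token_count": 2,
    "url": "https://example.com/page",
}

record = parse_row(sample)
print(record.id, record.language, record.token_count)
```

To apply this to real rows, one option is the Hugging Face datasets library, for example load_dataset("tobiashomie/finest_dataset_2025_02_04_18", split="train"). Check the actual column names reported by the loaded dataset before relying on the assumed ones above.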
Provider:
tobiashomie



