liu-nlp/temp_german_data_subset
Source: Hugging Face. Updated 2025-08-24, indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/liu-nlp/temp_german_data_subset
Description:
The dataset has two fields: text (a string) and id (a unique identifier). It has a single training split (train) of 60 million samples, about 203.49 GB in total; the download size is about 128.19 GB. The README gives no further description of the contents, so the type and source of the text cannot be determined.
Provider:
liu-nlp
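
Given the sizes listed above, it may be more practical to stream the train split than to download it in full. The following is a minimal sketch, assuming the Hugging Face `datasets` package is installed and the repository is publicly readable (the listing confirms neither). The helper names `stream_train_split` and `preview` are hypothetical and introduced only for illustration; the field names text and id come from the description.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List

# Repository name as given in the listing.
REPO_ID = "liu-nlp/temp_german_data_subset"


def stream_train_split() -> Iterable[Dict[str, Any]]:
    """Open the train split lazily rather than downloading it up front.

    Assumption: `datasets` is installed and the repo needs no access token.
    """
    # Imported here so that preview() stays usable without the package.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split="train", streaming=True)


def preview(
    records: Iterable[Dict[str, Any]], n: int = 3, width: int = 80
) -> List[Dict[str, Any]]:
    """Return the first n records with the text field cut to `width` characters."""
    out = []
    for rec in islice(records, n):
        out.append({"id": rec["id"], "text": str(rec["text"])[:width]})
    return out
```

Calling `preview(stream_train_split())` would fetch only the first few records, which is one way to inspect what the text field actually contains, since the README does not say.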



