nyuuzyou/edutexts
Source: Hugging Face | Updated: 2024-12-12 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/nyuuzyou/edutexts
Description:
This dataset contains text extracted from Russian and Ukrainian educational presentations and documents covering a range of academic subjects. It is primarily in Russian (ru), with some Ukrainian (uk) content. The dataset has two splits: presentations (100,460 instances), each with the presentation title and the text of every slide, and documents (29,121 instances), each with the document title and its main body text. It is released under the Creative Commons Zero (CC0) license, so it can be freely used, modified, and distributed without attribution.
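The two record shapes described above can be sketched as follows. This is a minimal illustration only: the class names and the field names `title`, `slides`, and `text` are assumptions made for this example and are not confirmed by the listing, so check the dataset card for the actual schema before relying on them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Presentation:
    """One record of the presentations split (field names are hypothetical)."""
    title: str
    slides: List[str]  # assumed: one text string per slide


@dataclass
class Document:
    """One record of the documents split (field names are hypothetical)."""
    title: str
    text: str  # assumed: main body text of the document


def presentation_to_text(p: Presentation) -> str:
    """Join the title and all non-empty slide texts into one string."""
    parts = [p.title.strip()] + [s.strip() for s in p.slides if s.strip()]
    return "\n\n".join(parts)


def document_to_text(d: Document) -> str:
    """Join the title and body text into one string."""
    return f"{d.title.strip()}\n\n{d.text.strip()}"


if __name__ == "__main__":
    # Made-up sample record, not taken from the dataset.
    sample = Presentation(title="Фотосинтез", slides=["Свет", "", "Хлорофилл"])
    print(presentation_to_text(sample))
```

Flattening both record types to plain strings gives the two splits a common form, which may be convenient when they are mixed into a single text corpus.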
Provided by:
nyuuzyou



