nyuuzyou/edutexts

Source: Hugging Face. Updated 2024-12-12; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/nyuuzyou/edutexts
Description:
This dataset contains text extracted from educational presentations and documents in Russian and Ukrainian, covering a range of academic subjects. It is primarily in Russian (ru), with some Ukrainian (uk) content. The dataset has two splits: presentations (100,460 instances) and documents (29,121 instances). Each presentation record holds the presentation's title and the text content of each slide; each document record holds the document's title and its main content text. The dataset is released under the Creative Commons Zero (CC0) license, which allows users to freely use, modify, and distribute it without attribution.
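
The record layout described above can be illustrated with a minimal sketch that flattens one record of each kind into plain text. The field names used here (`title`, `slides`, `text`) are assumptions made for illustration; this listing does not give the actual schema, so check the dataset card before relying on them.

```python
# Hypothetical field names: "title", "slides", and "text" are assumed here,
# not taken from the dataset's published schema.

def _join_nonempty(parts):
    """Join the non-blank strings in parts, separated by blank lines."""
    return "\n\n".join(p.strip() for p in parts if p and p.strip())


def presentation_to_text(record: dict) -> str:
    """Combine a presentation's title with the text of each slide."""
    parts = [record.get("title", "")]
    parts.extend(slide.get("text", "") for slide in record.get("slides", []))
    return _join_nonempty(parts)


def document_to_text(record: dict) -> str:
    """Combine a document's title with its main content text."""
    return _join_nonempty([record.get("title", ""), record.get("text", "")])


if __name__ == "__main__":
    # Made-up sample records, not real entries from the dataset.
    sample_presentation = {
        "title": "Физика",
        "slides": [{"text": "Слайд 1"}, {"text": "  "}, {"text": "Слайд 2"}],
    }
    sample_document = {"title": "Тема", "text": "Основний текст"}
    print(presentation_to_text(sample_presentation))
    print(document_to_text(sample_document))
```

Blank slides are skipped so that empty pages do not leave stray separators in the output. If the real records nest slide text differently, only the two accessor functions would need to change.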
Provider:
nyuuzyou