
Elgold intermediate: raw texts

Source: DataCite Commons. Updated: 2026-04-30. Indexed: 2024-07-13.
Download link:
https://mostwiedzy.pl/en/open-research-data/elgold-intermediate-raw-texts,628102859659161-0
Description:
The dataset contains raw texts scraped from various internet sources, which were used to create the Elgold dataset. The texts were collected from 7 main categories: "News", "Job offers", "Movie reviews", "Automotive blogs", "Amazon product reviews", "Scientific papers abstracts", and "Historic blogs". The "Scientific papers abstracts" category was additionally divided into five subcategories: "Biomedicine", "Life Sciences", "Mathematics", "Medicine & Public Health", and "Science, Humanities and Social Sciences, multidisciplinary". The raw texts were collected from publicly available Internet sources by a group of 14 participants, with two or three participants assigned to each category. The dataset consists of approximately 100 texts for each category (and for each subcategory in the case of "Scientific papers abstracts").
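The category layout described above can be written down as a small mapping, which also gives a rough size estimate. This is only a sketch: the names `CATEGORIES`, `TEXTS_PER_GROUP`, and `text_groups` are hypothetical, the figure of 100 texts per group is the approximate value quoted in the description, and the assumption that "Scientific papers abstracts" holds texts only through its five subcategories is one reading of that description, not a confirmed property of the files.

```python
# Category layout taken from the dataset description.
# An empty list means the category has no subcategories.
CATEGORIES = {
    "News": [],
    "Job offers": [],
    "Movie reviews": [],
    "Automotive blogs": [],
    "Amazon product reviews": [],
    "Scientific papers abstracts": [
        "Biomedicine",
        "Life Sciences",
        "Mathematics",
        "Medicine & Public Health",
        "Science, Humanities and Social Sciences, multidisciplinary",
    ],
    "Historic blogs": [],
}

# "Approximately 100" per the description; an assumption, not an exact count.
TEXTS_PER_GROUP = 100


def text_groups(categories):
    """Return one label per group that holds texts.

    A category without subcategories counts once; a category with
    subcategories contributes one group per subcategory.
    """
    groups = []
    for category, subcategories in categories.items():
        if subcategories:
            groups.extend(f"{category} / {sub}" for sub in subcategories)
        else:
            groups.append(category)
    return groups


groups = text_groups(CATEGORIES)
estimated_total = len(groups) * TEXTS_PER_GROUP
print(len(groups), estimated_total)
```

Under this reading there are six categories without subcategories plus five subcategories, so the estimate is an order-of-magnitude figure for the whole collection rather than a count taken from the published files.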
Provider:
Gdańsk University of Technology
Created:
2024-06-28