
Mannheimer Webpanel

Source: DataCite Commons. Updated 2024-06-24. Indexed 2024-07-13.
Download link:
https://kooperationen.zew.de/zew-fdz/datenangebot/mannheimer-webpanel
Description:
The ZEW-FDZ offers a novel panel of semi-structured webpage data at the company level: the Mannheimer Webpanel. It comprises textual webpage data retrieved from a broad range of German firm websites. A detailed description of the web-scraping methods used to harvest the data, together with an examination of the dataset (a corpus of German corporate websites), can be found in the accompanying discussion paper. The dataset provides, among others, the following variables:
- ID – unique company identifier.
- dl_rank – a company website usually consists of several individual webpages; dl_rank is the chronological order in which they were downloaded. The main page of a website has rank 0, the first subpage processed after the main page has rank 1, and so on.
- dl_slot – the domain name of the website.
- title – the title of the company website as indicated in the website's metadata.
- keywords – the list of keywords of the company website as indicated in the website's metadata.
- description – the description of the company website as indicated in the website's metadata.
- text – the text content that was downloaded from the webpage.
- timestamp – the exact time when the webpage was downloaded.
- url – the URL of the webpage.
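
The variable list above can be read as a per-webpage record schema. The following is a minimal Python sketch of that reading, not the ZEW-FDZ's own tooling. The field names come from the description; the field types, the timestamp format, the helper names `pages_by_company` and `main_pages`, and the sample rows are all assumptions made for illustration, since the source does not specify the delivery format of the data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WebpageRecord:
    """One downloaded webpage; field names follow the variable list above."""
    ID: str           # unique company identifier (string type is an assumption)
    dl_rank: int      # download order within the website; 0 = main page
    dl_slot: str      # domain name of the website
    title: str        # title from the website's metadata
    keywords: str     # keywords from the website's metadata
    description: str  # description from the website's metadata
    text: str         # downloaded text content
    timestamp: str    # download time; exact format is not given in the source
    url: str          # URL of the webpage


def pages_by_company(records):
    """Group records by company ID, each group ordered by dl_rank."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec.ID].append(rec)
    return {cid: sorted(pages, key=lambda r: r.dl_rank)
            for cid, pages in grouped.items()}


def main_pages(records):
    """Keep only main pages, i.e. records with dl_rank == 0."""
    return [rec for rec in records if rec.dl_rank == 0]


# Made-up sample rows for illustration only; not taken from the dataset.
sample = [
    WebpageRecord("c1", 1, "example.de", "About", "", "", "About us",
                  "2021-01-01T10:00:05", "https://example.de/about"),
    WebpageRecord("c1", 0, "example.de", "Home", "", "", "Welcome",
                  "2021-01-01T10:00:00", "https://example.de/"),
    WebpageRecord("c2", 0, "example.org", "Start", "", "", "Hello",
                  "2021-01-02T09:00:00", "https://example.org/"),
]

grouped = pages_by_company(sample)
print([page.url for page in grouped["c1"]])
print([page.ID for page in main_pages(sample)])
```

Sorting each company's pages by dl_rank reproduces the crawl order, and filtering on rank 0 gives one main page per website, which is a convenient starting point for firm-level text analysis.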
Provider:
ZEW – Leibniz Centre for European Economic Research
Created:
2021-12-16