
A dataset of late 1990s and early 2000s web banner ads on Chinese- and English-language web pages

Indexed from NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/8408538
Resource description:
This dataset contains information about 22,915 unique banner ad images appearing on Chinese- and English-language web pages in the late 1990s and early 2000s. The dataset is mined from 1,384,355 archived web page snapshots downloaded from the Wayback Machine, representing 77,747 unique HTTP URLs. The URLs were collected from six printed Internet directory books published in mainland China and the United States between 1999 and 2001, as part of a larger research project on Chinese-language web archiving. For each banner ad image, the dataset provides standard image metadata such as file format and dimensions. The dataset also provides the original URLs of the web pages where the banner ad image was found, timestamps of the archived web page snapshots containing the image, archived URLs of the image file, and, if available, archived URLs of the web pages to which the ad image links. Additionally, the dataset provides text data obtained from the banner ad images using optical character recognition (OCR). We expect the dataset to be useful for researchers across a variety of disciplines and fields such as visual culture, history, media studies, and business and marketing.
Created:
2023-11-24