
BuzzCity mobile advertisement dataset

Source: Figshare. Updated 2014-01-01; indexed 2026-04-29.
Download link:
https://figshare.com/articles/dataset/BuzzCity_mobile_advertisement_dataset/12062703
Description:
This competition involves advertisement data provided by BuzzCity Pte. Ltd. BuzzCity is a global mobile advertising network that reaches millions of consumers around the world on mobile phones and devices. In Q1 2012, over 45 billion ad banners were delivered across the BuzzCity network, which consists of more than 10,000 publisher sites reaching an average of over 300 million unique users per month. The number of smartphones active on the network has also grown significantly: smartphones now account for more than 32% of the phones that are served advertisements across the BuzzCity network.

The "raw" data used in this competition comes in two types, a publisher database and a click database, both provided in CSV format.

The publisher database records each publisher's (aka partner's) profile and comprises several fields:

publisherid - Unique identifier of a publisher.
Bankaccount - Bank account associated with a publisher (may be empty).
address - Mailing address of a publisher (obfuscated; may be empty).
status - Label of a publisher, which can be one of the following:
    "OK" - Publishers whom BuzzCity deems as having healthy traffic (or those who slipped past its detection mechanisms).
    "Observation" - Publishers who may have just started their traffic or whose traffic statistics deviate from the system-wide average. BuzzCity has not yet taken a conclusive stand on these publishers.
    "Fraud" - Publishers who are deemed fraudulent with clear proof. BuzzCity suspends their accounts and their earnings will not be paid.

The click database, on the other hand, records the click traffic and has several fields:

id - Unique identifier of a particular click.
numericip - Public IP address of a clicker/visitor.
deviceua - Phone model used by a clicker/visitor.
publisherid - Unique identifier of a publisher.
adscampaignid - Unique identifier of a given advertisement campaign.
usercountry - Country in which the surfer is located.
clicktime - Timestamp of a given click (in YYYY-MM-DD format).
publisherchannel - Publisher's channel type, which can be one of the following:
    ad - Adult sites
    co - Community
    es - Entertainment and lifestyle
    gd - Glamour and dating
    in - Information
    mc - Mobile content
    pp - Premium portal
    se - Search, portal, services
referredurl - URL where the ad banners were clicked (obfuscated; may be empty). More details about the HTTP Referer protocol can be found in this article.

Related Publication:
R. J. Oentaryo, E.-P. Lim, M. Finegold, D. Lo, F.-D. Zhu, C. Phua, E.-Y. Cheu, G.-E. Yap, K. Sim, M. N. Nguyen, K. Perera, B. Neupane, M. Faisal, Z.-Y. Aung, W. L. Woon, W. Chen, D. Patel, and D. Berrar. (2014). Detecting click fraud in online advertising: A data mining approach. Journal of Machine Learning Research, 15, 99-140.
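The two databases share the publisherid field, so click records can be aggregated per publisher and matched against the publisher's status label. The following is a minimal sketch of that join using only the Python standard library. It assumes the CSV header names match the field names listed above exactly, which may not hold for the actual files. The helper names (load_status, summarize_clicks) are hypothetical, and the sample rows are made up for illustration rather than taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Channel codes as listed in the dataset description.
CHANNELS = {
    "ad": "Adult sites",
    "co": "Community",
    "es": "Entertainment and lifestyle",
    "gd": "Glamour and dating",
    "in": "Information",
    "mc": "Mobile content",
    "pp": "Premium portal",
    "se": "Search, portal, services",
}


def load_status(publisher_csv):
    """Map publisherid -> status ("OK", "Observation" or "Fraud")."""
    return {row["publisherid"]: row["status"] for row in csv.DictReader(publisher_csv)}


def summarize_clicks(click_csv, status_by_publisher):
    """Count clicks and distinct IPs per publisher and attach the status label."""
    clicks = defaultdict(int)
    ips = defaultdict(set)
    channels = defaultdict(set)
    for row in csv.DictReader(click_csv):
        pid = row["publisherid"]
        clicks[pid] += 1
        ips[pid].add(row["numericip"])
        channels[pid].add(row["publisherchannel"])
    return {
        pid: {
            "clicks": clicks[pid],
            "distinct_ips": len(ips[pid]),
            # Unknown channel codes are passed through unchanged.
            "channels": sorted(CHANNELS.get(c, c) for c in channels[pid]),
            # Publishers missing from the publisher database get "unknown".
            "status": status_by_publisher.get(pid, "unknown"),
        }
        for pid in clicks
    }


# Made-up rows for illustration only; they are NOT taken from the dataset.
PUBLISHERS = io.StringIO(
    "publisherid,Bankaccount,address,status\n"
    "p1,,,OK\n"
    "p2,,,Fraud\n"
)
CLICKS = io.StringIO(
    "id,numericip,deviceua,publisherid,adscampaignid,usercountry,"
    "clicktime,publisherchannel,referredurl\n"
    "1,1001,PhoneA,p1,c1,sg,2012-02-09,es,\n"
    "2,1002,PhoneB,p1,c1,sg,2012-02-09,es,\n"
    "3,2001,PhoneC,p2,c2,id,2012-02-09,mc,\n"
    "4,2001,PhoneC,p2,c2,id,2012-02-09,mc,\n"
    "5,2001,PhoneC,p2,c2,id,2012-02-09,mc,\n"
)

summary = summarize_clicks(CLICKS, load_status(PUBLISHERS))
for pid, stats in sorted(summary.items()):
    print(pid, stats)
```

To run this against the real files, the io.StringIO objects would be replaced with opened file handles. The ratio of clicks to distinct IPs computed here is only one simple per-publisher feature chosen for illustration; it is not presented as the method used in the related publication.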

Created:
2014-01-01