
BuzzCity mobile advertisement dataset

Source: Mendeley Data. Updated 2024-01-31; indexed 2024-06-28.
Download link:
https://researchdata.smu.edu.sg/articles/dataset/BuzzCity_mobile_advertisement_dataset/12062703
Description:
This competition involves advertisement data provided by BuzzCity Pte. Ltd. BuzzCity is a global mobile advertising network that reaches millions of consumers around the world on mobile phones and devices. In Q1 2012, over 45 billion ad banners were delivered across the BuzzCity network, which consists of more than 10,000 publisher sites and reaches an average of over 300 million unique users per month. The number of smartphones active on the network has also grown significantly: smartphones now account for more than 32% of the phones that are served advertisements across the BuzzCity network.

The "raw" data used in this competition comes in two types, a publisher database and a click database, both provided in CSV format.

The publisher database records each publisher's (aka partner's) profile and comprises the following fields:
- publisherid: Unique identifier of a publisher.
- Bankaccount: Bank account associated with a publisher (may be empty).
- address: Mailing address of a publisher (obfuscated; may be empty).
- status: Label of a publisher, which can be one of the following:
  - "OK": Publishers whom BuzzCity deems as having healthy traffic (or those who slipped past its detection mechanisms).
  - "Observation": Publishers who may have just started their traffic, or whose traffic statistics deviate from the system-wide average. BuzzCity has not yet taken a conclusive stand on these publishers.
  - "Fraud": Publishers who are deemed fraudulent with clear proof. BuzzCity suspends their accounts and their earnings will not be paid.

The click database records the click traffic and has the following fields:
- id: Unique identifier of a particular click.
- numericip: Public IP address of a clicker/visitor.
- deviceua: Phone model used by a clicker/visitor.
- publisherid: Unique identifier of a publisher.
- adscampaignid: Unique identifier of a given advertisement campaign.
- usercountry: Country from which the surfer is browsing.
- clicktime: Timestamp of a given click (in YYYY-MM-DD format).
- publisherchannel: Publisher's channel type, which can be one of the following:
  - ad: Adult sites
  - co: Community
  - es: Entertainment and lifestyle
  - gd: Glamour and dating
  - in: Information
  - mc: Mobile content
  - pp: Premium portal
  - se: Search, portal, services
- referredurl: URL where the ad banners were clicked (obfuscated; may be empty). More details about the HTTP Referer protocol can be found in this article.

Related Publication:
R. J. Oentaryo, E.-P. Lim, M. Finegold, D. Lo, F.-D. Zhu, C. Phua, E.-Y. Cheu, G.-E. Yap, K. Sim, M. N. Nguyen, K. Perera, B. Neupane, M. Faisal, Z.-Y. Aung, W. L. Woon, W. Chen, D. Patel, and D. Berrar. (2014). Detecting click fraud in online advertising: A data mining approach. Journal of Machine Learning Research, 15, 99-140.
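
The two databases described above are linked through publisherid, so a common first step is to aggregate clicks per publisher and attach the status label. The following is a minimal sketch of that join using only the Python standard library. The sample rows, the header rows, the helper names (`load_rows`, `summarize`), and the chosen aggregates (click count and distinct IP count) are assumptions made for illustration; they are not taken from the actual files, whose headers may differ from the field names listed in the description.

```python
import csv
import io
from collections import defaultdict

# Channel codes as listed in the dataset description.
CHANNELS = {
    "ad": "Adult sites",
    "co": "Community",
    "es": "Entertainment and lifestyle",
    "gd": "Glamour and dating",
    "in": "Information",
    "mc": "Mobile content",
    "pp": "Premium portal",
    "se": "Search, portal, services",
}

# Hypothetical sample rows, invented for illustration only.
# The header names follow the description and are an assumption.
PUBLISHERS_CSV = """publisherid,Bankaccount,address,status
p1,,,OK
p2,b9,a7,Fraud
"""

CLICKS_CSV = """id,numericip,deviceua,publisherid,adscampaignid,usercountry,clicktime,publisherchannel,referredurl
1,100,Nokia_X,p1,c1,sg,2012-02-09,es,
2,101,Generic,p2,c2,id,2012-02-09,mc,u1
3,101,Generic,p2,c2,id,2012-02-09,mc,u1
4,101,Generic,p2,c3,id,2012-02-10,mc,
"""


def load_rows(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def summarize(publishers, clicks):
    """Count clicks and distinct IPs per publisher, joined with status."""
    status = {p["publisherid"]: p["status"] for p in publishers}
    stats = defaultdict(lambda: {"clicks": 0, "ips": set()})
    for click in clicks:
        entry = stats[click["publisherid"]]
        entry["clicks"] += 1
        entry["ips"].add(click["numericip"])
    return {
        pid: {
            "status": status.get(pid, "unknown"),  # publisher missing from profile file
            "clicks": entry["clicks"],
            "unique_ips": len(entry["ips"]),
        }
        for pid, entry in stats.items()
    }


summary = summarize(load_rows(PUBLISHERS_CSV), load_rows(CLICKS_CSV))
for pid, row in sorted(summary.items()):
    print(pid, row)
```

For the real files, `io.StringIO(text)` would be replaced by an open file handle. Many clicks from few distinct IPs is only one illustrative signal; it is not presented here as the detection method used by BuzzCity or by the related publication.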

Created:
2024-01-31