
Annotated Privacy Policies of 100 Online Platforms

Mendeley Data. Updated 2024-03-27; indexed 2024-06-26
Download link:
https://data.mendeley.com/datasets/pcgvm6zh43
Description:
The dataset contains information derived from 98 annotated privacy policies of 100 online platforms.* The hypothesis behind the study was that privacy policies do not contain enough information for consumers to fully understand exactly what personal data the platforms collect and exactly how it is used. To test this hypothesis, two annotators, working independently, read the privacy policies in search of three types of occurrences: (1) general terms describing the categories of data collected ("GenData"); (2) general terms describing the purposes for which personal data is used ("GenUse"); (3) the no-distinction structure of a privacy policy, where the document first lists the categories of data collected and then enumerates the purposes of use, without explaining which personal data is used for which purpose.

The hypothesis was confirmed. In the analyzed sample, all 98 privacy policies featured at least one instance of GenData, 97 out of 98 featured at least one instance of GenUse, and 89 out of 98 had a no-distinction structure.

The sample contains 98 privacy policies of 100* digital platforms operating in sixteen market sectors: Cloud storage, Communication, Dating, Finance, Food, Gaming, Health, Music, Shopping, Social, Sports, Transportation, Travel, Video, Work and Various. The selected companies' headquarters span four legal settings: the US, the EU, Poland specifically, and Other jurisdictions. The chosen platforms include both privately held and publicly listed companies, and offer both fee-based and free services.
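The counts reported in the description (98 of 98, 97 of 98, and 89 of 98) can be restated as shares of the sample. The snippet below is only an illustrative sketch: the counts are taken from the description, while the variable and function names are hypothetical and are not part of the dataset.

```python
# Counts as reported in the dataset description (98 annotated policies).
TOTAL_POLICIES = 98
reported_counts = {
    "GenData": 98,          # "all the privacy policies"
    "GenUse": 97,           # "97 out of 98"
    "no-distinction": 89,   # "89 out of 98"
}


def share(count, total=TOTAL_POLICIES):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Hypothetical helper mapping: label -> percentage of policies.
shares = {label: share(n) for label, n in reported_counts.items()}

for label, pct in shares.items():
    print(f"{label}: {reported_counts[label]}/{TOTAL_POLICIES} = {pct}%")
```

These percentages describe the 98 policy documents, not the 100 platforms, since two pairs of platforms shared a policy.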
The dataset consists of: (a) two spreadsheets, "PP_table Tagger1.xlsx" and "PP_table Tagger2.xlsx," each containing the evaluative variables ascribed and examples of the clauses on which the judgments were based; (b) two folders, "Tagger 1" and "Tagger 2," each containing 98 pdf files with the privacy policies analyzed, together with annotations made in the form of comments; (c) one text file, "Instruction," explaining the logic behind tagging.

The reuse potential of the data is significant. It can be useful for empirical researchers interested in the dynamics of data collection processes of online platforms and for normative scholars (like lawyers or political philosophers) interested in critiquing the status quo and proposing ideas for reforms. It can also be useful for non-academics, like governments interested in assessing the efficacy of their regulations, or businesses interested in avoiding the common pitfalls of privacy policy drafting.

*Apple and iCloud, as well as Google and YouTube, had the same privacy policy on the day of raw data collection, i.e. March 13, 2022.

ACKNOWLEDGEMENT: The research leading to these results has received funding from the Norwegian Financial Mechanism 2014-2021, project no. 2020/37/K/HS5/02769, titled “Private Law of Data: Concepts, Practices, Principles & Politics.”

Created:
2024-01-23