SherLock
Source: www.kaggle.com | Updated: 2019-11-18 | Indexed: 2025-01-15
Download link:
https://www.kaggle.com/BGU-CSRC/sherlock
Resource description:
# What is the SherLock dataset?
SherLock is a long-term smartphone sensor dataset with a high temporal resolution. The dataset also offers explicit labels capturing the activity of malware running on the devices. The dataset currently contains 10 billion data records from 30 users collected over a period of 2 years and an additional 20 users for 10 months (totaling 50 active users currently participating in the experiment).
The primary purpose of the dataset is to help security professionals and academic researchers develop innovative methods of implicitly detecting malicious behavior in smartphones, specifically from data obtainable without superuser (root) privileges. However, this dataset can also be used for research in domains that are not strictly security related, for example context-aware recommender systems, event prediction, user personalization and awareness, location prediction, and more. The dataset also offers opportunities that aren't available in other datasets. For example, it contains the SSID and signal strength of the connected WiFi access point (AP), sampled once every second over the course of many months.
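To give a sense of scale for the once-per-second WiFi sampling mentioned above, the following minimal Python sketch computes an upper bound on the number of samples a single device would produce over a given period. The function name `expected_samples` is hypothetical, and the calculation assumes uninterrupted sampling; real counts will be lower whenever a device is off or not connected to an access point.

```python
from datetime import timedelta


def expected_samples(period: timedelta,
                     interval: timedelta = timedelta(seconds=1)) -> int:
    """Upper bound on samples taken every `interval` during `period`.

    Assumes sampling never pauses, which real devices will not satisfy.
    """
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    # Dividing one timedelta by another with // yields an integer count.
    return period // interval


per_day = expected_samples(timedelta(days=1))
per_30_days = expected_samples(timedelta(days=30))
print(per_day)      # 86400
print(per_30_days)  # 2592000
```

At this rate a single sensor stream can reach millions of rows per device per month, so reading the data in chunks or downsampling is worth considering before loading a full trace into memory.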
To gain full free access to the SherLock Dataset, follow these two steps:
1) Read, complete and sign the license agreement. The general restrictions are:
- The license lasts for 3 years, after which the data must be deleted.
- Do not share the data with those who are not bound by the license agreement.
- Do not attempt to de-anonymize the individuals (volunteers) who have contributed the data.
- Any of your publications that benefit from the SherLock project must cite the following article:
*Mirsky, Yisroel, et al. "SherLock vs Moriarty: A Smartphone Dataset for Cybersecurity Research." Proceedings of the 2016 ACM Workshop on Artificial Intelligence and Security. ACM, 2016.*
2) Send the scanned document as a PDF to bgu.sherlock@gmail.com and provide a Gmail account with which a Google Drive folder can be shared.
More information can be found [here][1], or in [this][2] publication ([download link][3]).
**A two-week data sample from a single user is provided on this Kaggle page. To access the full dataset for free, please [visit our site][1].** Note: The format of the sample dataset may differ from the full dataset.
[1]: http://bigdata.ise.bgu.ac.il/sherlock/
[2]: http://dl.acm.org/citation.cfm?id=2996764
[3]: https://drive.google.com/file/d/0B_A1qX1kf7R9OVhhVk5wNjkydkU/view?usp=sharing
Provider:
Kaggle
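Summary
Dataset introduction

Background and challenges
Background overview
SherLock is a long-term smartphone sensor dataset containing 10 billion high-resolution data records from 50 users, collected over up to 2 years, with explicit labels for malware activity. It is intended mainly for developing methods that detect malicious behavior without root privileges, and it also suits other research areas such as context-aware recommender systems and user personalization.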