File Fragment Type (FFT) - 75 Dataset
Source: IEEE. Updated: 2019-08-07. Indexed: 2026-04-17.
Download link:
https://ieee-dataport.org/open-access/file-fragment-type-fft-75-dataset
Description:
The FFT-75 dataset contains randomly sampled, potentially overlapping file fragments from 75 popular file types (see details below). To the best of our knowledge, it is the most diverse and balanced dataset of its kind available. The dataset is labeled with class IDs and is ready for training supervised machine learning models. We distinguish six scenarios of differing granularity and provide variants with 512-byte and 4096-byte blocks. In each case, we sampled a balanced dataset and split the data as follows: 80% for training, 10% for testing, and 10% for validation.
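The sampling and splitting described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names (`sample_fragments`, `split_80_10_10`, `build_dataset`), the synthetic input bytes, and the per-class sample count are all assumptions made for illustration. It draws fixed-size fragments at random offsets (so fragments may overlap), takes the same number of fragments per class to keep the set balanced, and splits each class 80/10/10.

```python
import random


def sample_fragments(data, block_size, count, rng):
    """Draw `count` fragments of `block_size` bytes at random offsets.

    Offsets are chosen independently, so fragments may overlap.
    """
    if len(data) < block_size:
        return []
    max_offset = len(data) - block_size
    offsets = [rng.randint(0, max_offset) for _ in range(count)]
    return [data[o:o + block_size] for o in offsets]


def split_80_10_10(items, rng):
    """Shuffle a copy of `items` and split it into train/test/validation."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_test = len(shuffled) // 10
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]
    return train, test, val


def build_dataset(files_by_class, block_size, per_class, rng):
    """Sample the same number of labeled fragments per class, then split."""
    train, test, val = [], [], []
    for class_id, data in files_by_class.items():
        fragments = sample_fragments(data, block_size, per_class, rng)
        labeled = [(frag, class_id) for frag in fragments]
        tr, te, va = split_80_10_10(labeled, rng)
        train += tr
        test += te
        val += va
    return train, test, val


# Synthetic stand-ins for two file types (hypothetical class IDs 0 and 1).
rng = random.Random(0)
files_by_class = {
    0: bytes(rng.getrandbits(8) for _ in range(10_000)),
    1: bytes(rng.getrandbits(8) for _ in range(10_000)),
}
train, test, val = build_dataset(files_by_class, block_size=512, per_class=100, rng=rng)
print(len(train), len(test), len(val))
```

Splitting per class, rather than over the pooled samples, keeps every split balanced across classes. Passing `block_size=4096` would correspond to the larger block variant.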
Provider:
New York University
Created:
2019-08-07



