Abhilashvj/CIRCL_website_subset
Source: Hugging Face. Updated: 2023-05-28. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/Abhilashvj/CIRCL_website_subset
Resource description:
---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: label
    dtype:
      class_label:
        names:
          '0': forum
          '1': general
          '2': marketplace
  splits:
  - name: train
    num_bytes: 2109417862.525
    num_examples: 3005
  - name: test
    num_bytes: 59369011.0
    num_examples: 81
  download_size: 1946901450
  dataset_size: 2168786873.525
---
# Dataset Card for CIRCL_website_subset
## Dataset Description
- **Homepage:** https://www.circl.lu/opendata/datasets/circl-ail-dataset-01/
- **Repository:**
- **Paper:**
- **Leaderboard:**
- **Point of Contact:** @Electronic{CIRCL-AILDS2019, author = {Vincent Falconieri}, month = {07}, year = {2019}, title = {CIRCL Images AIL Dataset}, organization = {CIRCL}, address = {CIRCL - Computer Incident Response Center Luxembourg c/o "security made in Lëtzebuerg" (SMILE) g.i.e. 122, rue Adolphe Fischer L-1521 Luxembourg Grand-Duchy of Luxembourg}, url = {https://www.circl.lu/opendata/circl-ail-dataset-01/}, abstract = {This dataset is named circl-ail-dataset-01 and is composed of Tor hidden services websites screenshots. Around 37000+ pictures are in this dataset to date.}, }
### Dataset Summary
---
task_categories:
- image-classification
pretty_name: Subset of circl-ail-dataset-01
size_categories:
- 1K<n<10K
---
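This is a subset of the circl-ail-dataset-01 dataset, restricted to three labels: `marketplace`, `forum`, and `general`. Each label has about 1000 images; the metadata lists 3005 training and 81 test examples, so the per-label count is approximate.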
circl-ail-dataset-01
This dataset is named circl-ail-dataset-01 and is composed of AIL’s scraped onion websites. Around 37500 pictures are in this dataset to date.
Only one label classification (the direct DataTurks output) is provided with the dataset. This classification is partial and will be improved and updated as further classification work is completed.
Direct link: https://www.circl.lu/opendata/datasets/circl-ail-dataset-01/
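To make the summary concrete, the following is a minimal sketch of how the subset might be loaded, assuming the third-party `datasets` package is installed and the repository is still available on the Hugging Face Hub. The helper names `load_split` and `describe_example` are hypothetical and introduced only for illustration.

```python
# Sketch only: requires the third-party `datasets` package and network access.
REPO_ID = "Abhilashvj/CIRCL_website_subset"  # repository name from this card


def load_split(split: str = "train"):
    """Load one split ("train" or "test") from the Hugging Face Hub."""
    # Imported lazily so that defining this helper does not require `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


def describe_example(dataset, index: int = 0) -> str:
    """Return a one-line description of a single example (hypothetical helper)."""
    example = dataset[index]
    # The `label` feature is a ClassLabel, so int2str maps 0/1/2 to a class name.
    label_name = dataset.features["label"].int2str(example["label"])
    width, height = example["image"].size  # images decode to PIL.Image objects
    return f"{label_name}: {width}x{height}"


# Usage (the first call downloads roughly 1.9 GB):
# train = load_split("train")
# print(describe_example(train))
```

The usage lines are left commented out because the first call downloads the full archive.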
### Supported Tasks and Leaderboards
- `image-classification`: each screenshot is assigned one of the three labels `forum`, `general`, or `marketplace`.
### Languages
[More Information Needed]
## Dataset Structure
### Data Instances
[More Information Needed]
### Data Fields
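- `image`: the website screenshot, stored with the `image` dtype.
- `label`: a class label with three values: `0` (forum), `1` (general), `2` (marketplace).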
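The `class_label` names in the metadata at the top of this card define the integer encoding of `label`. A small sketch of the corresponding lookup tables is shown here; the names `LABEL_NAMES`, `id2label`, `label2id`, and `decode_label` are illustrative choices, not part of the dataset.

```python
# Class names in index order, copied from the `class_label` metadata of this card.
LABEL_NAMES = ["forum", "general", "marketplace"]

# Integer id -> name, and the inverse, as commonly passed to a classifier config.
id2label = dict(enumerate(LABEL_NAMES))
label2id = {name: idx for idx, name in id2label.items()}


def decode_label(label_id: int) -> str:
    """Translate an integer label into its class name; raises KeyError if unknown."""
    return id2label[label_id]


print(id2label)         # {0: 'forum', 1: 'general', 2: 'marketplace'}
print(decode_label(2))  # marketplace
```

Note that the index order (`forum`, `general`, `marketplace`) differs from the order in which the labels are listed in the summary text, so the mapping should be taken from the metadata rather than from the prose.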
### Data Splits
- `train`: 3005 examples (2109417862.525 bytes)
- `test`: 81 examples (59369011.0 bytes)
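The split sizes recorded in the metadata at the top of this card can be checked with a little arithmetic. The sketch below copies those numbers by hand and derives the share of examples in each split and the average stored size per example; the variable and function names are illustrative.

```python
# Split metadata copied from the YAML header of this card.
SPLITS = {
    "train": {"num_examples": 3005, "num_bytes": 2109417862.525},
    "test": {"num_examples": 81, "num_bytes": 59369011.0},
}

total_examples = sum(split["num_examples"] for split in SPLITS.values())
total_bytes = sum(split["num_bytes"] for split in SPLITS.values())


def example_share(name: str) -> float:
    """Fraction of all examples that fall in the given split."""
    return SPLITS[name]["num_examples"] / total_examples


def mean_example_bytes(name: str) -> float:
    """Average stored size of one example in the given split, in bytes."""
    split = SPLITS[name]
    return split["num_bytes"] / split["num_examples"]


for name in SPLITS:
    print(f"{name}: {example_share(name):.1%} of examples, "
          f"{mean_example_bytes(name) / 1e6:.2f} MB per example on average")
print(f"total: {total_examples} examples, {total_bytes / 1e9:.2f} GB")
```

The two splits sum to 3086 examples, and their byte counts add up to the `dataset_size` value in the metadata. The test split holds under 3% of the examples, so evaluation numbers computed on it will be noisy.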
## Dataset Creation
### Curation Rationale
[More Information Needed]
### Source Data
https://www.circl.lu/opendata/datasets/circl-ail-dataset-01/
#### Initial Data Collection and Normalization
[More Information Needed]
#### Who are the source language producers?
[More Information Needed]
### Annotations
#### Annotation process
[More Information Needed]
#### Who are the annotators?
[More Information Needed]
### Personal and Sensitive Information
[More Information Needed]
## Considerations for Using the Data
### Social Impact of Dataset
[More Information Needed]
### Discussion of Biases
[More Information Needed]
### Other Known Limitations
[More Information Needed]
## Additional Information
### Dataset Curators
[More Information Needed]
### Licensing Information
[More Information Needed]
### Citation Information
[More Information Needed]
### Contributions
[More Information Needed]
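Provider:
Abhilashvj
Summary of original information
Dataset overview
Dataset name
- Name: circl-ail-dataset-01
Dataset description
- Summary: The dataset is named circl-ail-dataset-01 and consists of onion websites scraped by AIL. To date, it contains around 37500 images.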
Dataset features
- Feature 1: image
  - Data type: image
- Feature 2: label
  - Data type: class_label
  - Class names:
    - 0: forum
    - 1: general
    - 2: marketplace
Dataset structure
- Splits:
  - Training set:
    - Number of examples: 3005
    - Bytes: 2109417862.525
  - Test set:
    - Number of examples: 81
    - Bytes: 59369011.0
Dataset size
- Download size: 1946901450 bytes
- Total dataset size: 2168786873.525 bytes
Dataset tasks
- Task category: image-classification
- Dataset alias: Subset of circl-ail-dataset-01
- Size category: 1K<n<10K
Dataset labels
- Labels: ["marketplace","forum","general"]
- Images per label: 1000



