ilee0022/Caltech-256

Name: ilee0022/Caltech-256
Creator: ilee0022
Published: 2024-04-20 04:26:13
License: No description available

Hugging Face, updated 2024-04-20, indexed 2024-04-21

Download link:

https://hf-mirror.com/datasets/ilee0022/Caltech-256


Resource description:

---
dataset_info:
  features:
  - name: image
    dtype: image
  - name: label
    dtype: int64
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 932793797.0
    num_examples: 24791
  - name: test
    num_bytes: 120168332.0
    num_examples: 3061
  - name: validation
    num_bytes: 107180687.0
    num_examples: 2755
  download_size: 1147593917
  dataset_size: 1160142816.0
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
  - split: validation
    path: data/validation-*
---

# Dataset Card for Dataset Name

This is the Hugging Face format of https://data.caltech.edu/records/nyy15-4j048. Please cite the original authors of the dataset.

## Dataset Details

### Dataset Description

- **Curated by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]

### Dataset Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Dataset Structure

[More Information Needed]

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

#### Data Collection and Processing

[More Information Needed]

#### Who are the source data producers?

[More Information Needed]

### Annotations [optional]

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

#### Personal and Sensitive Information

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users should be made aware of the risks, biases and limitations of the dataset. More information needed for further recommendations.

## Citation [optional]

**BibTeX:**

@misc{griffin_holub_perona_2022,
  title={Caltech 256},
  DOI={10.22002/D1.20087},
  abstractNote={We introduce a challenging set of 256 object categories containing a total of 30607 images. The original Caltech-101 was collected by choosing a set of object categories, downloading examples from Google Images and then manually screening out all images that did not fit the category. Caltech-256 is collected in a similar manner with several improvements: a) the number of categories is more than doubled, b) the minimum number of images in any category is increased from 31 to 80, c) artifacts due to image rotation are avoided and d) a new and larger clutter category is introduced for testing background rejection. We suggest several testing paradigms to measure classification performance, then benchmark the dataset using two simple metrics as well as a state-of-the-art spatial pyramid matching algorithm. Finally we use the clutter category to train an interest detector which rejects uninformative background regions.},
  publisher={CaltechDATA},
  author={Griffin, Gregory and Holub, Alex and Perona, Pietro},
  year={2022},
  month={Apr}
}

**APA:** [More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Dataset Card Authors [optional]

[More Information Needed]

## Dataset Card Contact

[More Information Needed]


Provider:

ilee0022

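Summary of original information

Dataset overview

Dataset features

image: image data type
label: integer data type (int64)
text: string data type

Dataset splits

Train: 24791 examples, 932793797 bytes in total
Test: 3061 examples, 120168332 bytes in total
Validation: 2755 examples, 107180687 bytes in total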
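
As a quick sanity check on the split sizes, the short sketch below adds up the example counts and works out each split's share of the whole. It uses only the numbers listed on this page. The variable names are illustrative and do not come from the dataset itself. The total can be compared with the 30607 images mentioned in the citation abstract.

```python
# Example counts per split, copied from the listing above.
split_examples = {"train": 24791, "test": 3061, "validation": 2755}

total_examples = sum(split_examples.values())

# Fraction of all examples that falls into each split.
split_fractions = {
    name: count / total_examples for name, count in split_examples.items()
}

for name, fraction in split_fractions.items():
    print(f"{name:<10} {split_examples[name]:>6} examples  {fraction:6.2%}")
print(f"{'total':<10} {total_examples:>6} examples")
```

The counts work out to roughly an 81/10/9 train/test/validation partition of the full image set.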

Dataset size

Download size: 1147593917 bytes
Total dataset size: 1160142816 bytes
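
The byte counts are easier to read in larger units, and the per-split sizes should add up to the total dataset size. The sketch below checks that using only the figures listed on this page. The helper `to_gib` is a hypothetical convenience function, not part of any library.

```python
# Byte counts copied from the listing above.
split_bytes = {
    "train": 932_793_797,
    "test": 120_168_332,
    "validation": 107_180_687,
}
dataset_size = 1_160_142_816   # total size of the prepared dataset
download_size = 1_147_593_917  # size of the files to download


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes (1 GiB = 2**30 bytes)."""
    return num_bytes / 2**30


total_split_bytes = sum(split_bytes.values())

print(f"sum of splits : {total_split_bytes} bytes ({to_gib(total_split_bytes):.2f} GiB)")
print(f"dataset size  : {dataset_size} bytes ({to_gib(dataset_size):.2f} GiB)")
print(f"download size : {download_size} bytes ({to_gib(download_size):.2f} GiB)")
print(f"splits match dataset size: {total_split_bytes == dataset_size}")
```

The download is slightly smaller than the listed dataset size, so a little over 1 GiB of free disk space is a reasonable minimum to plan for.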

Configuration

Default config: path patterns for the train, test, and validation data
- Train data path: data/train-*
- Test data path: data/test-*
- Validation data path: data/validation-*
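
One possible way to load a split is sketched below, assuming the Hugging Face `datasets` library is installed and the repository is reachable over the network. The repository id and split names come from this page. `REPO_ID`, `SPLIT_PATTERNS`, and `load_split` are hypothetical names introduced here for illustration only. With the default config, `load_dataset` resolves the path patterns itself, so they appear below only for reference.

```python
# Repository id and split path patterns as listed on this page.
REPO_ID = "ilee0022/Caltech-256"
SPLIT_PATTERNS = {
    "train": "data/train-*",
    "test": "data/test-*",
    "validation": "data/validation-*",
}


def load_split(split: str):
    """Load one split of the dataset (hypothetical helper; needs network access)."""
    if split not in SPLIT_PATTERNS:
        raise ValueError(
            f"unknown split {split!r}; expected one of {sorted(SPLIT_PATTERNS)}"
        )
    # Imported lazily so the constants above can be used without the library.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)
```

Calling `load_split("validation")` would download the data on first use. Going by the listed features, each record should then expose `image`, `label`, and `text` fields.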
