Corel Image Features data set: 68,040 photos from various categories
Indexed by Payititi (帕依提提) on 2024-03-04
Download link:
https://www.payititi.com/opendatasets/show-26011.html
Resource description:
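Data set information: The original image collection was obtained from Corel [Web link] and contains 68,040 photo images from various categories. Each set of features is stored in a separate file. Within a file, each line corresponds to a single image: the first value is the image ID, and the remaining values are that image's feature vector (e.g. its color histogram). The same image has the same ID in all files, but the image ID is not the same as the image filename.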
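Because every feature file shares the same image IDs, the files can be joined on the first column. The following is a minimal sketch, assuming whitespace-separated values and integer IDs (neither is stated in the description); `load_features` and `join_by_id` are hypothetical helper names, and the sample values are made up.

```python
import io

def load_features(lines):
    """Parse one feature file: each line is '<image_id> <v1> <v2> ...'.

    Assumes whitespace-separated fields and an integer ID.
    """
    features = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        features[int(parts[0])] = [float(v) for v in parts[1:]]
    return features

def join_by_id(*feature_sets):
    """Concatenate the vectors of every image ID present in all feature sets."""
    common = set(feature_sets[0])
    for fs in feature_sets[1:]:
        common &= set(fs)
    return {i: sum((fs[i] for fs in feature_sets), []) for i in sorted(common)}

# Tiny in-memory stand-ins for two feature files (made-up values).
hist_file = io.StringIO("1 0.5 0.5\n2 0.25 0.75\n")
moments_file = io.StringIO("2 0.1 0.2 0.3\n1 0.4 0.5 0.6\n")

hist = load_features(hist_file)
moments = load_features(moments_file)
combined = join_by_id(hist, moments)
```

Joining on the ID rather than on line position keeps the sketch correct even if the files list images in different orders.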
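Attribute information: Four sets of features were extracted from each image:
- Color Histogram
- Color Histogram Layout
- Color Moments
- Co-occurrence Texture
Color Histogram: 32 dimensions (8 × 4 = H × S)
The HSV color space is divided into 32 subspaces (32 colors: 8 ranges of H and 4 ranges of S). The value in each dimension of an image's color histogram is the density of that color in the entire image. Histogram intersection (the overlap area between the color histograms of two images) can be used to measure the similarity between two images.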
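The 8 × 4 binning and the histogram intersection measure can be sketched as follows. This is an illustration only: it assumes H and S are scaled to [0, 1], that bins are uniform, and that bins are ordered hue-major. The description specifies none of these, and the function names are hypothetical.

```python
def color_histogram(pixels, h_bins=8, s_bins=4):
    """Density histogram over (h, s) pairs, with h and s assumed in [0, 1]."""
    hist = [0.0] * (h_bins * s_bins)
    count = 0
    for h, s in pixels:
        hi = min(int(h * h_bins), h_bins - 1)  # clamp h == 1.0 into the last bin
        si = min(int(s * s_bins), s_bins - 1)
        hist[hi * s_bins + si] += 1.0  # hue-major bin order (an assumption)
        count += 1
    return [c / count for c in hist] if count else hist

def histogram_intersection(h1, h2):
    """Overlap area of two density histograms: the sum of bin-wise minima."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))

# Two made-up four-pixel "images".
image_a = [(0.0, 0.0), (0.0, 0.0), (0.5, 0.5), (0.99, 0.99)]
image_b = [(0.0, 0.0), (0.5, 0.5), (0.5, 0.5), (0.5, 0.5)]

hist_a = color_histogram(image_a)
hist_b = color_histogram(image_b)
similarity = histogram_intersection(hist_a, hist_b)
```

For density histograms the intersection ranges from 0 (no shared colors) to 1 (identical histograms), so larger values mean more similar images.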
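Color Histogram Layout: 32 dimensions (4 × 2 × 4 = H × S × sub-images)
Each image is divided into 4 sub-images (one horizontal split and one vertical split), and a 4 × 2 color histogram is computed for each sub-image. Histogram intersection can be used to measure the similarity between two images.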
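A rough sketch of the layout feature: split the image at its midpoints, compute a 4 × 2 histogram per quadrant, and concatenate the four results. The quadrant order, the per-quadrant normalization, and the [0, 1] scaling of H and S are all assumptions made for illustration; the function names are hypothetical.

```python
def hs_histogram(pixels, h_bins, s_bins):
    """Density histogram over (h, s) pairs, with h and s assumed in [0, 1]."""
    hist = [0.0] * (h_bins * s_bins)
    for h, s in pixels:
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        hist[hi * s_bins + si] += 1.0
    n = len(pixels)
    return [c / n for c in hist] if n else hist

def color_histogram_layout(image, h_bins=4, s_bins=2):
    """image: list of rows, each row a list of (h, s) pairs.

    Returns 4 quadrant histograms concatenated (4 * h_bins * s_bins values).
    """
    rows, cols = len(image), len(image[0])
    mid_r, mid_c = rows // 2, cols // 2
    quadrants = [
        (0, mid_r, 0, mid_c),        # top-left
        (0, mid_r, mid_c, cols),     # top-right
        (mid_r, rows, 0, mid_c),     # bottom-left
        (mid_r, rows, mid_c, cols),  # bottom-right
    ]
    feature = []
    for r0, r1, c0, c1 in quadrants:
        pixels = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
        feature.extend(hs_histogram(pixels, h_bins, s_bins))
    return feature

# A made-up 2 x 2 image: each pixel is its own quadrant.
image = [
    [(0.0, 0.0), (0.3, 0.6)],
    [(0.6, 0.1), (0.9, 0.9)],
]
layout = color_histogram_layout(image)
```

Compared with the global histogram, this keeps coarse spatial information: two images with the same colors in different quadrants produce different vectors.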
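Color Moments: 9 dimensions (3 × 3)
The 9 values are the mean, standard deviation, and skewness of each of the H, S, and V channels in HSV color space. The Euclidean distance between the color moments of two images can be used to represent the dissimilarity (distance) between them.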
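The three moments per channel and the Euclidean distance can be sketched as below. The description does not define skewness precisely, so this sketch assumes one common convention for color moments, the signed cube root of the third central moment. The channel-major ordering of the 9 values is also an assumption, and hue is treated as an ordinary (non-circular) value for simplicity.

```python
import math

def channel_moments(values):
    """Mean, standard deviation, and skewness of one channel.

    Skewness here is the signed cube root of the third central moment
    (an assumed convention; the data set description does not specify one).
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return [mean, math.sqrt(variance), skew]

def color_moments(pixels):
    """pixels: list of (h, s, v) triples -> 9 values, grouped per channel."""
    feature = []
    for channel in range(3):
        feature.extend(channel_moments([p[channel] for p in pixels]))
    return feature

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# A made-up two-pixel image.
pixels = [(0.0, 0.5, 1.0), (1.0, 0.5, 1.0)]
moments = color_moments(pixels)
distance_to_self = euclidean_distance(moments, moments)
```

Unlike histogram intersection, this is a distance: 0 means identical feature vectors and larger values mean less similar images.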
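Co-occurrence Texture: 16 dimensions (4 × 4)
Images are converted to 16 gray-scale images, and co-occurrence is computed in 4 directions (horizontal, vertical, and the two diagonals). The 16 values are the following four measures for each direction: Second Angular Moment, Contrast, Inverse Difference Moment, and Entropy. The Euclidean distance between the co-occurrence texture features of two images can be used to measure the dissimilarity (distance) between them.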
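One way to read the texture feature is as four statistics of a gray-level co-occurrence matrix per direction. The sketch below takes that reading and makes several assumptions the description does not confirm: images are quantized to 16 gray levels, neighbors are one pixel apart, the matrix is not symmetrized, entropy uses the natural logarithm, and values are ordered direction by direction. All function names are hypothetical.

```python
import math

# (row, col) offsets: horizontal, vertical, and the two diagonals.
OFFSETS = [(0, 1), (1, 0), (1, 1), (1, -1)]

def cooccurrence_matrix(gray, offset, levels=16):
    """Normalized co-occurrence matrix for a 2-D list of levels in [0, levels)."""
    dr, dc = offset
    rows, cols = len(gray), len(gray[0])
    counts = [[0.0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[gray[r][c]][gray[r2][c2]] += 1.0
                total += 1
    if total == 0:
        return counts
    return [[v / total for v in row] for row in counts]

def texture_stats(p):
    """Second angular moment, contrast, inverse difference moment, entropy."""
    asm = contrast = idm = entropy = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            if v > 0.0:
                asm += v * v
                contrast += (i - j) ** 2 * v
                idm += v / (1.0 + (i - j) ** 2)
                entropy -= v * math.log(v)
    return [asm, contrast, idm, entropy]

def cooccurrence_texture(gray, levels=16):
    """16 values: 4 statistics for each of the 4 directions."""
    feature = []
    for offset in OFFSETS:
        feature.extend(texture_stats(cooccurrence_matrix(gray, offset, levels)))
    return feature

flat = cooccurrence_texture([[3, 3], [3, 3]])     # uniform patch
checker = cooccurrence_texture([[0, 1], [1, 0]])  # 2 x 2 checkerboard
```

A uniform patch gives zero contrast and zero entropy in every direction, while the checkerboard shows contrast horizontally and vertically but none along the diagonal that connects equal pixels.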
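Relevant papers:
1. Michael Ortega, Yong Rui, Kaushik Chakrabarti, Kriengkrai Porkaew, Sharad Mehrotra, and Thomas S. Huang, Supporting Ranked Boolean Similarity Queries in MARS, IEEE Transactions on Knowledge and Data Engineering, Vol. 10, No. 6, Pages 905-925, December 1998. [Web link]
2. Kaushik Chakrabarti and Sharad Mehrotra, The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces, 1999 IEEE International Conference on Data Engineering (ICDE), Pages 440-447, February 1999. [Web link]
3. Kriengkrai Porkaew, Kaushik Chakrabarti, and Sharad Mehrotra, Query Refinement for Multimedia Retrieval and its Evaluation Techniques in MARS, 1999 ACM International Multimedia Conference, Orlando, Florida, Oct 30 - Nov 4, 1999. [Web link]
4. Kaushik Chakrabarti, Kriengkrai Porkaew, and Sharad Mehrotra, Efficient Query Refinement in Multimedia Databases, ICDE, 2000. [Web link]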
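Papers that cite this data set:
Thomas T. Osugi and M. S., Exploration-Based Active Machine Learning, Faculty of The Graduate College at the University of Nebraska, In Partial Fulfillment of Requirements. [View Context]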
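Citation request: This data may be used for non-commercial purposes only.
Original owner: Michael Ortega-Binderberger
Information and Computer Science, University of California at Irvine, Irvine, CA 92697-3425, USA
Email: miki '@' ics.uci.edu
Donor: Kriengkrai Porkaew and Sharad Mehrotra
Information and Computer Science, University of California at Irvine, Irvine, CA 92697-3425, USA
Email: nid '@' ics.uci.edu, sharad '@' ics.uci.edu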
Provided by:
Payititi (帕依提提)



