svgfind
Source: ModelScope community (魔搭社区). Updated 2025-12-04; indexed 2025-05-10.
Download link:
https://modelscope.cn/datasets/AI-ModelScope/svgfind
Description:
# Dataset Card for SVGFind Icons
### Dataset Summary
This dataset contains a large collection of Scalable Vector Graphics (SVG) icons sourced from [SVGFind.com](https://www.svgfind.com). The icons cover a wide range of categories and styles and are suitable for user interfaces, web development, and presentations, and potentially for training vector graphics or icon classification models. Each icon is either provided under a Creative Commons license or is in the Public Domain, as indicated by the `license` field of its metadata. The SVG files in this dataset have been minified using [tdewolff/minify](https://github.com/tdewolff/minify) to reduce file size while preserving visual representation, and the data files are compressed using Zstandard compression.
### Languages
The dataset metadata (titles, tags) is primarily in English:
- English (en)
## Dataset Structure
### Data Files
The dataset consists of:
- Metadata and SVG content stored in compressed JSONL format (`.jsonl.zst`) using Zstandard compression.
- Data is split into separate files based on the license of the icons:
  - `svgfind-CREATIVECOMMONS.jsonl.zst`
  - `svgfind-PUBLICDOMAIN.jsonl.zst`
- The SVG files in this dataset have been minified using [tdewolff/minify](https://github.com/tdewolff/minify) to reduce file size while preserving visual representation.
- Attribution details are provided in markdown files within the `ATTRIBUTION/` directory, organized by license. These attribution files are also compressed using Zstandard (`.md.zst`).
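
The `.jsonl.zst` files described above can be read as a stream, without first decompressing them to disk. The following is a minimal sketch, assuming the third-party `zstandard` Python package is installed; the helper names `open_zst_text` and `iter_records` are illustrative and not part of the dataset.

```python
import io
import json
from typing import Iterable, Iterator


def open_zst_text(path: str) -> io.TextIOWrapper:
    """Open a Zstandard-compressed text file for line-by-line reading.

    Assumption: the third-party `zstandard` package is available
    (pip install zstandard).
    """
    import zstandard  # imported lazily so iter_records works without it

    raw = open(path, "rb")
    reader = zstandard.ZstdDecompressor().stream_reader(raw)
    return io.TextIOWrapper(reader, encoding="utf-8")


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)
```

In use, the file object returned by `open_zst_text("svgfind-PUBLICDOMAIN.jsonl.zst")` can be passed directly to `iter_records`, so only one record is held in memory at a time. This matters mainly for the much larger Creative Commons file.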
### Data Fields
Each record in the JSONL files contains the following fields:
- `id`: Unique identifier for the icon on SVGFind.com.
- `title`: Name or title of the icon.
- `data_pack`: The collection or pack the icon belongs to on SVGFind.
- `tags`: Array of strings representing tags associated with the icon.
- `license`: The specific license under which the icon is distributed (e.g., "CREATIVECOMMONS", "PUBLICDOMAIN").
- `license_owner`: The name of the entity (creator, company) specified as the license owner on SVGFind.
- `download_url`: The original URL to download the SVG file from SVGFind.com.
- `svg_content`: String containing the SVG markup for the icon.
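
The field list above maps naturally onto a typed record. The sketch below is one possible representation, not a schema published with the dataset. The class name `IconRecord` is hypothetical, and the field types are assumptions: the card only states that `tags` is an array of strings and `svg_content` is a string, so `id` is coerced to `str` here for uniformity.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class IconRecord:
    """One JSONL record; field names follow the dataset card."""

    id: str              # assumption: coerced to str, actual type not stated
    title: str
    data_pack: str
    tags: List[str]
    license: str         # e.g. "CREATIVECOMMONS" or "PUBLICDOMAIN"
    license_owner: str
    download_url: str
    svg_content: str

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "IconRecord":
        """Build a record; raises KeyError if a documented field is missing."""
        return cls(
            id=str(raw["id"]),
            title=raw["title"],
            data_pack=raw["data_pack"],
            tags=list(raw["tags"]),
            license=raw["license"],
            license_owner=raw["license_owner"],
            download_url=raw["download_url"],
            svg_content=raw["svg_content"],
        )
```

Failing loudly on a missing key is a deliberate choice in this sketch: it surfaces records that deviate from the documented fields instead of silently filling in defaults.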
### Data Splits
The dataset is organized into splits based on the license associated with each icon:
| Split | License Description | Number of Examples |
| :------------------ | :------------------------------------------------------- | -----------------: |
| `creativecommons` | Creative Commons licenses | 3,645,444 |
| `publicdomain` | Public Domain | 10,366 |
| **Total** | | **3,655,810** |
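
The two splits are heavily imbalanced: the Public Domain split holds roughly 0.3% of all icons. The short sketch below recomputes the total and the per-split shares from the counts in the table. The names `SPLIT_COUNTS` and `split_shares` are illustrative only.

```python
from typing import Dict

# Counts copied from the Data Splits table.
SPLIT_COUNTS: Dict[str, int] = {
    "creativecommons": 3_645_444,
    "publicdomain": 10_366,
}


def split_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each split's fraction of the total number of examples."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


TOTAL = sum(SPLIT_COUNTS.values())
SHARES = split_shares(SPLIT_COUNTS)
```

A check like this is useful after loading the data: if the number of parsed records per file differs from the table, the download was likely truncated or the dataset has been updated.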
# License Information
## Licensing Structure
This dataset aggregates icons that are either distributed under a Creative Commons license or are in the Public Domain. **Each icon in this collection has one specific license** associated with it. This license is indicated in the `license` field of the metadata and determines how the icon can be used, modified, and distributed.
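
Because the license is recorded per icon, records can be tallied or filtered on the `license` field before use. The following is a minimal sketch operating on plain dicts; the function names and the `"UNKNOWN"` fallback label are assumptions made for illustration.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Iterator

Record = Dict[str, Any]


def count_by_license(records: Iterable[Record]) -> Counter:
    """Tally records by their `license` value."""
    # "UNKNOWN" is a hypothetical label for records lacking the field.
    return Counter(r.get("license", "UNKNOWN") for r in records)


def only_license(records: Iterable[Record], license_name: str) -> Iterator[Record]:
    """Yield only the records whose `license` equals `license_name`."""
    for r in records:
        if r.get("license") == license_name:
            yield r
```

For example, `only_license(records, "PUBLICDOMAIN")` keeps only the icons that carry no attribution requirement, while the Creative Commons icons still need their specific terms checked against the attribution files.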
## Dataset License Overview and Attribution Files
The table below shows the distribution of licenses across the icons in this dataset and the corresponding files containing detailed attribution information. The filenames reflect the actual files present in the `ATTRIBUTION/` directory.
| License | Works | Attribution File |
| :------------------------------------------------- | --------: | :------------------------------------- |
| Creative Commons (CC) | 3,645,444 | `CREATIVECOMMONS_Attribution.md` |
| Public Domain (PD) | 10,366 | `PUBLICDOMAIN_Attribution.md` |
*Note: Attribution files are provided for licenses typically requiring it or where creator information was available. Public Domain does not legally require attribution, but it's often appreciated.*
Full attribution details, listing creators/owners and their works under each license, are located in the `ATTRIBUTION/` directory, organized by license type.
## Further License Information
For detailed information about each license type, please refer to their official sources:
* **Creative Commons (CC) Licenses:** [https://creativecommons.org/licenses/](https://creativecommons.org/licenses/)
* **Public Domain:** Works in the public domain are free from copyright restrictions and can be used without permission.
Provider: maas
Created: 2025-05-06



