open-index/open-skills
Hugging Face · Updated 2026-04-10 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/open-index/open-skills
Description:
---
license: odc-by
task_categories:
- text-generation
- feature-extraction
- text-classification
language:
- en
pretty_name: "Open Skills: Complete skills.sh Archive"
size_categories:
- 100K<n<1M
tags:
- skills
- mcp
- ai-agents
- model-context-protocol
- agent-skills
- parquet
- developer-tools
- claude
- cursor
- windsurf
configs:
- config_name: default
  data_files:
  - split: train
    path: data/skills.parquet
---
# Open Skills: Complete skills.sh Archive
> 133,149 agent skills from 8,808 publishers on skills.sh
## Table of Contents
- [What is this?](#what-is-this)
- [What is skills.sh?](#what-is-skillssh)
- [Dataset statistics](#dataset-statistics)
- [How to use](#how-to-use)
- [Dataset card](#dataset-card)
- [Structure](#structure)
- [Limitations](#limitations)
- [License and contact](#license-and-contact)
## What is this?
A full dump of [skills.sh](https://skills.sh) as a single Parquet file. Every skill listed on the site has been collected into this dataset: README content, install commands, weekly install counts, GitHub stars, security audit results, and per-platform install breakdowns. If it's on skills.sh, it's in here.
The file is sorted by weekly installs (most popular first) and compressed with Zstandard. You can query or stream it directly from Hugging Face without downloading the whole file first. It is useful for building recommendation systems, analyzing the MCP ecosystem, or browsing which tools people actually use.
**133,149 skills** from **8,808 publishers** across **13,460 repositories**. Collected on **2026-04-10**.
## What is skills.sh?
[skills.sh](https://skills.sh) is Vercel Labs' directory of agent skills for AI coding tools. A "skill" is an npm package you install via `npx` that gives your editor new abilities through the [Model Context Protocol](https://modelcontextprotocol.io). Claude Code, Cursor, Windsurf, GitHub Copilot, Gemini CLI, and others all support it.
Think of it as a package registry, but specifically for AI assistant plugins. Each skill has a one-line install command, a README explaining what it does, and weekly install counts broken down by platform. The site also runs security audits through [Socket](https://socket.dev), [Snyk](https://snyk.io), and Vercel's [Agent Trust Hub](https://vercel.com/trust-hub), so you can check if a skill is safe before installing it.
The ecosystem is growing fast. New skills show up daily, and the install numbers shift week to week as people try new tools.
## Dataset statistics
| Metric | Value |
|--------|------:|
| Total skills | 133,149 |
| With README | 132,763 (99.7%) |
| With install data | 129,842 (97.5%) |
| With GitHub stars | 111,503 (83.7%) |
| Unique publishers | 8,808 |
| Unique repos | 13,460 |
| Total weekly installs | 27,712,591 |
| Parquet size | 334.2 MB |
### Top publishers
| # | Publisher | Skills |
|--:|----------|-------:|
| 1 | membranedev | 2,979 |
| 2 | jeremylongshore | 1,927 |
| 3 | sickn33 | 1,396 |
| 4 | aaaaqwq | 1,037 |
| 5 | composiohq | 970 |
| 6 | mukul975 | 728 |
| 7 | davila7 | 713 |
| 8 | microsoft | 590 |
| 9 | sundial-org | 588 |
| 10 | syncfusion | 573 |
| 11 | teachingai | 539 |
| 12 | dykyi-roman | 520 |
| 13 | team-telnyx | 501 |
| 14 | yonatangross | 495 |
| 15 | neversight | 478 |
### Most installed skills
| # | Skill | Publisher | Weekly Installs | Stars |
|--:|-------|-----------|----------------:|------:|
| 1 | find-skills | vercel-labs | 950,600 | 13,500 |
| 2 | vercel-react-best-practices | vercel-labs | 300,700 | 24,800 |
| 3 | frontend-design | anthropics | 272,500 | 113,900 |
| 4 | web-design-guidelines | vercel-labs | 241,700 | 24,800 |
| 5 | remotion-best-practices | remotion-dev | 224,600 | 2,700 |
| 6 | agent-browser | vercel-labs | 171,400 | 28,400 |
| 7 | microsoft-foundry | microsoft | 156,500 | 605 |
| 8 | azure-prepare | microsoft | 156,100 | 605 |
| 9 | entra-app-registration | microsoft | 156,000 | 605 |
| 10 | azure-hosted-copilot-sdk | microsoft | 156,000 | 605 |
| 11 | azure-aigateway | microsoft | 156,000 | 605 |
| 12 | azure-deploy | microsoft | 156,000 | 605 |
| 13 | azure-messaging | microsoft | 156,000 | 605 |
| 14 | appinsights-instrumentation | microsoft | 155,900 | 605 |
| 15 | azure-validate | microsoft | 155,900 | 605 |
### Installs by platform
| # | Platform | Total Installs |
|--:|----------|---------------:|
| 1 | github-copilot | 23,402,803 |
| 2 | codex | 18,944,869 |
| 3 | opencode | 18,668,856 |
| 4 | gemini-cli | 18,375,746 |
| 5 | cursor | 14,148,382 |
| 6 | amp | 9,371,309 |
| 7 | kimi-cli | 8,386,214 |
| 8 | claude-code | 3,907,190 |
| 9 | cline | 2,495,626 |
| 10 | antigravity | 788,334 |
```
github-copilot  █████████████████████████  23,402,803
codex           ████████████████████░░░░░  18,944,869
opencode        ███████████████████░░░░░░  18,668,856
gemini-cli      ███████████████████░░░░░░  18,375,746
cursor          ███████████████░░░░░░░░░░  14,148,382
amp             ██████████░░░░░░░░░░░░░░░   9,371,309
kimi-cli        ████████░░░░░░░░░░░░░░░░░   8,386,214
claude-code     ████░░░░░░░░░░░░░░░░░░░░░   3,907,190
cline           ██░░░░░░░░░░░░░░░░░░░░░░░   2,495,626
antigravity     █░░░░░░░░░░░░░░░░░░░░░░░░     788,334
```
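The platform totals above can presumably be reproduced by summing each `installs_<platform>` column over all rows. Below is a minimal sketch of that aggregation in plain Python, assuming rows arrive as dictionaries keyed by column name. The function `platform_totals` and the two sample rows are made up for illustration and are not part of the dataset tooling.

```python
from collections import Counter

# Column names follow the `installs_<platform>` pattern from the Fields table.
PLATFORM_PREFIX = "installs_"


def platform_totals(rows):
    """Sum every installs_<platform> column across rows, largest total first."""
    totals = Counter()
    for row in rows:
        for column, value in row.items():
            if column.startswith(PLATFORM_PREFIX):
                # Strip the prefix and restore the hyphenated platform name.
                platform = column[len(PLATFORM_PREFIX):].replace("_", "-")
                totals[platform] += value
    return totals.most_common()


# Two made-up rows, for illustration only (not real dataset values).
sample_rows = [
    {"skill_id": "a/b/c", "installs_github_copilot": 5, "installs_cursor": 2},
    {"skill_id": "d/e/f", "installs_github_copilot": 1, "installs_cursor": 3},
]

print(platform_totals(sample_rows))
```

Note that `weekly_installs` does not start with the `installs_` prefix, so it is left out of the per-platform sums.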
### Security audit coverage
| Audit Provider | Skills Covered | Coverage |
|----------------|---------------:|---------:|
| Agent Trust Hub | 128,592 | 96.6% |
| Socket.dev | 127,793 | 96.0% |
| Snyk | 127,918 | 96.1% |
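The coverage column is each provider's covered count divided by the total number of skills (133,149), rounded to one decimal place. A short sketch that recomputes the three percentages from the figures in the table; `coverage_pct` is a hypothetical helper name:

```python
def coverage_pct(covered, total):
    """Share of skills with an audit result, as a percentage with one decimal."""
    return round(100 * covered / total, 1)


TOTAL_SKILLS = 133_149

# Covered counts copied from the table above.
covered = {
    "Agent Trust Hub": 128_592,
    "Socket.dev": 127_793,
    "Snyk": 127_918,
}

for provider, count in covered.items():
    print(f"{provider}: {coverage_pct(count, TOTAL_SKILLS)}%")
```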
### Install distribution
```
0       █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   3,307
1-10    ██████████████████████████████  70,639
11-100  ███████████████████░░░░░░░░░░░  46,951
101-1K  ████░░░░░░░░░░░░░░░░░░░░░░░░░░  10,173
1K-10K  █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   1,670
10K+    █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░     409
```
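The six buckets above add up to the full 133,149 skills. Here is a sketch of how a `weekly_installs` value could be assigned to one of them. The exact boundaries at 1,000 and 10,000 are an assumption, since the labels do not say which side those values fall on, and `install_bucket` is a hypothetical helper name.

```python
def install_bucket(weekly_installs):
    """Map a weekly install count to a bucket label from the chart above.

    Assumption: upper bounds are inclusive, so 1,000 lands in "101-1K"
    and 10,000 lands in "1K-10K".
    """
    if weekly_installs == 0:
        return "0"
    if weekly_installs <= 10:
        return "1-10"
    if weekly_installs <= 100:
        return "11-100"
    if weekly_installs <= 1_000:
        return "101-1K"
    if weekly_installs <= 10_000:
        return "1K-10K"
    return "10K+"


# Bucket sizes copied from the chart above.
PUBLISHED = {
    "0": 3_307,
    "1-10": 70_639,
    "11-100": 46_951,
    "101-1K": 10_173,
    "1K-10K": 1_670,
    "10K+": 409,
}

print(install_bucket(15_000))   # falls in the top bucket
print(sum(PUBLISHED.values()))  # matches the total skill count
```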
## How to use
### DuckDB (no download needed)
```sql
-- Most installed skills
SELECT skill_id, name, weekly_installs, github_stars, owner
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
WHERE weekly_installs > 0
ORDER BY weekly_installs DESC
LIMIT 20;
```
```sql
-- Biggest publishers
SELECT owner, COUNT(*) AS skills, SUM(weekly_installs) AS total_installs
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
GROUP BY owner
ORDER BY skills DESC
LIMIT 20;
```
```sql
-- Most popular skills on Cursor specifically
SELECT skill_id, name, installs_cursor, weekly_installs
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
WHERE installs_cursor > 0
ORDER BY installs_cursor DESC
LIMIT 20;
```
```sql
-- Compare installs across platforms for top skills
SELECT skill_id, name,
       installs_github_copilot AS copilot,
       installs_cursor AS cursor,
       installs_claude_code AS claude,
       installs_codex AS codex
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
WHERE weekly_installs > 1000
ORDER BY weekly_installs DESC
LIMIT 20;
```
```sql
-- Skills with security audit results
SELECT skill_id, name, audit_trust_hub, audit_socket, audit_snyk
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
WHERE audit_trust_hub IS NOT NULL
   OR audit_socket IS NOT NULL
   OR audit_snyk IS NOT NULL
ORDER BY weekly_installs DESC
LIMIT 20;
```
```sql
-- Audit coverage summary
SELECT
    COUNT(*) AS total,
    COUNT(audit_trust_hub) AS trust_hub,
    COUNT(audit_socket) AS socket,
    COUNT(audit_snyk) AS snyk
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet');
```
```sql
-- Search READMEs
SELECT skill_id, name, weekly_installs
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
WHERE lower(readme_md) LIKE '%kubernetes%'
ORDER BY weekly_installs DESC
LIMIT 10;
```
### Python (datasets library)
```python
from datasets import load_dataset
ds = load_dataset("open-index/open-skills", split="train")
print(f"{len(ds):,} skills")
# or stream it
ds = load_dataset("open-index/open-skills", split="train", streaming=True)
for skill in ds:
    print(skill["skill_id"], skill["weekly_installs"])
```
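Because the file is sorted by weekly installs, the first N rows of a stream should be the N most-installed skills, so there is no need to iterate over everything. A minimal sketch using `itertools.islice`: here `head` is a hypothetical helper and `fake_stream` is a stand-in generator, so the example runs without network access.

```python
from itertools import islice


def head(rows, n):
    """Return the first n rows of any iterable without consuming the rest."""
    return list(islice(rows, n))


def fake_stream():
    # Stand-in for the streaming dataset: made-up rows in descending install order.
    for i, installs in enumerate([900, 300, 250, 40, 3]):
        yield {"skill_id": f"owner/repo/skill-{i}", "weekly_installs": installs}


top3 = head(fake_stream(), 3)
print([row["weekly_installs"] for row in top3])
```

With the real dataset, the streaming `ds` object from the block above can be passed in place of `fake_stream()`.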
### Download the file
```bash
huggingface-cli download open-index/open-skills \
data/skills.parquet \
--repo-type dataset --local-dir ./open-skills/
```
Or with Python:
```python
from huggingface_hub import snapshot_download
snapshot_download("open-index/open-skills", repo_type="dataset", local_dir="./open-skills/")
```
### pandas + DuckDB
```python
import duckdb
df = duckdb.sql("""
SELECT skill_id, name, owner, weekly_installs, github_stars
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
WHERE weekly_installs > 100
ORDER BY weekly_installs DESC
""").df()
print(df.head(20))
```
# Dataset card
## Structure
One Parquet file, one row per skill. Every column is a string or int64, except `fetched_at`, which is a timestamp. Platform install counts are flattened into individual columns, so you can filter and sort by platform directly without parsing JSON.
Here is what a row looks like:
```json
{
  "skill_id": "vercel-labs/skills/web-search",
  "owner": "vercel-labs",
  "repo": "skills",
  "name": "Web Search",
  "install_cmd": "npx @vercel-labs/skills/web-search",
  "summary_md": "Search the web.",
  "readme_md": "## Web Search\n\nLets your AI assistant search the web...",
  "weekly_installs": 15000,
  "github_repo": "vercel-labs/skills",
  "github_stars": 2500,
  "first_seen": "March 2025",
  "audit_trust_hub": "Verified",
  "audit_socket": "No issues",
  "audit_snyk": "No issues",
  "installs_github_copilot": 5000,
  "installs_codex": 3000,
  "installs_cursor": 2000,
  "installs_claude_code": 1500,
  "installs_gemini_cli": 1000,
  "installs_opencode": 800,
  "installs_amp": 600,
  "installs_kimi_cli": 400,
  "installs_cline": 300,
  "installs_antigravity": 100,
  "url": "https://skills.sh/vercel-labs/skills/web-search",
  "fetched_at": "2026-04-10T08:30:00Z"
}
```
### Fields
| Column | Type | What it is |
|--------|------|------------|
| `skill_id` | string | `owner/repo/skill` |
| `owner` | string | GitHub owner or org |
| `repo` | string | GitHub repo name |
| `name` | string | Display name |
| `install_cmd` | string | The `npx` install command |
| `summary_md` | string | Short summary in markdown |
| `readme_md` | string | Full README in markdown |
| `weekly_installs` | int64 | Installs this week (all platforms) |
| `github_repo` | string | GitHub repo path |
| `github_stars` | int64 | Star count at time of collection |
| `first_seen` | string | When it showed up on skills.sh |
| `audit_trust_hub` | string | Agent Trust Hub result |
| `audit_socket` | string | Socket.dev result |
| `audit_snyk` | string | Snyk result |
| `installs_github_copilot` | int64 | GitHub Copilot installs |
| `installs_codex` | int64 | Codex installs |
| `installs_opencode` | int64 | OpenCode installs |
| `installs_gemini_cli` | int64 | Gemini CLI installs |
| `installs_cursor` | int64 | Cursor installs |
| `installs_amp` | int64 | Amp installs |
| `installs_kimi_cli` | int64 | Kimi CLI installs |
| `installs_claude_code` | int64 | Claude Code installs |
| `installs_cline` | int64 | Cline installs |
| `installs_antigravity` | int64 | Antigravity installs |
| `url` | string | skills.sh page URL |
| `fetched_at` | timestamp | When we fetched it |
Single `train` split with all skills.
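In the sample row, `skill_id` has the form `owner/repo/skill`, and `owner`, `repo`, `github_repo`, and `url` can all be derived from it. Whether that holds for every row is an assumption worth checking before relying on it. A small sketch of such a check; `split_skill_id` and `is_consistent` are hypothetical helper names:

```python
def split_skill_id(skill_id):
    """Split 'owner/repo/skill' into its three parts.

    Raises ValueError if the id has fewer than three slash-separated parts.
    """
    owner, repo, skill = skill_id.split("/", 2)
    return owner, repo, skill


def is_consistent(row):
    """Check that the derived columns agree with skill_id (assumed relationship)."""
    owner, repo, _skill = split_skill_id(row["skill_id"])
    return (
        row["owner"] == owner
        and row["repo"] == repo
        and row["github_repo"] == f"{owner}/{repo}"
        and row["url"] == f"https://skills.sh/{row['skill_id']}"
    )


# Relevant fields copied from the sample row above.
sample = {
    "skill_id": "vercel-labs/skills/web-search",
    "owner": "vercel-labs",
    "repo": "skills",
    "github_repo": "vercel-labs/skills",
    "url": "https://skills.sh/vercel-labs/skills/web-search",
}

print(split_skill_id(sample["skill_id"]))
print(is_consistent(sample))
```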
### Privacy
Only public data: skill names, publisher GitHub usernames, repo names, and README content that is already on GitHub. No emails, no private info. Everything in this dataset is visible on skills.sh without logging in.
## Limitations
- `summary_md` and `readme_md` are converted from the HTML on skills.sh pages. Some formatting may differ from the original GitHub source.
- Install counts and star counts are a snapshot taken at `fetched_at`. These numbers change daily on skills.sh, so treat them as approximate.
- Platform install columns (`installs_cursor`, etc.) are zero when the skill has no data for that platform, not null.
- About 66 skills with CJK characters or special symbols in their URL could not be fetched due to an encoding issue on the skills.sh side.
- This is metadata only. The actual skill source code lives on GitHub and is not included here.
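Since the platform columns hold zero rather than null, any per-row ratio needs a guard against a zero total. A minimal sketch of computing per-platform shares for one row, assuming the row is a dictionary keyed by column name; `platform_shares` is a hypothetical helper and the sample values are made up:

```python
# The ten platform columns listed in the Fields table.
PLATFORM_COLUMNS = [
    "installs_github_copilot", "installs_codex", "installs_opencode",
    "installs_gemini_cli", "installs_cursor", "installs_amp",
    "installs_kimi_cli", "installs_claude_code", "installs_cline",
    "installs_antigravity",
]


def platform_shares(row):
    """Return each platform's share of the row's platform installs.

    Returns an empty dict when every platform column is zero, which avoids
    dividing by zero for skills that have no platform data.
    """
    counts = {column: row.get(column, 0) for column in PLATFORM_COLUMNS}
    total = sum(counts.values())
    if total == 0:
        return {}
    # Drop the "installs_" prefix so keys read as platform names.
    return {column[len("installs_"):]: value / total for column, value in counts.items()}


# Made-up row, for illustration only.
row = {"installs_github_copilot": 5000, "installs_codex": 3000, "installs_cursor": 2000}
shares = platform_shares(row)
print(shares["github_copilot"], shares["codex"], shares["cursor"])
```

Shares are computed against the sum of the platform columns rather than `weekly_installs`, because the two do not necessarily match (in the sample row the platform columns sum to 14,700 against 15,000 weekly installs).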
## License and contact
Released under **ODC-By v1.0**. Original content belongs to its respective authors. The skills.sh site is run by [Vercel Labs](https://github.com/vercel-labs). This dataset is not affiliated with or endorsed by Vercel.
Found a problem? Have a question? Open a thread on the [Community tab](https://huggingface.co/datasets/open-index/open-skills/discussions).
*Exported 2026-04-10*
Provider:
open-index



