
open-index/open-skills

Hugging Face · Updated 2026-04-10 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/open-index/open-skills
Description:
---
license: odc-by
task_categories:
- text-generation
- feature-extraction
- text-classification
language:
- en
pretty_name: "Open Skills: Complete skills.sh Archive"
size_categories:
- 100K<n<1M
tags:
- skills
- mcp
- ai-agents
- model-context-protocol
- agent-skills
- parquet
- developer-tools
- claude
- cursor
- windsurf
configs:
- config_name: default
  data_files:
  - split: train
    path: data/skills.parquet
---

# Open Skills: Complete skills.sh Archive

> 133,149 agent skills from 8,808 publishers on skills.sh

## Table of Contents

- [What is this?](#what-is-this)
- [What is skills.sh?](#what-is-skillssh)
- [Dataset statistics](#dataset-statistics)
- [How to use](#how-to-use)
- [Dataset card](#dataset-card)
- [Structure](#structure)
- [Limitations](#limitations)
- [License and contact](#license-and-contact)

## What is this?

A full dump of [skills.sh](https://skills.sh) as a single Parquet file. Every skill listed on the site has been collected into this dataset: README content, install commands, weekly install counts, GitHub stars, security audit results, and per-platform install breakdowns. If it's on skills.sh, it's in here.

The file is sorted by weekly installs (most popular first) and compressed with Zstandard, so you can stream it directly from HuggingFace without downloading. Great for building recommendation systems, analyzing the MCP ecosystem, or just browsing what tools people are actually using.

**133,149 skills** from **8,808 publishers** across **13,460 repositories**. Collected on **2026-04-10**.

## What is skills.sh?

[skills.sh](https://skills.sh) is Vercel Labs' directory of agent skills for AI coding tools. A "skill" is an npm package you install via `npx` that gives your editor new abilities through the [Model Context Protocol](https://modelcontextprotocol.io). Claude Code, Cursor, Windsurf, GitHub Copilot, Gemini CLI, and others all support it. Think of it as a package registry, but specifically for AI assistant plugins.

Each skill has a one-line install command, a README explaining what it does, and weekly install counts broken down by platform. The site also runs security audits through [Socket](https://socket.dev), [Snyk](https://snyk.io), and Vercel's [Agent Trust Hub](https://vercel.com/trust-hub), so you can check if a skill is safe before installing it.

The ecosystem is growing fast. New skills show up daily, and the install numbers shift week to week as people try new tools.

## Dataset statistics

| Metric | Value |
|--------|------:|
| Total skills | 133,149 |
| With README | 132,763 (99.7%) |
| With install data | 129,842 (97.5%) |
| With GitHub stars | 111,503 (83.7%) |
| Unique publishers | 8,808 |
| Unique repos | 13,460 |
| Total weekly installs | 27,712,591 |
| Parquet size | 334.2 MB |

### Top publishers

| # | Publisher | Skills |
|--:|----------|-------:|
| 1 | membranedev | 2,979 |
| 2 | jeremylongshore | 1,927 |
| 3 | sickn33 | 1,396 |
| 4 | aaaaqwq | 1,037 |
| 5 | composiohq | 970 |
| 6 | mukul975 | 728 |
| 7 | davila7 | 713 |
| 8 | microsoft | 590 |
| 9 | sundial-org | 588 |
| 10 | syncfusion | 573 |
| 11 | teachingai | 539 |
| 12 | dykyi-roman | 520 |
| 13 | team-telnyx | 501 |
| 14 | yonatangross | 495 |
| 15 | neversight | 478 |

### Most installed skills

| # | Skill | Publisher | Weekly Installs | Stars |
|--:|-------|-----------|----------------:|------:|
| 1 | find-skills | vercel-labs | 950,600 | 13,500 |
| 2 | vercel-react-best-practices | vercel-labs | 300,700 | 24,800 |
| 3 | frontend-design | anthropics | 272,500 | 113,900 |
| 4 | web-design-guidelines | vercel-labs | 241,700 | 24,800 |
| 5 | remotion-best-practices | remotion-dev | 224,600 | 2,700 |
| 6 | agent-browser | vercel-labs | 171,400 | 28,400 |
| 7 | microsoft-foundry | microsoft | 156,500 | 605 |
| 8 | azure-prepare | microsoft | 156,100 | 605 |
| 9 | entra-app-registration | microsoft | 156,000 | 605 |
| 10 | azure-hosted-copilot-sdk | microsoft | 156,000 | 605 |
| 11 | azure-aigateway | microsoft | 156,000 | 605 |
| 12 | azure-deploy | microsoft | 156,000 | 605 |
| 13 | azure-messaging | microsoft | 156,000 | 605 |
| 14 | appinsights-instrumentation | microsoft | 155,900 | 605 |
| 15 | azure-validate | microsoft | 155,900 | 605 |

### Installs by platform

| # | Platform | Total Installs |
|--:|----------|---------------:|
| 1 | github-copilot | 23,402,803 |
| 2 | codex | 18,944,869 |
| 3 | opencode | 18,668,856 |
| 4 | gemini-cli | 18,375,746 |
| 5 | cursor | 14,148,382 |
| 6 | amp | 9,371,309 |
| 7 | kimi-cli | 8,386,214 |
| 8 | claude-code | 3,907,190 |
| 9 | cline | 2,495,626 |
| 10 | antigravity | 788,334 |

```
github-copilot  █████████████████████████  23,402,803
codex           ████████████████████░░░░░  18,944,869
opencode        ███████████████████░░░░░░  18,668,856
gemini-cli      ███████████████████░░░░░░  18,375,746
cursor          ███████████████░░░░░░░░░░  14,148,382
amp             ██████████░░░░░░░░░░░░░░░   9,371,309
kimi-cli        ████████░░░░░░░░░░░░░░░░░   8,386,214
claude-code     ████░░░░░░░░░░░░░░░░░░░░░   3,907,190
cline           ██░░░░░░░░░░░░░░░░░░░░░░░   2,495,626
antigravity     █░░░░░░░░░░░░░░░░░░░░░░░░     788,334
```

### Security audit coverage

| Audit Provider | Skills Covered | Coverage |
|----------------|---------------:|---------:|
| Agent Trust Hub | 128,592 | 96.6% |
| Socket.dev | 127,793 | 96.0% |
| Snyk | 127,918 | 96.1% |

### Install distribution

```
0       █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   3,307
1-10    ██████████████████████████████  70,639
11-100  ███████████████████░░░░░░░░░░░  46,951
101-1K  ████░░░░░░░░░░░░░░░░░░░░░░░░░░  10,173
1K-10K  █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   1,670
10K+    █░░░░░░░░░░░░░░░░░░░░░░░░░░░░░     409
```

## How to use

### DuckDB (no download needed)

```sql
-- Most installed skills
SELECT skill_id, name, weekly_installs, github_stars, owner
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
WHERE weekly_installs > 0
ORDER BY weekly_installs DESC
LIMIT 20;
```

```sql
-- Biggest publishers
SELECT owner, COUNT(*) AS skills, SUM(weekly_installs) AS total_installs
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
GROUP BY owner
ORDER BY skills DESC
LIMIT 20;
```

```sql
-- Most popular skills on Cursor specifically
SELECT skill_id, name, installs_cursor, weekly_installs
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
WHERE installs_cursor > 0
ORDER BY installs_cursor DESC
LIMIT 20;
```

```sql
-- Compare installs across platforms for top skills
SELECT skill_id, name,
       installs_github_copilot AS copilot,
       installs_cursor AS cursor,
       installs_claude_code AS claude,
       installs_codex AS codex
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
WHERE weekly_installs > 1000
ORDER BY weekly_installs DESC
LIMIT 20;
```

```sql
-- Skills with security audit results
SELECT skill_id, name, audit_trust_hub, audit_socket, audit_snyk
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
WHERE audit_trust_hub IS NOT NULL
   OR audit_socket IS NOT NULL
   OR audit_snyk IS NOT NULL
ORDER BY weekly_installs DESC
LIMIT 20;
```

```sql
-- Audit coverage summary
SELECT COUNT(*) AS total,
       COUNT(audit_trust_hub) AS trust_hub,
       COUNT(audit_socket) AS socket,
       COUNT(audit_snyk) AS snyk
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet');
```

```sql
-- Search READMEs
SELECT skill_id, name, weekly_installs
FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
WHERE lower(readme_md) LIKE '%kubernetes%'
ORDER BY weekly_installs DESC
LIMIT 10;
```

### Python (datasets library)

```python
from datasets import load_dataset

ds = load_dataset("open-index/open-skills", split="train")
print(f"{len(ds):,} skills")

# or stream it
ds = load_dataset("open-index/open-skills", split="train", streaming=True)
for skill in ds:
    print(skill["skill_id"], skill["weekly_installs"])
```

### Download the file

```bash
huggingface-cli download open-index/open-skills \
    data/skills.parquet \
    --repo-type dataset --local-dir ./open-skills/
```

Or with Python:

```python
from huggingface_hub import snapshot_download

snapshot_download("open-index/open-skills", repo_type="dataset", local_dir="./open-skills/")
```

### pandas + DuckDB

```python
import duckdb

df = duckdb.sql("""
    SELECT skill_id, name, owner, weekly_installs, github_stars
    FROM read_parquet('hf://datasets/open-index/open-skills/data/skills.parquet')
    WHERE weekly_installs > 100
    ORDER BY weekly_installs DESC
""").df()

print(df.head(20))
```

# Dataset card

## Structure

One Parquet file, one row per skill. Every column is a string or an int64, except `fetched_at`, which is a timestamp. Platform install counts are flattened into individual columns, so you can filter and sort by platform directly without parsing JSON.

Here is what a row looks like:

```json
{
  "skill_id": "vercel-labs/skills/web-search",
  "owner": "vercel-labs",
  "repo": "skills",
  "name": "Web Search",
  "install_cmd": "npx @vercel-labs/skills/web-search",
  "summary_md": "Search the web.",
  "readme_md": "## Web Search\n\nLets your AI assistant search the web...",
  "weekly_installs": 15000,
  "github_repo": "vercel-labs/skills",
  "github_stars": 2500,
  "first_seen": "March 2025",
  "audit_trust_hub": "Verified",
  "audit_socket": "No issues",
  "audit_snyk": "No issues",
  "installs_github_copilot": 5000,
  "installs_codex": 3000,
  "installs_cursor": 2000,
  "installs_claude_code": 1500,
  "installs_gemini_cli": 1000,
  "installs_opencode": 800,
  "installs_amp": 600,
  "installs_kimi_cli": 400,
  "installs_cline": 300,
  "installs_antigravity": 100,
  "url": "https://skills.sh/vercel-labs/skills/web-search",
  "fetched_at": "2026-04-10T08:30:00Z"
}
```

### Fields

| Column | Type | What it is |
|--------|------|------------|
| `skill_id` | string | `owner/repo/skill` |
| `owner` | string | GitHub owner or org |
| `repo` | string | GitHub repo name |
| `name` | string | Display name |
| `install_cmd` | string | The `npx` install command |
| `summary_md` | string | Short summary in markdown |
| `readme_md` | string | Full README in markdown |
| `weekly_installs` | int64 | Installs this week (all platforms) |
| `github_repo` | string | GitHub repo path |
| `github_stars` | int64 | Star count at time of collection |
| `first_seen` | string | When it showed up on skills.sh |
| `audit_trust_hub` | string | Agent Trust Hub result |
| `audit_socket` | string | Socket.dev result |
| `audit_snyk` | string | Snyk result |
| `installs_github_copilot` | int64 | GitHub Copilot installs |
| `installs_codex` | int64 | Codex installs |
| `installs_opencode` | int64 | OpenCode installs |
| `installs_gemini_cli` | int64 | Gemini CLI installs |
| `installs_cursor` | int64 | Cursor installs |
| `installs_amp` | int64 | Amp installs |
| `installs_kimi_cli` | int64 | Kimi CLI installs |
| `installs_claude_code` | int64 | Claude Code installs |
| `installs_cline` | int64 | Cline installs |
| `installs_antigravity` | int64 | Antigravity installs |
| `url` | string | skills.sh page URL |
| `fetched_at` | timestamp | When we fetched it |

Single `train` split with all skills.

### Privacy

Only public data: skill names, publisher GitHub usernames, repo names, and README content that is already on GitHub. No emails, no private info. Everything in this dataset is visible on skills.sh without logging in.

## Limitations

- `summary_md` and `readme_md` are converted from the HTML on skills.sh pages. Some formatting may differ from the original GitHub source.
- Install counts and star counts are a snapshot from `fetched_at`. These numbers change daily on skills.sh, so treat them as approximate.
- Platform install columns (`installs_cursor`, etc.) are zero when the skill has no data for that platform, not null.
- About 66 skills with CJK characters or special symbols in their URL could not be fetched due to an encoding issue on the skills.sh side.
- This is metadata only. The actual skill source code lives on GitHub and is not included here.

## License and contact

Released under **ODC-By v1.0**. Original content belongs to its respective authors.
Skills.sh is run by [Vercel Labs](https://github.com/vercel-labs). This dataset is not affiliated with or endorsed by Vercel.

Found a problem? Have a question? Open a thread on the [Community tab](https://huggingface.co/datasets/open-index/open-skills/discussions).

*Exported 2026-04-10*
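
A note on the zero-not-null behavior described under Limitations: the sketch below is illustrative and not part of the dataset. It computes each platform's share of a row's per-platform installs and skips zero columns, treating them as "no data". The helper name `platform_shares` and the modified row `no_data_row` are hypothetical; the sample values are copied from the example row under Structure. Shares are taken relative to the sum of the platform columns rather than `weekly_installs`, because the example row's platform counts do not add up to its `weekly_installs`.

```python
# Hypothetical helper, not part of the dataset or of any library.
PLATFORM_PREFIX = "installs_"


def platform_shares(row):
    """Return each platform's share of the row's summed per-platform installs.

    Columns equal to zero are skipped: per the Limitations section, zero
    means the skill has no data for that platform.
    """
    counts = {
        key[len(PLATFORM_PREFIX):]: value
        for key, value in row.items()
        if key.startswith(PLATFORM_PREFIX) and value > 0
    }
    total = sum(counts.values())
    if total == 0:
        return {}
    return {platform: count / total for platform, count in counts.items()}


# Subset of the example row shown under Structure.
sample_row = {
    "skill_id": "vercel-labs/skills/web-search",
    "weekly_installs": 15000,
    "installs_github_copilot": 5000,
    "installs_codex": 3000,
    "installs_cursor": 2000,
    "installs_claude_code": 1500,
    "installs_gemini_cli": 1000,
    "installs_opencode": 800,
    "installs_amp": 600,
    "installs_kimi_cli": 400,
    "installs_cline": 300,
    "installs_antigravity": 100,
}

# Hypothetical variant: the same row with no Antigravity data.
no_data_row = {**sample_row, "installs_antigravity": 0}

shares = platform_shares(sample_row)
for platform, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{platform:<16}{share:6.1%}")

# The zero column is dropped instead of being reported as a 0% share.
print("antigravity" in platform_shares(no_data_row))
```

The same filter can be expressed in SQL with `WHERE installs_cursor > 0`, as in the Cursor query under How to use.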
Provider:
open-index