
BenchGeckoAI/ai-model-benchmarks-2026

Source: Hugging Face | Updated: 2026-04-21 | Indexed: 2026-04-26
Download link:
https://hf-mirror.com/datasets/BenchGeckoAI/ai-model-benchmarks-2026
Resource description:
---
license: cc-by-4.0
task_categories:
- tabular-classification
- text-classification
language:
- en
tags:
- ai
- benchmarks
- llm
- pricing
- machine-learning
- gpt
- claude
- gemini
- deepseek
- mcp
pretty_name: AI Model Benchmarks & Pricing 2026
size_categories:
- n<1K
---

# AI Model Benchmarks & Pricing Dataset 2026

A comprehensive survey of large language model performance and economics, maintained by [BenchGecko](https://benchgecko.ai).

## What's Inside

| File | Records | Description |
|------|---------|-------------|
| `data/models.csv` | 20 | Top AI models with benchmark scores and API pricing |
| `data/providers.csv` | 20 | AI model providers with metadata |
| `data/benchmarks.csv` | 40 | Benchmark suites with methodology |
| `data/mcp_servers.csv` | 20 | Model Context Protocol servers |

This is a sample from the full dataset. The complete dataset covers thousands of models, hundreds of providers, and over a hundred benchmarks, updated every two hours at [benchgecko.ai](https://benchgecko.ai).

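The four files listed under "What's Inside" have different schemas, so it may be convenient to load them one at a time. The following is a minimal sketch, not part of the original card: `SAMPLE_FILES`, `data_file_for`, and `load_table` are hypothetical helper names, and it assumes that a single CSV passed through the `data_files` argument of `load_dataset` ends up in the default `train` split.

```python
# Repo-relative paths and record counts, copied from the "What's Inside" table.
REPO_ID = "BenchGeckoAI/ai-model-benchmarks-2026"
SAMPLE_FILES = {
    "models": ("data/models.csv", 20),
    "providers": ("data/providers.csv", 20),
    "benchmarks": ("data/benchmarks.csv", 40),
    "mcp_servers": ("data/mcp_servers.csv", 20),
}


def data_file_for(table: str) -> str:
    """Return the repo-relative CSV path for a table name (hypothetical helper)."""
    if table not in SAMPLE_FILES:
        raise ValueError(
            f"unknown table {table!r}; expected one of {sorted(SAMPLE_FILES)}"
        )
    return SAMPLE_FILES[table][0]


def load_table(table: str):
    """Load one CSV from the Hub. Needs network access and `pip install datasets`."""
    # Imported lazily so the helpers above stay usable without the library.
    from datasets import load_dataset

    # Assumption: a single file given via data_files lands in the "train" split.
    return load_dataset(REPO_ID, data_files=data_file_for(table))["train"]
```

Calling `load_table("providers")` and comparing `len(...)` of the result against `SAMPLE_FILES["providers"][1]` would be one way to check that the download matches the record count stated in the table.
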
## Fields (models.csv)

| Column | Type | Description |
|--------|------|-------------|
| `name` | String | Model display name |
| `provider` | String | Organization that created the model |
| `input_price` | Float | USD per 1M input tokens |
| `output_price` | Float | USD per 1M output tokens |
| `context_window` | Integer | Maximum context length in tokens |
| `average_score` | Float | Weighted average across all benchmarks (0-100) |
| `mmlu_score` | Float | MMLU benchmark score |
| `humaneval_score` | Float | HumanEval coding score |
| `gpqa_score` | Float | GPQA Diamond science score |
| `math_score` | Float | MATH competition score |
| `open_source` | Boolean | Whether weights are publicly available |
| `release_date` | Date | Public release date |

## Quick Start

```python
from datasets import load_dataset

dataset = load_dataset("BenchGeckoAI/ai-model-benchmarks-2026")
models = dataset["train"]

# Find the best open-source model
open_models = [m for m in models if m["open_source"]]
best = max(open_models, key=lambda m: m["average_score"])
print(f"Best open model: {best['name']} ({best['average_score']})")
```

## Use Cases

- **Model Selection**: Compare benchmark scores across evaluation types before deploying
- **Cost Analysis**: Find the best price-to-performance ratio across providers
- **Market Research**: Track the AI model landscape and provider ecosystem
- **Academic Research**: Study capability trajectories and scaling laws

## Full Dataset

This sample covers 20 models.
The full live dataset is available through:

- **Web**: [BenchGecko Model Rankings](https://benchgecko.ai/models)
- **API**: [BenchGecko API Documentation](https://benchgecko.ai/api-docs)
- **Pricing**: [Cross-Provider Pricing Comparison](https://benchgecko.ai/pricing)
- **Compare**: [Side-by-Side Model Comparison](https://benchgecko.ai/compare)
- **Economy**: [AI Economy Dashboard](https://benchgecko.ai/economy)
- **Compute**: [AI Compute Supply Chain](https://benchgecko.ai/compute)
- **Mindshare**: [Developer Mindshare Arena](https://benchgecko.ai/mindshare)

## Methodology

Benchmark scores are sourced from the original model technical reports and cross-verified using open-source evaluation frameworks (EleutherAI lm-evaluation-harness, BigCode HumanEval+). Pricing is collected from official API documentation and updated within 48 hours of changes.

## Citation

```bibtex
@dataset{benchgecko2026,
  author = {BenchGecko},
  title = {AI Model Benchmarks and Pricing Dataset 2026},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/datasets/BenchGeckoAI/ai-model-benchmarks-2026}
}
```

## License

CC BY 4.0. Attribution: BenchGecko (https://benchgecko.ai)

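As an illustration of the Cost Analysis use case, the sketch below ranks models by benchmark score per blended dollar, using only the `name`, `input_price`, `output_price`, and `average_score` columns documented for `models.csv`. The blending rule (75% input tokens, 25% output tokens), the function names, and the sample rows are all assumptions made for this example. The rows are placeholders with made-up numbers and do not come from the dataset.

```python
from typing import Iterable, List, Mapping, Tuple


def blended_price(row: Mapping, input_share: float = 0.75) -> float:
    """USD per 1M tokens for a workload where `input_share` of tokens are input.

    The 0.75 default is an illustrative assumption, not a BenchGecko convention.
    """
    return input_share * row["input_price"] + (1.0 - input_share) * row["output_price"]


def rank_by_value(
    rows: Iterable[Mapping], input_share: float = 0.75
) -> List[Tuple[str, float]]:
    """Return (name, score per blended USD) pairs, best value first.

    Rows with a missing or non-positive price are skipped to avoid
    dividing by zero.
    """
    ranked = []
    for row in rows:
        if row.get("input_price") is None or row.get("output_price") is None:
            continue
        price = blended_price(row, input_share)
        if price <= 0:
            continue
        ranked.append((row["name"], row["average_score"] / price))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Placeholder rows with made-up numbers, only to show the expected shape.
sample_rows = [
    {"name": "example-a", "input_price": 2.0, "output_price": 10.0, "average_score": 80.0},
    {"name": "example-b", "input_price": 0.5, "output_price": 1.5, "average_score": 60.0},
    {"name": "example-free", "input_price": 0.0, "output_price": 0.0, "average_score": 50.0},
]

for name, value in rank_by_value(sample_rows):
    print(f"{name}: {value:.1f} score points per blended USD")
```

The same function should accept the rows from `dataset["train"]` in the Quick Start, since iterating a loaded split yields one dictionary per row. A different `input_share` changes the ranking whenever models differ in their input-to-output price ratio, so the value is worth matching to the intended workload.
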
Provider:
BenchGeckoAI