BenchGeckoAI/ai-model-benchmarks-2026
Hugging Face | Updated 2026-04-21 | Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/BenchGeckoAI/ai-model-benchmarks-2026
Resource summary:
---
license: cc-by-4.0
task_categories:
- tabular-classification
- text-classification
language:
- en
tags:
- ai
- benchmarks
- llm
- pricing
- machine-learning
- gpt
- claude
- gemini
- deepseek
- mcp
pretty_name: AI Model Benchmarks & Pricing 2026
size_categories:
- n<1K
---
# AI Model Benchmarks & Pricing Dataset 2026
A comprehensive survey of large language model performance and economics, maintained by [BenchGecko](https://benchgecko.ai).
## What's Inside
| File | Records | Description |
|------|---------|-------------|
| `data/models.csv` | 20 | Top AI models with benchmark scores and API pricing |
| `data/providers.csv` | 20 | AI model providers with metadata |
| `data/benchmarks.csv` | 40 | Benchmark suites with methodology |
| `data/mcp_servers.csv` | 20 | Model Context Protocol servers |
This is a sample from the full dataset. The complete dataset covers thousands of models, hundreds of providers, and over a hundred benchmarks, updated every two hours at [benchgecko.ai](https://benchgecko.ai).
## Fields (models.csv)
| Column | Type | Description |
|--------|------|-------------|
| `name` | String | Model display name |
| `provider` | String | Organization that created the model |
| `input_price` | Float | USD per 1M input tokens |
| `output_price` | Float | USD per 1M output tokens |
| `context_window` | Integer | Maximum context length in tokens |
| `average_score` | Float | Weighted average across all benchmarks (0-100) |
| `mmlu_score` | Float | MMLU benchmark score |
| `humaneval_score` | Float | HumanEval coding score |
| `gpqa_score` | Float | GPQA Diamond science score |
| `math_score` | Float | MATH competition score |
| `open_source` | Boolean | Whether weights are publicly available |
| `release_date` | Date | Public release date |
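The column types above can be sketched as a typed record for anyone reading `models.csv` with the standard library instead of `datasets`. This is a minimal illustration, not the maintainers' loader: `ModelRecord` and `parse_row` are hypothetical names, the four per-benchmark score columns are left out for brevity, and the boolean and date encodings (`true`/`false` in any letter case, ISO 8601 dates) are assumptions that should be checked against the actual file.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass
class ModelRecord:
    name: str
    provider: str
    input_price: float      # USD per 1M input tokens
    output_price: float     # USD per 1M output tokens
    context_window: int     # maximum context length in tokens
    average_score: float    # 0-100
    open_source: bool
    release_date: date


def parse_row(row: dict) -> ModelRecord:
    """Convert one csv.DictReader row (all strings) into typed fields."""
    return ModelRecord(
        name=row["name"],
        provider=row["provider"],
        input_price=float(row["input_price"]),
        output_price=float(row["output_price"]),
        context_window=int(row["context_window"]),
        average_score=float(row["average_score"]),
        # Assumption: booleans are stored as "true"/"false" in any letter case.
        open_source=row["open_source"].strip().lower() == "true",
        # Assumption: dates are ISO 8601 (YYYY-MM-DD).
        release_date=date.fromisoformat(row["release_date"]),
    )


# Placeholder row with made-up values, not taken from the dataset.
SAMPLE_CSV = """name,provider,input_price,output_price,context_window,average_score,open_source,release_date
example-model,ExampleOrg,1.5,6.0,128000,71.2,True,2026-01-15
"""

records = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(records[0])
```

To read the real file, pass an open handle on `data/models.csv` to `csv.DictReader` in place of the in-memory sample.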
## Quick Start
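```python
from datasets import load_dataset

# The repository holds several CSV files with different columns, so select
# models.csv explicitly instead of letting the loader merge them all.
dataset = load_dataset(
    "BenchGeckoAI/ai-model-benchmarks-2026",
    data_files="data/models.csv",
)
models = dataset["train"]

# Find the best open-source model by weighted average score
open_models = [m for m in models if m["open_source"]]
best = max(open_models, key=lambda m: m["average_score"])
print(f"Best open model: {best['name']} ({best['average_score']})")
```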
## Use Cases
- **Model Selection**: Compare benchmark scores across evaluation types before deploying
- **Cost Analysis**: Find the best price-to-performance ratio across providers
- **Market Research**: Track the AI model landscape and provider ecosystem
- **Academic Research**: Study capability trajectories and scaling laws
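As a rough sketch of the cost-analysis use case, the snippet below ranks models by benchmark points per blended dollar. The blending rule is an assumption made for illustration, not part of the dataset: `blended_price`, `rank_by_value`, and the default 75% input-token share are hypothetical, and the sample rows are invented placeholders rather than real prices or scores.

```python
def blended_price(input_price, output_price, input_share=0.75):
    """USD per 1M tokens for a workload with the given share of input tokens."""
    return input_share * input_price + (1.0 - input_share) * output_price


def rank_by_value(models, input_share=0.75):
    """Return (name, score per blended dollar) pairs, best value first."""
    ranked = []
    for m in models:
        price = blended_price(m["input_price"], m["output_price"], input_share)
        if price <= 0:
            continue  # skip free or unpriced entries to avoid dividing by zero
        ranked.append((m["name"], m["average_score"] / price))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Made-up rows for illustration only; real values come from data/models.csv.
sample = [
    {"name": "model-a", "input_price": 2.0, "output_price": 10.0, "average_score": 80.0},
    {"name": "model-b", "input_price": 0.5, "output_price": 1.5, "average_score": 60.0},
]
for name, value in rank_by_value(sample):
    print(f"{name}: {value:.1f} points per blended USD")
```

The rows loaded in Quick Start can be passed to `rank_by_value` directly, since each row is a dict keyed by column name. The input share is worth tuning to the workload: a summarization pipeline is input-heavy, while long-form generation shifts the weight toward `output_price`.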
## Full Dataset
This sample covers 20 models. The full live dataset is available through:
- **Web**: [BenchGecko Model Rankings](https://benchgecko.ai/models)
- **API**: [BenchGecko API Documentation](https://benchgecko.ai/api-docs)
- **Pricing**: [Cross-Provider Pricing Comparison](https://benchgecko.ai/pricing)
- **Compare**: [Side-by-Side Model Comparison](https://benchgecko.ai/compare)
- **Economy**: [AI Economy Dashboard](https://benchgecko.ai/economy)
- **Compute**: [AI Compute Supply Chain](https://benchgecko.ai/compute)
- **Mindshare**: [Developer Mindshare Arena](https://benchgecko.ai/mindshare)
## Methodology
Benchmark scores are sourced from the original model technical reports and cross-verified using open-source evaluation frameworks (EleutherAI lm-evaluation-harness, BigCode HumanEval+). Pricing is collected from official API documentation and updated within 48 hours of any change.
## Citation
```bibtex
@dataset{benchgecko2026,
author = {BenchGecko},
title = {AI Model Benchmarks and Pricing Dataset 2026},
year = {2026},
publisher = {Hugging Face},
url = {https://huggingface.co/datasets/BenchGeckoAI/ai-model-benchmarks-2026}
}
```
## License
CC BY 4.0. Attribution: BenchGecko (https://benchgecko.ai)
Provider:
BenchGeckoAI



