adamrida/tracer-banking77
Source: Hugging Face. Updated 2026-03-29; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/adamrida/tracer-banking77
Description:
---
dataset_info:
  features:
  - name: input
    dtype: string
  - name: teacher
    dtype: string
  splits:
  - name: train
    num_examples: 10003
  - name: test
    num_examples: 3080
license: mit
task_categories:
- text-classification
language:
- en
tags:
- tracer
- banking77
- intent-classification
- llm-routing
- embeddings
pretty_name: TRACER Banking77 Traces
size_categories:
- 10K<n<100K
---
# TRACER Banking77 Traces
Pre-computed traces and BGE-M3 embeddings for the [Banking77](https://huggingface.co/datasets/PolyAI/banking77) intent classification dataset, formatted for use with [TRACER](https://github.com/adrida/tracer).
## Files
| File | Size | Description |
|------|------|-------------|
| `banking77_traces.jsonl` | 2.1 MB | 10,003 traces. Each line: `{"input": "...", "teacher": "label"}` |
| `banking77_embeddings.npy` | 39 MB | `(10003, 1024)` float32 -- BGE-M3 embeddings for train traces |
| `banking77_test_embeddings.npy` | 12 MB | `(3080, 1024)` float32 -- BGE-M3 embeddings for test set |
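
To inspect the trace file before handing it to TRACER, a minimal sketch along these lines may help. It relies only on the line format shown in the table (`input` and `teacher` keys); the helper names `load_traces` and `label_counts` are hypothetical and not part of TRACER or this dataset.

```python
import json
from collections import Counter


def load_traces(path):
    """Read a JSONL trace file into a list of {"input", "teacher"} dicts.

    Hypothetical helper: assumes one JSON object per line, as described
    in the Files table.
    """
    traces = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            missing = {"input", "teacher"} - set(record)
            if missing:
                raise ValueError(f"line {line_no}: missing keys {sorted(missing)}")
            traces.append(record)
    return traces


def label_counts(traces):
    """Count how many traces carry each teacher label."""
    return Counter(t["teacher"] for t in traces)
```

Calling `load_traces` on the downloaded `banking77_traces.jsonl` should yield 10,003 records if the file matches the table. Whether row `i` of `banking77_embeddings.npy` corresponds to line `i` of the JSONL file is an assumption worth checking against the TRACER documentation.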
## Usage with TRACER
```python
from huggingface_hub import hf_hub_download
import numpy as np
import tracer

# Download the trace file and the train embeddings; hf_hub_download returns a local path.
traces = hf_hub_download("adamrida/tracer-banking77", "banking77_traces.jsonl", repo_type="dataset")
X = np.load(hf_hub_download("adamrida/tracer-banking77", "banking77_embeddings.npy", repo_type="dataset"))

# Fit TRACER on the traces using the pre-computed embeddings.
result = tracer.fit(traces, embeddings=X)
print(f"Coverage: {result.manifest.coverage_cal:.1%}")
```
## Embedding model
All embeddings were computed with [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) (1024-dim, L2-normalized).
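
Because the embeddings are described as L2-normalized, a dot product between two rows equals their cosine similarity. The sketch below shows one way to verify the normalization and run a brute-force nearest-neighbor lookup with NumPy; `is_unit_norm` and `top_k_neighbors` are hypothetical helper names, and the tolerance is an arbitrary choice.

```python
import numpy as np


def is_unit_norm(X, atol=1e-3):
    """Return True if every row of X has an L2 norm close to 1."""
    norms = np.linalg.norm(X, axis=1)
    return bool(np.allclose(norms, 1.0, atol=atol))


def top_k_neighbors(query, X, k=5):
    """Return indices and scores of the k rows of X most similar to query.

    For unit-norm rows and a unit-norm query, the dot product is the
    cosine similarity.
    """
    scores = X @ query
    k = min(k, len(scores))
    idx = np.argsort(-scores)[:k]  # highest similarity first
    return idx, scores[idx]
```

For example, with `X` loaded from `banking77_embeddings.npy`, `top_k_neighbors(X[0], X, k=5)` would return the first trace itself (score close to 1) followed by its closest neighbors.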
## Source
Banking77 is a 77-class intent detection dataset from [PolyAI](https://github.com/PolyAI-LDN/task-specific-datasets). Teacher labels were generated by GPT-5.
## License
MIT
Provider:
adamrida



