mukuls9971/address-benchmark-v1
Source: Hugging Face. Updated 2026-04-20. Indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/mukuls9971/address-benchmark-v1
Description:
---
license: "mit"
pretty_name: "Indian Address Benchmark Dataset v1"
task_categories: ["token-classification"]
task_ids: ["named-entity-recognition"]
language: ["en", "hi"]
tags: ["pii", "ner", "token-classification", "benchmark", "addresses"]
---
# Indian Address Benchmark Dataset v1
A mixed benchmark dataset for tagging Indian addresses, built from synthetic data plus public upstream datasets.
## Repository
- Dataset repo: `mukuls9971/address-benchmark-v1`
- Train split: `26728`
- Validation split: `6158`
- Test split: `1410`
## Files
- `train.jsonl`
- `validation.jsonl`
- `test.jsonl`
- `report.json`
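
As a quick sanity check after downloading the files, the split sizes listed under Repository can be compared against the line counts of the JSONL files. The sketch below is a minimal, standard-library-only example. It assumes the three `.jsonl` files sit in one local directory and that the stated sizes are record counts (one JSON object per line). The helper names `read_jsonl` and `check_split_sizes` are hypothetical and not part of the `pii-model-oss` workflow, and no record fields are assumed because the card does not document the schema.

```python
import json
from pathlib import Path

# Split sizes as stated in the Repository section of this card
# (assumed here to be record counts, one JSON object per line).
EXPECTED_SIZES = {"train": 26728, "validation": 6158, "test": 1410}


def read_jsonl(path):
    """Yield one parsed JSON record per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def check_split_sizes(data_dir, expected=EXPECTED_SIZES):
    """Return {split: (actual, expected)} for every split whose count differs."""
    mismatches = {}
    for split, want in expected.items():
        got = sum(1 for _ in read_jsonl(Path(data_dir) / f"{split}.jsonl"))
        if got != want:
            mismatches[split] = (got, want)
    return mismatches
```

Calling `check_split_sizes("path/to/download")` returns an empty dict when all three files match the stated sizes. A non-empty result names the splits that differ, which may point to a truncated download or a newer dataset revision.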
## Notes
- Generated and published by the `pii-model-oss` workflow.
- Upstream datasets used to assemble benchmark variants retain their own licenses.
## Warnings
- The LinCE train/dev splits could not be fetched from the original host; the CodeMixBench `ner_hineng` test set was used instead as a held-out-only fallback.
Provider:
mukuls9971



