i4ds/swiss-ner

Source: Hugging Face. Updated: 2025-07-18. Indexed: 2025-10-18.
Download link:
https://hf-mirror.com/datasets/i4ds/swiss-ner
Description:
SwissNER-Spoken is a curated collection of 173 short, spoken-style German sentences designed to evaluate Named-Entity Recognition (NER) and Automatic Speech Recognition (ASR) systems on Swiss-specific proper nouns. Each sentence contains up to three named entities from everyday Swiss contexts: cities, villages, cantons, companies, mountains, lakes, rivers, landmarks, organizations, and well-known personalities. The dataset covers all 26 Swiss cantons and tags each entity with its language so that language-aware recognition can be tested. The schema is compact, with four CSV columns: text, named_entities, entity_types, and entity_languages. The sentences are intentionally short, natural, and pronunciation-friendly, which suits the corpus to measuring how well speech or text models handle Swiss proper nouns in real-world utterances.
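
The four-column layout described above can be illustrated with a minimal Python sketch. The description does not say how several entities are packed into one cell, so the "|" separator, the label values ("city", "mountain", "de", "fr"), and the two sample rows below are all assumptions made up for illustration; they are not taken from the dataset. The helper names load_rows and entity_recall are likewise hypothetical. The sketch parses the rows and computes a simple entity recall: the share of reference entities that appear verbatim (case-insensitively) in a model's output text.

```python
import csv
import io

# Hypothetical sample in the four-column layout described above.
# The rows, the label values, and the "|" separator inside cells are
# illustrative assumptions, not content from the actual dataset.
SAMPLE_CSV = """text,named_entities,entity_types,entity_languages
Wir fahren morgen von Zürich nach Genève.,Zürich|Genève,city|city,de|fr
Das Matterhorn sieht man von Zermatt aus.,Matterhorn|Zermatt,mountain|village,de|de
"""


def split_cell(cell, sep="|"):
    """Split a multi-valued cell on the assumed separator."""
    return [part.strip() for part in cell.split(sep) if part.strip()]


def load_rows(csv_text, sep="|"):
    """Parse CSV text into dicts holding (entity, type, language) triples."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        entities = split_cell(record["named_entities"], sep)
        types = split_cell(record["entity_types"], sep)
        languages = split_cell(record["entity_languages"], sep)
        # The three annotation columns must line up one-to-one.
        if not (len(entities) == len(types) == len(languages)):
            raise ValueError(f"misaligned annotations in: {record['text']!r}")
        rows.append({
            "text": record["text"],
            "entities": list(zip(entities, types, languages)),
        })
    return rows


def entity_recall(rows, hypotheses):
    """Fraction of reference entities found verbatim in the hypotheses.

    Matching is a case-insensitive substring test, which is a deliberate
    simplification; a real evaluation might normalize spelling variants.
    """
    total = 0
    hits = 0
    for row, hypothesis in zip(rows, hypotheses):
        folded = hypothesis.casefold()
        for name, _entity_type, _language in row["entities"]:
            total += 1
            if name.casefold() in folded:
                hits += 1
    return hits / total if total else 0.0


rows = load_rows(SAMPLE_CSV)
# Hypothetical ASR outputs: the first one renders "Genève" as "Genf".
hypotheses = [
    "wir fahren morgen von zürich nach genf",
    "das matterhorn sieht man von zermatt aus",
]
print(entity_recall(rows, hypotheses))
```

If the real file uses a different in-cell separator, only the sep argument should need to change. Strict substring matching counts an exonym such as "Genf" for "Genève" as a miss, which may or may not be the behavior an evaluation wants.
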
Provider:
i4ds