i4ds/swiss-ner
Source: Hugging Face. Updated 2025-07-18. Indexed 2025-10-18.
Download link:
https://hf-mirror.com/datasets/i4ds/swiss-ner
Description:
SwissNER-Spoken is a curated collection of 173 short, spoken-style German sentences designed to evaluate Named-Entity Recognition (NER) and Automatic Speech Recognition (ASR) systems on Swiss-specific proper nouns. Each sentence contains up to three named entities that appear in everyday Swiss contexts: cities, villages, cantons, companies, mountains, lakes, rivers, landmarks, organizations, and well-known personalities. The dataset covers all 26 Swiss cantons and tags each entity with its language so that language-aware recognition can be tested. The schema is compact, consisting of four CSV columns: text, named_entities, entity_types, and entity_languages. The sentences are intentionally short, natural, and pronunciation-friendly, which makes the corpus well suited to measuring how speech or text models handle Swiss proper nouns in real-world utterances.
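As a rough illustration of how the four-column schema could be used for ASR evaluation, the sketch below parses rows and computes a simple entity recall, i.e. the share of reference entities that appear verbatim in a transcript. The sample rows are invented for this example and are not taken from the dataset. The `;` separator inside multi-valued cells, the helper names, and the case-insensitive substring match are all assumptions; check the actual file for how multiple entities are encoded.

```python
import csv
import io

# Hypothetical rows, NOT taken from the dataset. The ";" separator inside
# multi-valued cells is an assumption about the file format.
SAMPLE_CSV = """text,named_entities,entity_types,entity_languages
Wir fahren morgen von Zürich nach Zermatt.,Zürich;Zermatt,city;village,de;de
Das Matterhorn sieht man heute gut.,Matterhorn,mountain,de
"""


def split_cell(cell, sep=";"):
    """Split a multi-valued cell into a list of trimmed, non-empty values."""
    return [part.strip() for part in cell.split(sep) if part.strip()]


def load_rows(handle):
    """Read the four documented columns into a list of dictionaries."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append({
            "text": record["text"],
            "entities": split_cell(record["named_entities"]),
            "types": split_cell(record["entity_types"]),
            "languages": split_cell(record["entity_languages"]),
        })
    return rows


def entity_recall(rows, hypotheses):
    """Fraction of reference entities found (case-insensitively) in the
    corresponding hypothesis transcript."""
    total = 0
    hits = 0
    for row, hyp in zip(rows, hypotheses):
        for entity in row["entities"]:
            total += 1
            if entity.lower() in hyp.lower():
                hits += 1
    return hits / total if total else 0.0


rows = load_rows(io.StringIO(SAMPLE_CSV))

# Made-up ASR outputs: the second one splits "Matterhorn" into two words.
hypotheses = [
    "wir fahren morgen von zürich nach zermatt",
    "das matter horn sieht man heute gut",
]
recall = entity_recall(rows, hypotheses)
print(f"entity recall: {recall:.2f}")
```

Substring matching is deliberately strict here, so a transcript that splits or misspells a proper noun counts as a miss. A real evaluation would likely add text normalization or fuzzy matching.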
Provider:
i4ds



