qmeeus/MSNER
Source: Hugging Face. Updated 2024-03-28. Indexed 2024-06-15.
Download link:
https://hf-mirror.com/datasets/qmeeus/MSNER
Resource description:
---
dataset_info:
  features:
  - name: audio_id
    dtype: string
  - name: language
    dtype:
      class_label:
        names:
          '0': en
          '1': de
          '2': fr
          '3': es
          '4': pl
          '5': it
          '6': ro
          '7': hu
          '8': cs
          '9': nl
          '10': fi
          '11': hr
          '12': sk
          '13': sl
          '14': et
          '15': lt
          '16': en_accented
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: sentence
    dtype: string
  - name: unified_entities
    sequence:
      class_label:
        names:
          '0': B-cell_line
          '1': B-character_name
          '2': B-language
          '3': B-disease
          '4': B-event
          '5': B-organization
          '6': B-character
          '7': B-origin
          '8': B-other
          '9': B-dish
          '10': B-relationship
          '11': B-artifact
          '12': B-work_of_art
          '13': B-facility
          '14': B-product
          '15': B-amenity
          '16': B-rating
          '17': B-actor
          '18': B-date
          '19': B-ratings_average
          '20': B-quantity
          '21': B-dna
          '22': B-quote
          '23': B-title
          '24': B-song
          '25': B-genre
          '26': B-cuisine
          '27': B-soundtrack
          '28': B-ordinal_number
          '29': B-protein
          '30': B-collection
          '31': B-money
          '32': B-person
          '33': B-project
          '34': B-group
          '35': B-review
          '36': B-percent
          '37': B-law
          '38': B-director
          '39': B-award
          '40': B-chemical
          '41': B-geopolitical_area
          '42': B-rna
          '43': B-restaurant
          '44': B-location
          '45': B-opinion
          '46': B-cell_type
          '47': B-trailer
          '48': B-cardinal_number
          '49': B-plot
          '50': B-corporation
          '51': B-time
          '52': I-cell_line
          '53': I-character_name
          '54': I-language
          '55': I-disease
          '56': I-event
          '57': I-organization
          '58': I-character
          '59': I-origin
          '60': I-other
          '61': I-dish
          '62': I-relationship
          '63': I-artifact
          '64': I-work_of_art
          '65': I-facility
          '66': I-product
          '67': I-amenity
          '68': I-rating
          '69': I-actor
          '70': I-date
          '71': I-ratings_average
          '72': I-quantity
          '73': I-dna
          '74': I-quote
          '75': I-title
          '76': I-song
          '77': I-genre
          '78': I-cuisine
          '79': I-soundtrack
          '80': I-ordinal_number
          '81': I-protein
          '82': I-collection
          '83': I-money
          '84': I-person
          '85': I-project
          '86': I-group
          '87': I-review
          '88': I-percent
          '89': I-law
          '90': I-director
          '91': I-award
          '92': I-chemical
          '93': I-geopolitical_area
          '94': I-rna
          '95': I-restaurant
          '96': I-location
          '97': I-opinion
          '98': I-cell_type
          '99': I-trailer
          '100': I-cardinal_number
          '101': I-plot
          '102': I-corporation
          '103': I-time
          '104': O
  - name: raw_entities
    sequence:
      class_label:
        names:
          '0': O
          '1': B-cardinal number
          '2': B-date
          '3': I-date
          '4': B-person
          '5': I-person
          '6': B-group
          '7': B-geopolitical area
          '8': I-geopolitical area
          '9': B-law
          '10': I-law
          '11': B-organization
          '12': I-organization
          '13': B-percent
          '14': I-percent
          '15': B-ordinal number
          '16': B-money
          '17': I-money
          '18': B-work of art
          '19': I-work of art
          '20': B-facility
          '21': B-time
          '22': I-cardinal number
          '23': B-location
          '24': B-quantity
          '25': I-quantity
          '26': I-group
          '27': I-location
          '28': B-product
          '29': I-time
          '30': B-event
          '31': I-event
          '32': I-facility
          '33': B-language
          '34': I-product
          '35': I-ordinal number
          '36': I-language
  splits:
  - name: de
    num_bytes: 1131409390.498
    num_examples: 1966
  - name: es
    num_bytes: 1141307288.576
    num_examples: 1512
  - name: fr
    num_bytes: 1058324947.032
    num_examples: 1656
  - name: nl
    num_bytes: 602932929.76
    num_examples: 1120
  download_size: 3332139599
  dataset_size: 3933974555.866
configs:
- config_name: default
  data_files:
  - split: de
    path: data/de-*
  - split: es
    path: data/es-*
  - split: fr
    path: data/fr-*
  - split: nl
    path: data/nl-*
language:
- de
- es
- fr
- nl
tags:
- spoken-ner
- multilingual
- MSNER
- spoken-language-understanding
pretty_name: MSNER
---
# Dataset Card for "spoken-ner"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Provider:
qmeeus
Summary of original information
Dataset overview
Dataset information
- Features:
  - audio_id: string
  - language: class label, one of: en, de, fr, es, pl, it, ro, hu, cs, nl, fi, hr, sk, sl, et, lt, en_accented
  - audio: audio, sampling rate 16000 Hz
  - sentence: string
  - unified_entities: sequence of class labels, with the following entity tags: B-cell_line, B-character_name, B-language, B-disease, B-event, B-organization, B-character, B-origin, B-other, B-dish, B-relationship, B-artifact, B-work_of_art, B-facility, B-product, B-amenity, B-rating, B-actor, B-date, B-ratings_average, B-quantity, B-dna, B-quote, B-title, B-song, B-genre, B-cuisine, B-soundtrack, B-ordinal_number, B-protein, B-collection, B-money, B-person, B-project, B-group, B-review, B-percent, B-law, B-director, B-award, B-chemical, B-geopolitical_area, B-rna, B-restaurant, B-location, B-opinion, B-cell_type, B-trailer, B-cardinal_number, B-plot, B-corporation, B-time, I-cell_line, I-character_name, I-language, I-disease, I-event, I-organization, I-character, I-origin, I-other, I-dish, I-relationship, I-artifact, I-work_of_art, I-facility, I-product, I-amenity, I-rating, I-actor, I-date, I-ratings_average, I-quantity, I-dna, I-quote, I-title, I-song, I-genre, I-cuisine, I-soundtrack, I-ordinal_number, I-protein, I-collection, I-money, I-person, I-project, I-group, I-review, I-percent, I-law, I-director, I-award, I-chemical, I-geopolitical_area, I-rna, I-restaurant, I-location, I-opinion, I-cell_type, I-trailer, I-cardinal_number, I-plot, I-corporation, I-time, O
  - raw_entities: sequence of class labels, with the following entity tags: O, B-cardinal number, B-date, I-date, B-person, I-person, B-group, B-geopolitical area, I-geopolitical area, B-law, I-law, B-organization, I-organization, B-percent, I-percent, B-ordinal number, B-money, I-money, B-work of art, I-work of art, B-facility, B-time, I-cardinal number, B-location, B-quantity, I-quantity, I-group, I-location, B-product, I-time, B-event, I-event, I-facility, B-language, I-product, I-ordinal number, I-language
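Both entity fields use BIO tagging: a `B-` tag opens an entity, `I-` tags continue it, and `O` marks tokens outside any entity. The sketch below shows one way to turn a sequence of label ids into entity spans. It is illustrative only: `RAW_NAMES` is a small subset of the raw_entities mapping listed in the card, `to_unified` and `bio_spans` are hypothetical helper names, and the space-to-underscore normalization is an assumption based on the label names alone, not a documented mapping.

```python
from typing import Dict, List, Tuple

# Illustrative subset of the raw_entities id-to-name mapping from the card.
RAW_NAMES: Dict[int, str] = {
    0: "O",
    4: "B-person",
    5: "I-person",
    7: "B-geopolitical area",
    8: "I-geopolitical area",
}


def to_unified(tag: str) -> str:
    """Assumed normalization: raw names use spaces, unified names underscores."""
    return tag.replace(" ", "_")


def bio_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Collapse BIO tags into (entity_type, start, end) spans, end exclusive.

    An I- tag that does not continue the open entity is treated leniently
    as the start of a new entity.
    """
    spans: List[Tuple[str, int, int]] = []
    start, current = None, None
    for i, tag in enumerate(tags):
        prefix, _, etype = tag.partition("-")
        inside = prefix == "I" and etype == current
        # Close the open entity when the current tag does not continue it.
        if current is not None and not inside:
            spans.append((current, start, i))
            start, current = None, None
        # Open a new entity on B-, or on a stray I- with nothing open.
        if prefix == "B" or (prefix == "I" and current is None):
            start, current = i, etype
    if current is not None:
        spans.append((current, start, len(tags)))
    return spans


ids = [4, 5, 0, 7]
tags = [to_unified(RAW_NAMES[i]) for i in ids]
spans = bio_spans(tags)
```

With the `datasets` library, the id-to-name step would normally come from the feature's own label names rather than a hand-written dictionary.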
Dataset splits
- de:
  - Bytes: 1131409390.498
  - Examples: 1966
- es:
  - Bytes: 1141307288.576
  - Examples: 1512
- fr:
  - Bytes: 1058324947.032
  - Examples: 1656
- nl:
  - Bytes: 602932929.76
  - Examples: 1120
Dataset size
- Download size: 3332139599 bytes
- Dataset size: 3933974555.866 bytes
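As a quick consistency check, the per-split figures can be summed and compared with the reported dataset size. The snippet below is a minimal sketch that uses only the numbers listed in the card; the variable names are illustrative.

```python
# Split statistics copied from the card.
SPLITS = {
    "de": {"num_bytes": 1131409390.498, "num_examples": 1966},
    "es": {"num_bytes": 1141307288.576, "num_examples": 1512},
    "fr": {"num_bytes": 1058324947.032, "num_examples": 1656},
    "nl": {"num_bytes": 602932929.76, "num_examples": 1120},
}
DATASET_SIZE = 3933974555.866  # reported dataset_size, in bytes

total_examples = sum(s["num_examples"] for s in SPLITS.values())
total_bytes = sum(s["num_bytes"] for s in SPLITS.values())

# Average size of one example per split, in bytes (audio dominates).
avg_bytes = {
    lang: s["num_bytes"] / s["num_examples"] for lang, s in SPLITS.items()
}

print(f"total examples: {total_examples}")
print(f"split bytes match dataset_size: {abs(total_bytes - DATASET_SIZE) < 1e-3}")
for lang, size in avg_bytes.items():
    print(f"{lang}: {size / 1e6:.2f} MB per example on average")
```

The four splits add up to 6254 examples, and their byte counts sum to the reported dataset size.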
Configuration
- Config name: default
- Data files:
  - de: data/de-*
  - es: data/es-*
  - fr: data/fr-*
  - nl: data/nl-*
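Because the default configuration exposes one split per language, a single language can be loaded by passing its code as the split name. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and the repository id is `qmeeus/MSNER` as given in the page title; `load_msner_split` is a hypothetical helper, not part of any library.

```python
# Split names taken from the card's default configuration.
SPLIT_NAMES = ("de", "es", "fr", "nl")


def load_msner_split(lang: str):
    """Load one language split of MSNER (hypothetical convenience wrapper)."""
    if lang not in SPLIT_NAMES:
        raise ValueError(
            f"unknown split {lang!r}; expected one of {SPLIT_NAMES}"
        )
    # Imported lazily so the validation above works without the library.
    from datasets import load_dataset

    return load_dataset("qmeeus/MSNER", split=lang)


if __name__ == "__main__":
    # Downloads the Dutch split, the smallest of the four by byte count.
    nl = load_msner_split("nl")
    print(nl)
```

Note that `en` appears in the language class labels but is not one of the four published splits, so the helper rejects it.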
Languages
- de, es, fr, nl
Tags
- spoken-ner, multilingual, MSNER, spoken-language-understanding
Dataset name
- MSNER



