jstack32/LatinAccents

Name: jstack32/LatinAccents
Creator: jstack32
Published: 2023-10-20 22:04:35
License: No description available

Source: Hugging Face. Updated 2023-10-20. Indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/jstack32/LatinAccents

Resource description:

---
language:
- en
license: apache-2.0
size_categories:
  ab:
  - 10K<n<100K
  ar:
  - 100K<n<1M
  as:
  - 1K<n<10K
  ast:
  - n<1K
  az:
  - n<1K
  ba:
  - 100K<n<1M
  bas:
  - 1K<n<10K
  be:
  - 100K<n<1M
  bg:
  - 1K<n<10K
  bn:
  - 100K<n<1M
  br:
  - 10K<n<100K
  ca:
  - 1M<n<10M
  ckb:
  - 100K<n<1M
  cnh:
  - 1K<n<10K
  cs:
  - 10K<n<100K
  cv:
  - 10K<n<100K
  cy:
  - 100K<n<1M
  da:
  - 1K<n<10K
  de:
  - 100K<n<1M
  dv:
  - 10K<n<100K
  el:
  - 10K<n<100K
  en:
  - 1M<n<10M
  eo:
  - 1M<n<10M
  es:
  - 1M<n<10M
  et:
  - 10K<n<100K
  eu:
  - 100K<n<1M
  fa:
  - 100K<n<1M
  fi:
  - 10K<n<100K
  fr:
  - 100K<n<1M
  fy-NL:
  - 10K<n<100K
  ga-IE:
  - 1K<n<10K
  gl:
  - 10K<n<100K
  gn:
  - 1K<n<10K
  ha:
  - 1K<n<10K
  hi:
  - 10K<n<100K
  hsb:
  - 1K<n<10K
  hu:
  - 10K<n<100K
  hy-AM:
  - 1K<n<10K
  ia:
  - 10K<n<100K
  id:
  - 10K<n<100K
  ig:
  - 1K<n<10K
  it:
  - 100K<n<1M
  ja:
  - 10K<n<100K
  ka:
  - 10K<n<100K
  kab:
  - 100K<n<1M
  kk:
  - 1K<n<10K
  kmr:
  - 10K<n<100K
  ky:
  - 10K<n<100K
  lg:
  - 100K<n<1M
  lt:
  - 10K<n<100K
  lv:
  - 1K<n<10K
  mdf:
  - n<1K
  mhr:
  - 100K<n<1M
  mk:
  - n<1K
  ml:
  - 1K<n<10K
  mn:
  - 10K<n<100K
  mr:
  - 10K<n<100K
  mrj:
  - 10K<n<100K
  mt:
  - 10K<n<100K
  myv:
  - 1K<n<10K
  nan-tw:
  - 10K<n<100K
  ne-NP:
  - n<1K
  nl:
  - 10K<n<100K
  nn-NO:
  - n<1K
  or:
  - 1K<n<10K
  pa-IN:
  - 1K<n<10K
  pl:
  - 100K<n<1M
  pt:
  - 100K<n<1M
  rm-sursilv:
  - 1K<n<10K
  rm-vallader:
  - 1K<n<10K
  ro:
  - 10K<n<100K
  ru:
  - 100K<n<1M
  rw:
  - 1M<n<10M
  sah:
  - 1K<n<10K
  sat:
  - n<1K
  sc:
  - 1K<n<10K
  sk:
  - 10K<n<100K
  skr:
  - 1K<n<10K
  sl:
  - 10K<n<100K
  sr:
  - 1K<n<10K
  sv-SE:
  - 10K<n<100K
  sw:
  - 100K<n<1M
  ta:
  - 100K<n<1M
  th:
  - 100K<n<1M
  ti:
  - n<1K
  tig:
  - n<1K
  tok:
  - 1K<n<10K
  tr:
  - 10K<n<100K
  tt:
  - 10K<n<100K
  tw:
  - n<1K
  ug:
  - 10K<n<100K
  uk:
  - 10K<n<100K
  ur:
  - 100K<n<1M
  uz:
  - 100K<n<1M
  vi:
  - 10K<n<100K
  vot:
  - n<1K
  yue:
  - 10K<n<100K
  zh-CN:
  - 100K<n<1M
  zh-HK:
  - 100K<n<1M
  zh-TW:
  - 100K<n<1M
source_datasets:
- extended|common_voice
task_categories:
- automatic-speech-recognition
dataset_info:
  features:
  - name: path
    dtype: string
  - name: audio
    dtype: int64
  - name: sentence
    dtype: string
  splits:
  - name: train
    num_bytes: 102
    num_examples: 2
  download_size: 0
  dataset_size: 102
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---

# Dataset Card for Dataset Name

This dataset card aims to be a base template for new datasets. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/datasetcard_template.md?plain=1).

## Dataset Details

### Dataset Description

- **Curated by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]

### Dataset Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Dataset Structure

[More Information Needed]

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

#### Data Collection and Processing

[More Information Needed]

#### Who are the source data producers?

[More Information Needed]

### Annotations [optional]

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

#### Personal and Sensitive Information

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users should be made aware of the risks, biases and limitations of the dataset. More information needed for further recommendations.

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Dataset Card Authors [optional]

[More Information Needed]

## Dataset Card Contact

[More Information Needed]

Provided by:

jstack32

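Summary of Original Information

Dataset Overview

Languages and Sizes

Languages: multiple
Size: the number of examples (n) per language falls in the following ranges: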
- ab: 10K<n<100K
- ar: 100K<n<1M
- as: 1K<n<10K
- ast: n<1K
- az: n<1K
- ba: 100K<n<1M
- bas: 1K<n<10K
- be: 100K<n<1M
- bg: 1K<n<10K
- bn: 100K<n<1M
- br: 10K<n<100K
- ca: 1M<n<10M
- ckb: 100K<n<1M
- cnh: 1K<n<10K
- cs: 10K<n<100K
- cv: 10K<n<100K
- cy: 100K<n<1M
- da: 1K<n<10K
- de: 100K<n<1M
- dv: 10K<n<100K
- el: 10K<n<100K
- en: 1M<n<10M
- eo: 1M<n<10M
- es: 1M<n<10M
- et: 10K<n<100K
- eu: 100K<n<1M
- fa: 100K<n<1M
- fi: 10K<n<100K
- fr: 100K<n<1M
- fy-NL: 10K<n<100K
- ga-IE: 1K<n<10K
- gl: 10K<n<100K
- gn: 1K<n<10K
- ha: 1K<n<10K
- hi: 10K<n<100K
- hsb: 1K<n<10K
- hu: 10K<n<100K
- hy-AM: 1K<n<10K
- ia: 10K<n<100K
- id: 10K<n<100K
- ig: 1K<n<10K
- it: 100K<n<1M
- ja: 10K<n<100K
- ka: 10K<n<100K
- kab: 100K<n<1M
- kk: 1K<n<10K
- kmr: 10K<n<100K
- ky: 10K<n<100K
- lg: 100K<n<1M
- lt: 10K<n<100K
- lv: 1K<n<10K
- mdf: n<1K
- mhr: 100K<n<1M
- mk: n<1K
- ml: 1K<n<10K
- mn: 10K<n<100K
- mr: 10K<n<100K
- mrj: 10K<n<100K
- mt: 10K<n<100K
- myv: 1K<n<10K
- nan-tw: 10K<n<100K
- ne-NP: n<1K
- nl: 10K<n<100K
- nn-NO: n<1K
- or: 1K<n<10K
- pa-IN: 1K<n<10K
- pl: 100K<n<1M
- pt: 100K<n<1M
- rm-sursilv: 1K<n<10K
- rm-vallader: 1K<n<10K
- ro: 10K<n<100K
- ru: 100K<n<1M
- rw: 1M<n<10M
- sah: 1K<n<10K
- sat: n<1K
- sc: 1K<n<10K
- sk: 10K<n<100K
- skr: 1K<n<10K
- sl: 10K<n<100K
- sr: 1K<n<10K
- sv-SE: 10K<n<100K
- sw: 100K<n<1M
- ta: 100K<n<1M
- th: 100K<n<1M
- ti: n<1K
- tig: n<1K
- tok: 1K<n<10K
- tr: 10K<n<100K
- tt: 10K<n<100K
- tw: n<1K
- ug: 10K<n<100K
- uk: 10K<n<100K
- ur: 100K<n<1M
- uz: 100K<n<1M
- vi: 10K<n<100K
- vot: n<1K
- yue: 10K<n<100K
- zh-CN: 100K<n<1M
- zh-HK: 100K<n<1M
- zh-TW: 100K<n<1M
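
The size buckets listed above are plain strings such as `10K<n<100K` or `n<1K`. A minimal sketch of turning such a label into numeric bounds follows; the function name `parse_size_category`, the suffix table, and the choice of 0 as the lower bound for `n<1K` are assumptions made for illustration, not part of the dataset or of any Hugging Face API.

```python
import re
from typing import Tuple

# Multipliers for the suffixes that appear in the labels (assumed meaning: K = 10^3, M = 10^6).
_UNITS = {"": 1, "K": 1_000, "M": 1_000_000}


def _to_int(token: str) -> int:
    """Convert a token such as '10K' or '1M' to an integer."""
    match = re.fullmatch(r"(\d+)([KM]?)", token)
    if match is None:
        raise ValueError(f"unrecognized size token: {token!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def parse_size_category(label: str) -> Tuple[int, int]:
    """Return (lower, upper) bounds for a label like '10K<n<100K' or 'n<1K'."""
    parts = label.split("<")
    if len(parts) == 2 and parts[0] == "n":
        # Open-ended lower side, e.g. 'n<1K'; 0 is used as an assumed lower bound.
        return 0, _to_int(parts[1])
    if len(parts) == 3 and parts[1] == "n":
        return _to_int(parts[0]), _to_int(parts[2])
    raise ValueError(f"unrecognized size category: {label!r}")


print(parse_size_category("10K<n<100K"))  # (10000, 100000)
print(parse_size_category("n<1K"))        # (0, 1000)
```

Parsed bounds like these make it possible to sort or filter the language codes by bucket instead of comparing the raw strings.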

Dataset Information

Features:
- path: string
- audio: 64-bit integer (int64)
- sentence: string
Splits:
- train: 102 bytes, 2 examples
Download size: 0 bytes
Dataset size: 102 bytes
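
As a rough illustration of the metadata above, the three features and the train split can be modeled with small dataclasses. The class names `Record` and `SplitInfo` and the helper `bytes_per_example` are hypothetical; only the field names, types, and the numbers 102 and 2 come from the listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One row, using the three features listed in the dataset info."""
    path: str      # string
    audio: int     # int64 in the metadata, so a plain integer here
    sentence: str  # string


@dataclass(frozen=True)
class SplitInfo:
    """Size figures reported for one split."""
    name: str
    num_bytes: int
    num_examples: int

    def bytes_per_example(self) -> float:
        """Average stored size of one example, in bytes."""
        if self.num_examples == 0:
            raise ValueError("split has no examples")
        return self.num_bytes / self.num_examples


train = SplitInfo(name="train", num_bytes=102, num_examples=2)
print(train.bytes_per_example())  # 51.0
```

An average of about 51 bytes per example suggests the stored rows hold no audio payload, which would be consistent with the `audio` feature being declared as an integer.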

Configuration

Config name: default
Data files:
- train: data/train-*
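
The `data/train-*` entry is a glob pattern that selects the files belonging to the train split. The sketch below approximates that selection with Python's standard `fnmatch` module; the file names in `candidates` are hypothetical examples, and the Hub's own pattern resolution may differ in detail.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

TRAIN_PATTERN = "data/train-*"  # pattern taken from the configuration above


def files_for_split(paths: Iterable[str], pattern: str = TRAIN_PATTERN) -> List[str]:
    """Return the paths that match the split's glob pattern (case-sensitive)."""
    return [p for p in paths if fnmatchcase(p, pattern)]


# Hypothetical repository contents, for illustration only.
candidates = [
    "data/train-00000-of-00001.parquet",
    "data/test-00000-of-00001.parquet",
    "README.md",
]

print(files_for_split(candidates))  # ['data/train-00000-of-00001.parquet']
```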

License

License: Apache 2.0

Task Categories

Task category: automatic speech recognition

Source Datasets

Source dataset: common_voice
