A dataset for recognition of Arabic accents from spoken L2 English speech (ArL2Eng)

Name: A dataset for recognition of Arabic accents from spoken L2 English speech (ArL2Eng)
Creator: Mnasri, Sami
Published: 2025-07-31
License: Not specified

Source: Figshare. Updated 2025-07-31; indexed 2026-04-08.

Download link:

https://springernature.figshare.com/articles/dataset/A_dataset_for_recognition_of_Arabic_accents_from_spoken_L2_English_speech_ArL2Eng_/27893778/1

Description:

ArL2Eng is an L2 English speech corpus of Arabic-speaking learners, intended to support research on automated language assessment. It comprises audio recordings of speakers from various Arabic backgrounds uttering English sentences. The recordings are labeled by native Arabic speakers, which facilitates research in accent recognition and other speech processing applications. A large part of the spoken samples (471 out of 640 records) is annotated with fluency metrics by human expert raters. The dataset uses extracted features such as Mel Frequency Cepstral Coefficients (MFCC) for phonetic and acoustic analysis. It has been used to predict English fluency among Arab learners from various countries, using machine learning techniques such as Random Forest and XGBoost, supported by dimensionality reduction. ArL2Eng is designed to support a range of application contexts, from multilingual speech recognition and accent classification to speaker identification. It provides a unique resource for educators and researchers designing scalable and objective fluency evaluation models. The dataset is publicly available to encourage further research in this field.
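
To make the MFCC features mentioned above concrete, here is a minimal NumPy sketch of how such coefficients are typically computed (framing, windowing, power spectrum, mel filterbank, log, DCT-II). It is an illustration under stated assumptions, not the dataset's documented extraction pipeline: the function names, the 16 kHz sampling rate, the 512-sample frame, the 26 filters, and the 13 coefficients are all hypothetical choices, and the input is a synthetic tone rather than an ArL2Eng recording.

```python
import numpy as np

def hz_to_mel(f):
    # Common (HTK-style) mel scale formula.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters whose centers are equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)  # rFFT bin frequencies
    fbank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        fbank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_filters=26, n_coeffs=13):
    # All default parameter values here are illustrative assumptions.
    signal = np.asarray(signal, dtype=float)
    if signal.size < n_fft:
        signal = np.pad(signal, (0, n_fft - signal.size))
    n_frames = 1 + (signal.size - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)            # windowed frames
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # Unnormalized DCT-II over the filter axis; keep the first n_coeffs.
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)
    basis = np.cos(np.pi / n_filters * (n[None, :] + 0.5) * k[:, None])
    return log_mel @ basis.T                            # (n_frames, n_coeffs)

# Synthetic placeholder input: one second of a 440 Hz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2.0 * np.pi * 440.0 * t)

coeffs = mfcc(tone, sr=sr)
# One fixed-length vector per recording: per-coefficient mean and std.
summary = np.concatenate([coeffs.mean(axis=0), coeffs.std(axis=0)])
print(coeffs.shape, summary.shape)
```

A per-recording summary vector like `summary` is one common way to feed frame-level MFCCs to tabular models such as Random Forest or XGBoost, optionally after a dimensionality reduction step; whether ArL2Eng's reference experiments aggregate features this way is not stated in the description.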

Provider:

Mnasri, Sami

Created:

2025-07-31