Transcribing audio data: raw transcripts from several transcription tools
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/6469895
Description:
This record contains a Dutch audio fragment and the raw transcripts from several automatic audio transcription tools. The audio fragment was recorded specifically for the purpose of testing these tools and was run through:
Amberscript
HappyScribe
Kaldi
NVIVO transcription
Sonix
SpokenOnline
Transcribe
Trint
Microsoft Word 365 Online
The raw transcripts were downloaded as .docx or .txt files, and the .docx files were then saved as .odt. The transcripts were not edited before saving, apart from the occasional removal of a personal email address or hyperlink. For comparison purposes, the audio file itself ("Test_interview_20220203.mp3") and a clean transcript ("Test_interview_cleaned_transcript.odt") are also made available.
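The record does not say which metric, if any, was used for the comparison. One common choice is the word error rate (WER): the word-level edit distance between a tool's transcript and the clean transcript, divided by the number of words in the clean transcript. The sketch below is a minimal illustration that assumes the text has already been extracted from the .odt or .txt files into plain strings; the function name and the simple lowercase, whitespace-based tokenization are assumptions, not part of the record.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: both inputs are plain text; tokenization is a simple
    lowercase whitespace split, which ignores punctuation handling.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,         # reference word missing from hypothesis
                    current[j - 1] + 1,      # extra word in hypothesis
                    previous[j - 1] + cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1] / len(ref)
```

A lower value means the raw transcript is closer to the clean one. Because punctuation and capitalization differ between tools, some normalization beyond lowercasing would likely be needed before the numbers are comparable.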
By the time you download these files, the respective suppliers may have improved the quality of their (Dutch) speech-to-text conversion. To make this clear, each filename includes the date on which the transcription was run (YYYY-MM-DD format).
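As a minimal sketch of how the run date could be read back from a filename, the snippet below searches for a YYYY-MM-DD pattern. The function name and the example filename in the usage note are hypothetical; the record does not list the transcript filenames or where in the name the date appears.

```python
import re
from datetime import date, datetime
from typing import Optional

# Matches a date written as YYYY-MM-DD anywhere in the filename.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def transcription_date(filename: str) -> Optional[date]:
    """Return the first valid YYYY-MM-DD date in the filename, or None."""
    match = DATE_PATTERN.search(filename)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(), "%Y-%m-%d").date()
    except ValueError:
        # Digits matched the pattern but do not form a real calendar date.
        return None
```

For a hypothetical file named "ExampleTool_2022-02-03.txt" this returns the date 3 February 2022. The audio file "Test_interview_20220203.mp3" carries its date without hyphens, so this pattern does not match it and the function returns None.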
Created:
2022-10-06



