Glissando-sp
DataCite Commons, updated 2022-06-01, indexed 2025-04-15
Download link:
https://live.european-language-grid.eu/catalogue/corpus/1416
Description:
Glissando-sp includes more than 12 hours of Spanish speech, recorded under optimal acoustic conditions, orthographically transcribed, phonetically aligned, and annotated with prosodic information (location of the stressed syllables and prosodic phrasing). The corpus was recorded by 28 speakers: 8 professionals, of whom 4 are “news broadcaster” speakers (2 male and 2 female) and 4 are “advertising” speakers (2 male and 2 female), and 20 non-professionals (10 male and 10 female). Glissando-sp was designed specifically for prosodic studies, but it can also be used for other purposes. Its structure and its large number of speakers make it especially suitable for inter-speaker and inter-style prosodic analyses.

Glissando-sp has an equivalent corpus for Catalan, Glissando-ca, with the same structure and features, which makes the two corpora suitable for inter-language comparisons as well (see ELRA-S0407, http://catalog.elra.info/en-us/repository/browse/ELRA-S0407/).

Both corpora are the result of a coordinated project involving Pompeu Fabra University (UPF), the Autonomous University of Barcelona (UAB) and the University of Valladolid (UVA).

Glissando-sp is made up of three subcorpora:

1) The “News” subcorpus contains recordings of readings of real news texts (provided by the “Cadena Ser” radio station), which were modified to meet the segmental and prosodic requirements established for the corpus. It has two parts: a “Prosodic” subcorpus with 36 recordings of texts meeting prosodic criteria and a “Phonetic” subcorpus with 36 recordings of texts meeting segmental criteria. It was recorded by the 8 professional speakers, four with a “news broadcaster” profile and four with an “advertising” profile. Four of them recorded both the “Prosodic” and “Phonetic” subcorpora, and four only the “Prosodic” subcorpus. Every text was designed to be read in approximately one minute, although the actual duration of the recordings depends on the speaker.

2) The “Task dialogues” subcorpus contains a set of recorded interactions between two speakers, oriented to a specific goal in the domain of information requests. In each conversation, one speaker plays the role of instruction giver and the other the role of instruction follower. Three types of interactions were recorded: a) telephone-like conversations between an operator and a customer who wants information on prices and schedules for a specific route; b) information requests about a university exchange, between a school administrative officer who provides information on the options for taking a course at a foreign university and a student who requests it; and c) conversations in which one speaker plays somebody who is planning a trip to the Greek island of Corfu and calls a colleague who has lived in Greece for 5 years to request specific information about the route on the island. In this last task there is no specific route to reproduce; only the start and end points of the trip and some places to visit on the way are given. These tasks were performed by 12 different pairs of speakers: 1 pair of “news broadcaster” professional speakers, 1 pair of “advertising” professional speakers, and 10 pairs of non-professional speakers.

3) The “Free dialogues” subcorpus contains a set of recordings of conversations between people who have some degree of familiarity with each other. Each dialogue started from the question “Do you remember how you met each other?”, but the speakers were free to move on to other topics during their conversation. These conversations were recorded by 6 different pairs of speakers: 1 pair of “news broadcaster” professional speakers, 1 pair of “advertising” professional speakers, and 4 pairs of non-professional speakers.

Recordings were made in a soundproof room of the Audiovisual Media Service of the University of Valladolid, in Valladolid. Marantz PMD670/W1B and Marantz PMD560 recorders, with a Mackie CR1604-VLZ mixer, were used, at a sampling frequency of 44 kHz.

All the recordings were made using two microphones for each speaker: a fixed directional one (Neumann TLM103 P48) and a wireless headset one (Sennheiser EW100-G2).

Recordings are stored as WAV files: mono files for the “News” subcorpus, and stereo files for the “Task” and “Free” dialogues, with the speech of the two participants in separate channels (they were recorded using different microphones).

The corpus includes the orthographic transcription of the recordings in separate files. For the “News” subcorpus, these are TXT files containing only the raw text (the actual text read by each speaker). For the “Task” and “Free” dialogues, these are XML files containing an enriched transcription of the conversations, produced by human transcribers following TEI conventions.

A word-by-word orthographic transcription is also provided in a Praat TextGrid file, time-aligned with the signal. This TextGrid file also includes the phonetic transcription of the recordings, time-aligned with the speech signal. For the “News” subcorpus, it was automatically transcribed from the news texts, automatically aligned, and then revised by human experts. For the “Task” and “Free” dialogues subcorpora, it was automatically transcribed from the orthographic transcriptions of the conversations and automatically aligned, without manual revision.

The phonetic transcription uses the SAMPA phonetic alphabet.

The TextGrid file also includes three tiers with the segmentation into syllables, major intonation groups and minor intonation groups. For the “News” subcorpus, these were obtained automatically using prosodic annotation tools and then revised by human experts. For the “Task” and “Free” dialogues subcorpora, they were obtained automatically using prosodic annotation tools only.
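As an illustration of how the time-aligned tiers described above might be used, the following minimal sketch reads the interval tiers of a Praat TextGrid in the long text format and lists the syllables that fall inside each word. It makes several assumptions: the tier names "words" and "syllables" and the sample content are invented for the example (the description does not give the actual tier names), labels are assumed to fit on one line, and the file encoding would need to be checked before reading real corpus files.

```python
import re

# Invented sample in Praat's long text format. The tier names "words" and
# "syllables" are hypothetical; check the names used in the corpus files.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 0.5
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "words"
        xmin = 0
        xmax = 0.5
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = "hola"
    item [2]:
        class = "IntervalTier"
        name = "syllables"
        xmin = 0
        xmax = 0.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.2
            text = "o"
        intervals [2]:
            xmin = 0.2
            xmax = 0.5
            text = "la"
'''


def read_interval_tiers(textgrid_text):
    """Return {tier_name: [(start, end, label), ...]} for interval tiers.

    Handles only the long text format with single-line labels.
    """
    tiers = {}
    tier_name = None
    start = end = None
    in_interval = False
    for raw_line in textgrid_text.splitlines():
        line = raw_line.strip()
        name_match = re.match(r'name = "(.*)"$', line)
        if name_match and not in_interval:
            # A new tier begins; tier-level xmin/xmax lines are skipped below.
            tier_name = name_match.group(1)
            continue
        if re.match(r'intervals \[\d+\]:?$', line):
            in_interval = True
        elif in_interval and line.startswith("xmin = "):
            start = float(line.split("=", 1)[1])
        elif in_interval and line.startswith("xmax = "):
            end = float(line.split("=", 1)[1])
        elif in_interval:
            text_match = re.match(r'text = "(.*)"$', line)
            if text_match:
                # Praat escapes a double quote inside a label by doubling it.
                label = text_match.group(1).replace('""', '"')
                tiers.setdefault(tier_name, []).append((start, end, label))
                in_interval = False
    return tiers


def intervals_inside(intervals, start, end):
    """Return the intervals lying entirely within [start, end]."""
    return [iv for iv in intervals if iv[0] >= start and iv[1] <= end]


if __name__ == "__main__":
    tiers = read_interval_tiers(SAMPLE)
    for w_start, w_end, word in tiers["words"]:
        syllables = intervals_inside(tiers["syllables"], w_start, w_end)
        print(word, [label for _, _, label in syllables])
```

The same containment test could be applied to the intonation-group tiers, for example to count syllables per minor intonation group. For the dialogue subcorpora, where the annotation was not manually revised, boundaries may be less reliable than in the “News” subcorpus.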
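Because the “Task” and “Free” dialogue recordings are described as stereo WAV files with one participant per channel, a per-speaker analysis would typically start by separating the channels. The sketch below does this with Python's standard `wave` module. The function name `split_stereo` and the file paths are hypothetical, and the description does not say which channel holds which speaker, so that mapping would have to be checked against the corpus documentation.

```python
import wave


def split_stereo(path, first_path, second_path):
    """Write each channel of a stereo PCM WAV file to its own mono file."""
    with wave.open(path, "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a stereo file")
        width = src.getsampwidth()   # bytes per sample
        rate = src.getframerate()    # read from the file, not assumed
        frames = src.readframes(src.getnframes())

    # Each stereo frame holds one sample per channel, interleaved.
    frame_size = 2 * width
    first = bytearray()
    second = bytearray()
    for i in range(0, len(frames), frame_size):
        first += frames[i:i + width]
        second += frames[i + width:i + frame_size]

    for out_path, data in ((first_path, first), (second_path, second)):
        with wave.open(out_path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(width)
            dst.setframerate(rate)
            dst.writeframes(bytes(data))


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    split_stereo("dialogue.wav", "speaker_a.wav", "speaker_b.wav")
```

The frame-by-frame loop keeps the example independent of the sample width but is slow on long recordings; an array-based library would be a reasonable replacement for real use.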
Provider:
ELG
Created:
2022-06-01



