Glissando-ca
Source: DataCite Commons. Updated 2022-06-01; indexed 2024-07-13.
Download link:
https://live.european-language-grid.eu/catalogue/corpus/1417
Resource description:
Glissando-ca includes more than 12 hours of speech in Catalan, recorded under optimal acoustic conditions, orthographically transcribed, phonetically aligned and annotated with prosodic information (location of the stressed syllables and prosodic phrasing). The corpus was recorded by 28 speakers: 4 “news broadcaster” professional speakers (2 male and 2 female), 4 “advertising” professional speakers (2 male and 2 female), and 20 non-professional speakers (10 male and 10 female). Glissando-ca was designed specifically for prosodic studies but can also be used for other purposes. Its structure and its large number of speakers make it especially suitable for inter-speaker and inter-style prosodic analyses.

Glissando-ca has an equivalent corpus for Spanish, Glissando-sp, with the same structure and features, which makes the two corpora suitable for cross-language comparisons as well (see ELRA-S0406, http://catalog.elra.info/en-us/repository/browse/ELRA-S0406/).

Both corpora are the result of a coordinated project involving Pompeu Fabra University (UPF), the Autonomous University of Barcelona (UAB) and the University of Valladolid (UVA).

Glissando-ca is made up of three subcorpora:

1) The “News” subcorpus contains recordings of readings of real news texts (provided by the “Cadena Ser” radio station), which were modified to meet the segmental and prosodic requirements established for the corpus. It has two parts: a “Prosodic” subcorpus with 36 recordings of texts meeting prosodic criteria, and a “Phonetic” subcorpus with 36 recordings of texts meeting segmental criteria. It was recorded by the 8 professional speakers, four with a “news broadcaster” profile and four with an “advertising” profile. Four of them recorded both the “Prosodic” and “Phonetic” subcorpora, and four recorded only the “Prosodic” subcorpus. Every text was designed to be read in approximately one minute, although the actual duration of each recording depends on the speaker.

2) The “Task dialogues” subcorpus contains a set of recorded interactions between two speakers, oriented to a specific goal in the domain of information requests. In each conversation, one speaker plays the role of instruction-giver and the other the role of instruction-follower. Three types of interaction were recorded: a) telephone-like conversations between an operator and a customer who wants information on prices and schedules for a specific route; b) information requests about a university exchange, between a school’s administrative officer, who provides information on the options for a course at a foreign university, and a student who requests it; and c) a conversation in which one speaker plays somebody who is planning a trip to the Greek island of Corfu and calls a colleague who has lived in Greece for 5 years to request specific information about the route on the island. There is no specific route to reproduce; there is only an initial and a final point of the trip, and some places to visit on the way. These tasks were performed by 12 different pairs of speakers: 1 pair of “news broadcaster” professional speakers, 1 pair of “advertising” professional speakers, and 10 pairs of non-professional speakers.

3) The “Free dialogues” subcorpus contains a set of recordings of conversations between people who have some degree of familiarity with each other. Each dialogue started from the question “Do you remember how you met each other?”, but the speakers were free to move on to other topics during the conversation. These conversations were recorded by 6 different pairs of speakers: 1 pair of “news broadcaster” professional speakers, 1 pair of “advertising” professional speakers, and 4 pairs of non-professional speakers.

Recordings were made in a soundproof room on the Communication Campus of Pompeu Fabra University, in Barcelona. They were captured with the Sony Vegas program running on a PC with an RME Hammerfall HDSP 9652 sound card, together with a Yamaha 02R96 mixer with ADAT MY16AT cards, at a sampling frequency of 48 kHz.

All the recordings were made using two microphones for each speaker: a fixed directional one (AKG C 414 B-ULS) and a wireless headset one (Sennheiser EW100-G2).

Recordings were stored in WAV files: mono files for the “News” subcorpus, and stereo files for the “Task” and “Free” dialogues, with the speech of the two participants in separate channels (they were recorded using different microphones).

The corpus includes the orthographic transcription of the recordings in separate files. For the “News” subcorpus these are txt files containing only the raw text (the actual text read by each speaker). For the “Task” and “Free” dialogues they are xml files containing an enriched transcription of the conversations, produced by human transcribers following TEI conventions.

A word-by-word orthographic transcription is also provided in a Praat TextGrid file, time-aligned with the signal. This TextGrid file also includes the phonetic transcription of the recordings, time-aligned with the speech signal. For the “News” subcorpus, the phonetic transcription was generated automatically from the news texts, aligned automatically and then revised by human experts. For the “Task” and “Free” dialogues subcorpora, it was generated automatically from the orthographic transcriptions of the conversations and aligned automatically, without manual revision.

The phonetic transcription uses the SAMPA phonetic alphabet.

The TextGrid file also includes three tiers with the segmentation into syllables, major intonation groups and minor intonation groups. For the “News” subcorpus, these tiers were obtained automatically using prosodic annotation tools and then revised by human experts. For the “Task” and “Free” dialogues subcorpora, they were obtained automatically using prosodic annotation tools only.
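Because the dialogue recordings keep each participant in a separate channel of a stereo WAV file, a common first step is to split them into per-speaker signals. The following is a minimal sketch using only the Python standard library. It assumes 16-bit PCM samples, which the description does not state, and the helper names `split_stereo` and `make_demo_wav` are hypothetical. The demo runs on a tiny synthetic file rather than on corpus data.

```python
import io
import sys
import wave
from array import array


def split_stereo(source):
    """Return (left, right, sample_rate) for a 16-bit stereo WAV.

    `source` may be a file path or a binary file-like object.
    The 16-bit sample width is an assumption, not a documented corpus property.
    """
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 2:
            raise ValueError("expected a stereo file")
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())

    samples = array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Stereo frames are interleaved: L0, R0, L1, R1, ...
    return samples[0::2], samples[1::2], rate


def make_demo_wav():
    """Build a three-frame synthetic stereo WAV in memory (not corpus data)."""
    data = array("h", [1, -1, 2, -2, 3, -3])
    if sys.byteorder == "big":
        data.byteswap()
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(48000)
        wav.writeframes(data.tobytes())
    buf.seek(0)
    return buf


left, right, rate = split_stereo(make_demo_wav())
print(rate, list(left), list(right))  # 48000 [1, 2, 3] [-1, -2, -3]
```

Which channel corresponds to which participant is not specified in the description, so it should be checked against the transcription files before relying on it.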
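The time-aligned tiers in a Praat TextGrid file can be read with a few regular expressions when the file uses Praat's long text format. The sketch below is illustrative only. The function name `read_interval_tiers`, the tier names `words` and `syllables`, and the sample content are all hypothetical, since the description does not give the corpus's actual tier names. It handles interval tiers only and ignores escaped quotes inside labels.

```python
import re

TIER_NAME_RE = re.compile(r'name = "([^"]*)"')
# An interval is an xmin/xmax pair followed directly by a text label.
# Tier-level xmin/xmax are followed by "intervals: size", so they do not match.
INTERVAL_RE = re.compile(
    r'xmin = ([\d.]+)\s+xmax = ([\d.]+)\s+text = "([^"]*)"'
)


def read_interval_tiers(textgrid_text):
    """Map each tier name to a list of (start, end, label) tuples."""
    tiers = {}
    # Each tier starts at an "item [n]:" header; the first chunk is the file header.
    for chunk in re.split(r"item \[\d+\]:", textgrid_text)[1:]:
        name = TIER_NAME_RE.search(chunk)
        if name is None:
            continue
        tiers[name.group(1)] = [
            (float(m.group(1)), float(m.group(2)), m.group(3))
            for m in INTERVAL_RE.finditer(chunk)
        ]
    return tiers


# Made-up sample in Praat's long text format (not taken from the corpus).
DEMO_TEXTGRID = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 1.0
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "words"
        xmin = 0
        xmax = 1.0
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.4
            text = "bon"
        intervals [2]:
            xmin = 0.4
            xmax = 1.0
            text = "dia"
    item [2]:
        class = "IntervalTier"
        name = "syllables"
        xmin = 0
        xmax = 1.0
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 0.4
            text = "bon"
        intervals [2]:
            xmin = 0.4
            xmax = 0.7
            text = "di"
        intervals [3]:
            xmin = 0.7
            xmax = 1.0
            text = "a"
'''

tiers = read_interval_tiers(DEMO_TEXTGRID)
print(tiers["words"])  # [(0.0, 0.4, 'bon'), (0.4, 1.0, 'dia')]
```

For real corpus files, a dedicated TextGrid library is likely a safer choice than this regex sketch, since Praat also supports a short text format and several text encodings.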
Provider:
ELG
Created:
2022-06-01



