Deutsches Textarchiv
Source: re3data.org (indexed 2024-05-31)
Download link:
https://www.re3data.org/repository/r3d100010385
Description:
The German Text Archive (Deutsches Textarchiv, DTA) presents online a selection of key German-language printed works from various disciplines, dating from around 1650 to 1900, as full text and as digital facsimile. The electronic full texts are indexed linguistically, and the search facilities tolerate a range of spelling variants. The texts were selected on the basis of lexicographical criteria and include scientific or scholarly texts, texts from everyday life, and literary works. Digitisation is based on the first edition of each work. Using the digital images of these editions, the text was first typed up manually twice (‘double keying’). To represent the structure of the text, the electronic full text was encoded in conformity with the XML standard TEI P5. The subsequent stages complete the linguistic analysis: the text is tokenised and lemmatised, and the parts of speech are annotated. The DTA thus presents a linguistically analysed, historical full-text corpus that can be used for a range of questions in corpus linguistics. Thanks to the interdisciplinary nature of the DTA corpus, it also offers valuable source texts for neighbouring disciplines in the humanities, and for scientists, legal scholars and economists.
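The annotation layers described above (tokens, lemmas, parts of speech in TEI P5) can be illustrated with a minimal sketch. The XML fragment below is invented for illustration and is not taken from the DTA; it assumes tokens are marked up as TEI `<w>` elements with `lemma` and `pos` attributes, which TEI P5 allows but which may not match the DTA's actual file layout. The helper names `read_tokens` and `find_by_lemma` are hypothetical. Looking tokens up by lemma is shown only as one plausible way a search could tolerate historical spellings; the description does not say how the DTA search is implemented.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # official TEI P5 namespace

# Invented sample fragment (assumption): not actual DTA data.
SAMPLE = f"""<TEI xmlns="{TEI_NS}">
  <text><body><p>
    <w lemma="die" pos="ART">Die</w>
    <w lemma="Teil" pos="NN">Theile</w>
    <w lemma="sein" pos="VAFIN">sind</w>
  </p></body></text>
</TEI>"""


def read_tokens(xml_text):
    """Return (surface form, lemma, part-of-speech tag) for every <w> element."""
    root = ET.fromstring(xml_text)
    return [
        (w.text, w.get("lemma"), w.get("pos"))
        for w in root.iter(f"{{{TEI_NS}}}w")
    ]


def find_by_lemma(tokens, lemma):
    """Return the surface forms whose lemma matches, whatever their spelling."""
    return [form for form, tok_lemma, _ in tokens if tok_lemma == lemma]


tokens = read_tokens(SAMPLE)
for form, lemma, pos in tokens:
    print(form, lemma, pos)

# The historical spelling "Theile" is found via its modern lemma "Teil".
print(find_by_lemma(tokens, "Teil"))
```

In this sketch the historical form "Theile" is retrieved by querying the lemma "Teil", which is the kind of lookup that lemmatisation makes possible on a corpus with non-standardised orthography.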
Provider:
DTA



