The TRANSCOMP Dataset of Literary Translations from 120 Languages and a Parallel Collection of English-language Originals

DataONE. Updated 2022-10-17. Indexed 2024-06-08.

Download link:

https://search.dataone.org/view/sha256:5e53423ead3f998e65bd6b206587b4679ae5f0b364002c4dcf25be34815bff52

Description:

The TRANSCOMP Dataset of Literary Translations is a collection of document-level word frequencies sampled from over 10,000 translations into English of global literary fiction published since 1950, together with a historically matched parallel corpus of equal size that contains fiction originally published in English. The dataset was derived from the NovelTM dataset of English-language fiction, which identifies ca. 176,000 volumes of fiction located in the HathiTrust Digital Library published since the eighteenth century (Underwood). Although these volumes are subject to copyright restrictions, we are able to provide CSV files with word frequency counts for 10,000-word samples taken from each text. The associated metadata is available in a separate CSV. These data are of interest to both literary scholars and linguists working in the field of translation studies.
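
As a rough illustration of how per-document word frequency counts like these might be used, the sketch below reads one frequency table and converts raw counts to relative frequencies. The column names `word` and `count`, the helper names, and the inline demo table are all assumptions made for illustration; the listing does not describe the actual CSV layout, so check the files' real headers before adapting this.

```python
import csv
import io
from collections import Counter


def read_word_counts(handle):
    """Read (word, count) rows into a Counter.

    Assumes a header row with columns named 'word' and 'count';
    these names are hypothetical, not taken from the dataset.
    """
    counts = Counter()
    for row in csv.DictReader(handle):
        counts[row["word"]] += int(row["count"])
    return counts


def relative_frequencies(counts):
    """Convert raw counts to shares of the total token count."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


# Tiny made-up table standing in for one document's frequency file.
demo_csv = io.StringIO("word,count\nthe,6\nsea,3\nwhale,1\n")
demo_counts = read_word_counts(demo_csv)
demo_freqs = relative_frequencies(demo_counts)
```

For a real file, the same functions could be called on an opened file handle, for example `open(path, newline="", encoding="utf-8")`. Since every sample is described as 10,000 words long, raw counts should already be roughly comparable across documents; relative frequencies are mainly a convenience when comparing against other corpora.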

Created:

2023-11-08