Alphabetical comprehensive false friends lists and top 100 frequency lists in German-English & Chinese-Japanese
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/11123527
Description:
The German-English false friend (FF) pairs were sourced from "False Friends: A Short Dictionary: Reclam premium Sprachtraining" by Burkhard Dretzke and Margaret Nester. I compiled all 794 FF pairs from this dictionary and first arranged them alphabetically in spreadsheets. I then queried the frequency of each entry in German and English corpora and sorted each language's word list in descending order of frequency per million. To identify the most relevant FFs, I selected the pairs present in the top 100 entries of both lists, which yielded 34 pairs that are frequent in both languages. I also collected FFs where the word is frequent in one language but rare or nearly absent in the other: excluding the 34 pairs above, I took 33 words from each of the German and English frequency lists, together with their FF counterparts in the other language, for 66 additional pairs. The result is a list of the top 100 German-English FFs based on combined frequency of occurrence.

The Chinese-Japanese FFs were gathered from the "2136 Japanese Kanji Character Dictionary", published by Liaoning People's Publishing House. Data collection followed a similar procedure but was more difficult and complex. Because the concept of FFs originates in European linguistics, academic resources such as a readily available FF dictionary are relatively scarce for Asian languages such as Chinese and Japanese. Consequently, I used a Japanese kanji dictionary as my primary data source.
This dictionary indexes Japanese kanji words by the 2136 commonly used Chinese characters that the Japanese Ministry of Culture selected for their high frequency of use in social life and officially announced in November 2010. It contains nearly 15,000 entries, each accompanied by detailed Chinese interpretations and examples, which makes it possible to distinguish true cognates from FFs that share the same form. After a close examination of these entries, I identified 700 Chinese-Japanese FF pairs. I then queried the frequency of each entry in Chinese and Japanese corpora and sorted each language's word list in descending order of frequency. Finally, following the German-English procedure, I selected 48 pairs that appeared in the top 100 of both the Chinese and Japanese frequency lists, plus 26 additional words from the Chinese list and 26 from the Japanese list, giving a list of the top 100 Chinese-Japanese FFs based on combined frequency of occurrence.
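
The selection procedure described above can be sketched in Python. This is only an illustration under stated assumptions, not the author's actual workflow, which was carried out in spreadsheets. The names `Pair` and `select_top_pairs` are hypothetical, ties in frequency are broken by input order, and the number of words added per language is assumed to be half of the slots left after the shared pairs are chosen (consistent with the reported 34 + 33 + 33 and 48 + 26 + 26 splits).

```python
from typing import NamedTuple


class Pair(NamedTuple):
    """One false-friend pair with a corpus frequency for each side (hypothetical layout)."""
    word_a: str    # word in language A (e.g. German or Chinese)
    word_b: str    # its false friend in language B (e.g. English or Japanese)
    freq_a: float  # frequency of word_a in a language-A corpus
    freq_b: float  # frequency of word_b in a language-B corpus


def select_top_pairs(pairs, top_n=100, target=100):
    """Return (core, extra_a, extra_b) following the described procedure.

    core    : pairs ranked in the top_n of BOTH frequency lists
    extra_a : highest-frequency remaining pairs by language A
    extra_b : highest-frequency remaining pairs by language B
    """
    # Sort each language's list in descending order of frequency.
    by_a = sorted(pairs, key=lambda p: p.freq_a, reverse=True)
    by_b = sorted(pairs, key=lambda p: p.freq_b, reverse=True)

    # Pairs that appear in the top_n entries of both lists.
    top_b = set(by_b[:top_n])
    core = [p for p in by_a[:top_n] if p in top_b]

    # Assumption: the remaining slots are split evenly between the two languages.
    per_side = max(0, (target - len(core)) // 2)

    chosen = set(core)
    extra_a = [p for p in by_a if p not in chosen][:per_side]
    chosen.update(extra_a)
    # Assumption: a pair already taken from list A is not taken again from list B.
    extra_b = [p for p in by_b if p not in chosen][:per_side]

    return core, extra_a, extra_b
```

With the 794 German-English pairs as input, `len(core)` would correspond to the 34 shared pairs and each of `extra_a` and `extra_b` to the 33 language-specific additions, provided the frequencies and tie handling match those used in the original spreadsheets.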
Created:
2024-05-07



