S1 Text - Rank Diversity of Languages: Generic Behavior in Computational Linguistics
Source: Figshare. Updated: 2015-12-03. Indexed: 2026-04-29.
Download link:
https://figshare.com/articles/dataset/_Rank_Diversity_of_Languages_Generic_Behavior_in_Computational_Linguistics_/1369726
Resource description:
Figure S1. Rank distributions of words according to frequency. [a]: Normalized word frequency fR as a function of the rank k for several languages, for books published in the year 2000. The color code for languages is as follows: light blue for French, green for German, yellow for Italian, orange for English, dark blue for Spanish, and red for Russian. [b]: Word frequency fR as a function of the rank k for English and several years, normalized so that the most frequent element has relative frequency one. The inset shows the unnormalized frequency f.
Figure S2. Comparison between the different models (Equations S1–S5) and the rank-frequency distribution. We use the data for the year 2000 and all languages under consideration. The base-10 logarithm of the ratio of the observed values to the model values is plotted. Different models fit better in different regions; however, no model fits all languages and all regions much better than the others.
Figure S3. Rank variations in time of twenty words from three different scales for English.
Figure S4. Rank variations in time of twenty words from three different scales for German.
Figure S5. Rank variations in time of twenty words from three different scales for French.
Figure S6. Rank variations in time of twenty words from three different scales for Italian.
Figure S7. Rank variations in time of twenty words from three different scales for Spanish.
Figure S8. Rank variations in time of twenty words from three different scales for Russian.
Figure S9. Rank variations in time of twenty words from three different scales for our simulated language.
Figure S10. Distribution of relative flights for all languages studied. A plot similar to the one presented in Fig. 5 is shown for the other languages, with the same color coding and details.
Figure S11. Correlations for relative frequency changes for different languages. The black line shows the correlations for the simulated language.
(PDF)
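The normalization described for Figure S1[b] and the log-ratio comparison described for Figure S2 can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy word list, and the use of a plain Zipf law f(k) = k^(-alpha) as the model are all assumptions made for illustration, since Equations S1–S5 are not reproduced in this listing.

```python
from collections import Counter
import math


def rank_frequency(words):
    """Return word frequencies sorted by rank (descending), normalized so
    that the most frequent word has relative frequency one.
    Assumes `words` is non-empty."""
    counts = sorted(Counter(words).values(), reverse=True)
    top = counts[0]
    return [c / top for c in counts]


def log10_ratio_to_zipf(normalized, alpha=1.0):
    """Base-10 logarithm of observed / model for each rank k.
    The model here is a hypothetical Zipf law f(k) = k**(-alpha),
    chosen only as a stand-in for the models in the supplement."""
    return [
        math.log10(f / (k ** -alpha))
        for k, f in enumerate(normalized, start=1)
    ]


# Toy corpus (hypothetical data, not from the dataset).
words = "the cat and the dog and the bird".split()
freqs = rank_frequency(words)
ratios = log10_ratio_to_zipf(freqs)
```

A ratio near zero at a given rank means the assumed model matches the observed frequency there; positive values mean the model underestimates it.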
Created:
2015-12-03



