Heaps' law and Heaps functions in tagged texts: Evidences of their linguistic relevance
Source: DataONE | Updated: 2020-02-28 | Indexed: 2025-06-28
Download link:
https://search.dataone.org/view/sha256:00b77af87192d7ca9a4ff668121323d7c5fc1c2600d494575ab5c08c76759588
Description:
We study the relationship between vocabulary size and text length in a corpus of 75 literary works in English, authored by six writers, distinguishing between the contributions of three grammatical classes (or "tags," namely, nouns, verbs, and others), and analyze the progressive appearance of new words of each tag along each individual text. We find that, as prescribed by Heaps' law, vocabulary sizes and text lengths follow a well-defined power-law relation. Meanwhile, the appearance of new words in each text does not obey a power law, and is on the whole well described by the average of random shufflings of the text. Deviations from this average, however, are statistically significant and show systematic trends across the corpus. Specifically, we find that the appearance of new words along each text is predominantly retarded with respect to the average of random shufflings. Moreover, different tags add systematically distinct contributions to this tendency, with verbs and others bei...
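The comparison described above, between the appearance of new words along a text and the average over random shufflings of that text, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `heaps_function` and `shuffled_average`, the toy sentence, and the choice of 100 shuffles are all assumptions made for illustration, and the per-tag analysis is not reproduced (it would presumably count only new words carrying a given tag).

```python
import random

def heaps_function(tokens):
    """Return V(n) for n = 1..N: distinct words among the first n tokens."""
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth

def shuffled_average(tokens, n_shuffles=100, seed=0):
    """Average V(n) over random shufflings of the same tokens.

    n_shuffles and seed are illustrative choices, not values from the study.
    """
    rng = random.Random(seed)
    work = list(tokens)
    totals = [0] * len(work)
    for _ in range(n_shuffles):
        rng.shuffle(work)
        for i, v in enumerate(heaps_function(work)):
            totals[i] += v
    return [t / n_shuffles for t in totals]

# Toy text standing in for one literary work (hypothetical data).
tokens = "the cat sat on the mat and the dog sat on the cat".split()

actual = heaps_function(tokens)
average = shuffled_average(tokens)

# Negative values mean new words appear later in the real text than in
# the shuffled average, i.e. their appearance is delayed.
deviation = [a - m for a, m in zip(actual, average)]
```

Both curves start at 1 and end at the total vocabulary size, so the deviation is zero at both ends; any systematic difference shows up only in between.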
Created:
2025-06-21



