regicid/1gram_lemonde
Source: Hugging Face. Updated: 2025-02-16. Indexed: 2024-06-29.
Download link:
https://hf-mirror.com/datasets/regicid/1gram_lemonde
Description:
This dataset contains the daily occurrence frequencies of every word (1-gram) in the archives of Le Monde, from the newspaper's founding (December 19, 1944) through December 31, 2022. The text was tokenized with nltk.RegexpTokenizer. The same frequencies can be queried more conveniently through the authors' API and its wrappers in R, Python, and Ruby.
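To make "daily 1-gram frequencies" concrete, here is a minimal sketch of how such counts could be produced from dated article text. It is not the authors' pipeline: the regular expression `\w+`, the lowercasing step, the function name `daily_1gram_counts`, and the sample sentences are all assumptions made for illustration; the listing only states that nltk.RegexpTokenizer was used. If nltk is not installed, the sketch falls back to `re.findall`, which returns the same tokens for this particular pattern.

```python
from collections import Counter

# Assumed pattern: runs of word characters. The dataset's real pattern is not given.
PATTERN = r"\w+"

try:
    from nltk.tokenize import RegexpTokenizer
    tokenize = RegexpTokenizer(PATTERN).tokenize
except ImportError:
    # Fallback: for a group-free pattern, RegexpTokenizer is equivalent to findall.
    import re
    tokenize = re.compile(PATTERN).findall


def daily_1gram_counts(articles):
    """Count 1-grams per day.

    articles: iterable of (date_string, text) pairs.
    Returns a dict mapping each date to a Counter of lowercased tokens.
    Lowercasing is an assumption, not a documented property of the dataset.
    """
    counts = {}
    for date, text in articles:
        tokens = (tok.lower() for tok in tokenize(text))
        counts.setdefault(date, Counter()).update(tokens)
    return counts


# Hypothetical sample input, not taken from the archive.
sample = [
    ("1944-12-19", "Le Monde paraît. Le journal est nouveau."),
    ("1944-12-19", "Le monde change."),
    ("1944-12-20", "Un autre jour."),
]
counts = daily_1gram_counts(sample)
print(counts["1944-12-19"].most_common(2))
```

Each row of the published dataset would then correspond to one (date, word, count) triple drawn from a structure like `counts`; the actual column names should be checked on the dataset page.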
Provider:
regicid



