Text files from Gutenberg database
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://zenodo.org/record/3360391
Description:
Text files of different sizes and structures, selected at random from the Gutenberg database.
This artefact contains five datasets of randomly selected text files (i.e. e-books in .txt format) from the Gutenberg database. The datasets range in total size from 184MB to 1.7GB.
The package contains the following datasets:
1. 184MB
2. 357MB
3. 670MB
4. 1GB
5. 1.7GB
In our case, we used this dataset to perform extensive experiments on the performance of a Symmetric Searchable Encryption scheme. However, it can be used to measure the performance of any algorithm that parses documents, extracts keywords, creates dictionaries, etc.
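As an illustration of the kind of measurement described above, the following is a minimal sketch that parses every .txt file in a folder, extracts keywords, builds a keyword-to-files dictionary, and reports the elapsed time. It is not the method used in the original experiments. The function names (extract_keywords, build_index, timed_build), the tokenization rule (lowercase runs of the letters a to z), and the assumption that each dataset is unpacked into a single flat folder are all hypothetical choices made for this example.

```python
import re
import time
from collections import defaultdict
from pathlib import Path

# Assumption: a keyword is any run of ASCII letters, compared in lowercase.
TOKEN = re.compile(r"[a-z]+")


def extract_keywords(text):
    """Return the set of distinct lowercase alphabetic tokens in text."""
    return set(TOKEN.findall(text.lower()))


def build_index(folder):
    """Map each keyword to the sorted list of file names that contain it."""
    index = defaultdict(set)
    for path in sorted(Path(folder).glob("*.txt")):
        # errors="replace" keeps the run going if a file is not valid UTF-8.
        text = path.read_text(encoding="utf-8", errors="replace")
        for word in extract_keywords(text):
            index[word].add(path.name)
    return {word: sorted(names) for word, names in index.items()}


def timed_build(folder):
    """Build the index and return it together with the elapsed seconds."""
    start = time.perf_counter()
    index = build_index(folder)
    return index, time.perf_counter() - start
```

Calling timed_build on the folder of each of the five datasets in turn would give one timing per dataset size, which can then be compared across the 184MB to 1.7GB range.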
Created:
2020-01-24



