
manu/project_gutenberg

Source: Hugging Face. Updated: 2023-09-07. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/manu/project_gutenberg
Resource description:
---
dataset_info:
  features:
  - name: id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: de
    num_bytes: 1070196924
    num_examples: 3131
  - name: en
    num_bytes: 25616345280
    num_examples: 61340
  - name: es
    num_bytes: 496728508
    num_examples: 1202
  - name: fr
    num_bytes: 2338871137
    num_examples: 5493
  - name: it
    num_bytes: 383733486
    num_examples: 1008
  - name: nl
    num_bytes: 504939551
    num_examples: 1420
  - name: pl
    num_bytes: 4864460
    num_examples: 34
  - name: pt
    num_bytes: 204058452
    num_examples: 1111
  - name: ru
    num_bytes: 943593
    num_examples: 6
  - name: sv
    num_bytes: 116664385
    num_examples: 388
  - name: zh
    num_bytes: 174238359
    num_examples: 437
  download_size: 14399256761
  dataset_size: 30911584135
task_categories:
- text-generation
language:
- fr
- en
- zh
- pt
- pl
- nl
- ru
- sv
- it
- de
- es
pretty_name: Project Gutenberg
size_categories:
- 10K<n<100K
---

# Dataset Card for "Project Gutenberg"

Project Gutenberg is a library of over 70,000 free eBooks, hosted at https://www.gutenberg.org/.

Each example corresponds to a single book and contains a header and a footer of a few lines, delimited by `*** Start of ***` and `*** End of ***` tags.

### Usage

```python
from datasets import load_dataset

# Stream the French split instead of downloading the full dataset.
ds = load_dataset("manu/project_gutenberg", split="fr", streaming=True)
print(next(iter(ds)))
```

### License

The full license is available here: https://www.gutenberg.org/policy/license.html

#### Summary

For nearly all uses, in nearly all parts of the world, the opening words of all of our eBooks apply:

“This eBook is for the use of anyone anywhere in the United States and most other parts of the world at no cost and with almost no restrictions whatsoever. You may copy it, give it away or re-use it under the terms of the Project Gutenberg License included with this eBook or online at www.gutenberg.org.

If you are not located in the United States, you’ll have to check the laws of the country where you are located before using this ebook.”

##### Using the Project Gutenberg Trademark

If you want to use the name Project Gutenberg anywhere in the ebooks you distribute, on the distribution medium, or in advertising, you have to obey these rules:

- You may only distribute verbatim copies of the ebooks. No changes are allowed to the ebook contents. (Reformatting the ebook to a different file format is considered okay.)
- If you charge money for the copies you distribute, you have to pay royalties to Project Gutenberg.
- You must refund your clients for defective copies or if they don’t agree with the Project Gutenberg license.

If you don’t agree with any of the above restrictions, you may not use the Project Gutenberg trademark. You may still distribute the ebooks if you strip the Project Gutenberg license and all references to Project Gutenberg.
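The dataset card notes that every example carries a Project Gutenberg header and footer delimited by `*** Start of ***` and `*** End of ***` tags. For text-generation work it may be useful to drop that wrapper. The following is a minimal sketch, not part of the dataset itself: the function name `strip_gutenberg_boilerplate` is hypothetical, and the exact marker wording (a line beginning with three asterisks followed by "START OF" or "END OF", in any letter case) is an assumption that should be checked against real samples.

```python
import re

# Assumed marker shapes, for example:
#   *** START OF THE PROJECT GUTENBERG EBOOK SOME TITLE ***
#   *** END OF THE PROJECT GUTENBERG EBOOK SOME TITLE ***
START_RE = re.compile(r"^[ \t]*\*{3}[ \t]*START OF.*$", re.IGNORECASE | re.MULTILINE)
END_RE = re.compile(r"^[ \t]*\*{3}[ \t]*END OF.*$", re.IGNORECASE | re.MULTILINE)


def strip_gutenberg_boilerplate(text: str) -> str:
    """Return the text between the start and end marker lines.

    If a marker is missing, that side of the text is left untouched.
    """
    start = START_RE.search(text)
    body_start = start.end() if start else 0

    # Only look for the end marker after the start marker.
    end = END_RE.search(text, body_start)
    body_end = end.start() if end else len(text)

    return text[body_start:body_end].strip()


sample = (
    "Title: Example\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK EXAMPLE ***\n"
    "\n"
    "Chapter 1\n"
    "Hello.\n"
    "\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK EXAMPLE ***\n"
    "License text\n"
)
cleaned = strip_gutenberg_boilerplate(sample)
print(cleaned)
```

With the streaming dataset from the usage snippet, a function like this could plausibly be applied per example through `ds.map(lambda ex: {"text": strip_gutenberg_boilerplate(ex["text"])})`. Note that the trademark rules above restrict redistribution of modified copies under the Project Gutenberg name, so this kind of stripping is best treated as a local preprocessing step.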
Provider:
manu
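Summary of original information

Dataset overview

Dataset source

  • Name: Project Gutenberg
  • Website: https://www.gutenberg.org/

Dataset content

  • Type: eBooks
  • Count: over 70,000 books

Dataset features

  • Free: all books are provided free of charge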