yjkim27/The-Philosophy-Data-Project
Source: Hugging Face · Updated: 2023-12-18 · Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/yjkim27/The-Philosophy-Data-Project
Resource description:
---
language:
- en
tags:
- 'llm'
- 'language-modeling'
pretty_name: The-Philosophy-Data-Project
---
# About dataset
[The Philosophy Data Project](https://philosophydata.com/index.html) is a corpus of philosophy texts and a set of analyses based on them, totaling over 50 texts and 30 authors, made by Kourosh Alizadeh.
- `school`: A broad categorization of the school of thought each book belongs to. Such classification can sometimes be vague or depend on interpretation. The texts in this corpus, however, are all distinctive examples of their respective schools of thought, so at least here the labels are reasonable.
- `sentence_spacy` and `sentence_str`: The actual sentences from the texts. Note that some sentences are omitted, for example because they are too short. Refer to the next section for details.
Other items are straightforward.
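
As a rough illustration of how the `school` and `sentence_str` columns might be used together, here is a minimal sketch in plain Python. The rows are made-up placeholders that only mimic the documented column names, the school labels `school_a` and `school_b` are hypothetical, and the `MIN_CHARS` threshold is an assumption for demonstration (this card does not state the cutoff used when short sentences were omitted).

```python
from collections import Counter

# Placeholder rows that only mimic the documented columns.
# They are NOT real entries from the corpus, and the school labels are hypothetical.
rows = [
    {"school": "school_a", "sentence_str": "A placeholder sentence about knowledge."},
    {"school": "school_a", "sentence_str": "Another placeholder sentence about virtue."},
    {"school": "school_b", "sentence_str": "A placeholder sentence about experience."},
    {"school": "school_b", "sentence_str": "Too short."},
]


def count_by_school(rows):
    """Count how many sentences each school of thought contributes."""
    return Counter(row["school"] for row in rows)


def drop_short(rows, min_chars):
    """Keep only rows whose sentence has at least `min_chars` characters."""
    return [row for row in rows if len(row["sentence_str"]) >= min_chars]


# Hypothetical threshold, chosen only for this example.
MIN_CHARS = 20

counts = count_by_school(rows)
kept = drop_short(rows, MIN_CHARS)

for school, n in sorted(counts.items()):
    print(f"{school}: {n} sentences")
print(f"{len(kept)} of {len(rows)} sentences kept after length filtering")
```

With the real data, the same two helpers should work on any iterable of dictionaries that exposes these column names, for example rows obtained through the Hugging Face `datasets` library.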
## Other details
For an overview of how the data was prepared and how to interpret it, see [the author's commentary](https://philosophydata.com/interpret.html).
To see how the data preparation was actually carried out, browse [the official GitHub repository](https://github.com/kcalizadeh/phil_nlp).
Provider:
yjkim27
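Summary of original information
Dataset overview
Dataset name
The Philosophy Data Project
Dataset description
The Philosophy Data Project is a corpus and collection of analyses covering over 50 philosophical texts and 30 authors, created by Kourosh Alizadeh.
Dataset contents
- School classification (`school`): A broad categorization of the school of philosophical thought each book belongs to. The classification can be vague or depend on interpretation.
- Sentence data (`sentence_spacy` and `sentence_str`): The actual sentences from the texts. Some sentences are omitted, for example because they are too short.
Dataset preparation
For the specific data preparation method and an interpretation of the data itself, refer to the author's commentary.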



