PsychologicalReasoning-15k

Name: PsychologicalReasoning-15k
Creator: maas
Published: 2025-12-05 16:54:43
License: No description available

ModelScope community: updated 2025-12-05, indexed 2025-11-03

Download link:

https://modelscope.cn/datasets/gustavecortal/PsychologicalReasoning-15k


Description:

## Methodology

We perform domain filtering on [Dolphin R1](https://huggingface.co/datasets/cognitivecomputations/dolphin-r1) and [General Reasoning](https://huggingface.co/datasets/GeneralReasoning/GeneralThought-430K). Prompts are embedded, clustered with k-means (k = 20,000), and assigned domain labels by majority vote using [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B), following the [Intelligent Internet pipeline](https://huggingface.co/Intelligent-Internet/II-Medical-8B-1706). Clusters tagged psychology or philosophy were retained for LoRA finetuning (rank=8, alpha=16, max length=2048, epoch=1, batch size=16).

This work was performed using HPC resources (Jean Zay supercomputer) from GENCI-IDRIS (Grant 20XX-AD011014205).

## Piaget

Piaget is a language model based on Qwen3 and finetuned on PsychologicalReasoning-15k. Available sizes are: [0.6B](https://huggingface.co/gustavecortal/Piaget-0.6B), [1.7B](https://huggingface.co/gustavecortal/Piaget-1.7B), [4B](https://huggingface.co/gustavecortal/Piaget-4B), [8B](https://huggingface.co/gustavecortal/Piaget-8B).

Piaget aims to reason about psychological and philosophical concepts such as self-image, emotion, and existence.

Piaget was inspired by my position paper on emotion analysis: [Improving Language Models for Emotion Analysis: Insights from Cognitive Science](https://aclanthology.org/2024.cmcl-1.23/).

## Contact

Mail: gustave.cortal@ens-paris-saclay.fr

Website: [gustavecortal.com](https://gustavecortal.com)
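## Example: cluster-level domain filtering (illustrative sketch)

The domain-filtering step in the Methodology section can be illustrated with a small sketch. It is not the authors' code. It assumes that cluster assignments (from k-means over prompt embeddings) and one domain label per prompt (for instance, produced by a classifier such as Qwen3-1.7B) are already available. The function names, the label strings, and the choice to vote over every prompt in a cluster are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

# Domains retained according to the dataset card; the exact label strings are assumed.
KEEP_DOMAINS = {"psychology", "philosophy"}


def majority_label_per_cluster(cluster_ids, prompt_labels):
    """Map each cluster id to the most frequent label among its prompts."""
    votes = defaultdict(Counter)
    for cid, label in zip(cluster_ids, prompt_labels):
        votes[cid][label] += 1
    # most_common(1) returns [(label, count)] for the top label in the cluster.
    return {cid: counts.most_common(1)[0][0] for cid, counts in votes.items()}


def filter_prompts(prompts, cluster_ids, prompt_labels, keep=KEEP_DOMAINS):
    """Keep every prompt whose cluster was majority-voted into a kept domain."""
    cluster_label = majority_label_per_cluster(cluster_ids, prompt_labels)
    return [p for p, cid in zip(prompts, cluster_ids) if cluster_label[cid] in keep]


# Toy data: cluster ids and per-prompt labels are hypothetical stand-ins.
prompts = [
    "Why do people feel shame?",
    "How does self-image form in childhood?",
    "What is the role of memory in identity?",
    "Solve 2x + 3 = 7.",
    "Integrate x^2 from 0 to 1.",
]
cluster_ids = [0, 0, 0, 1, 1]
prompt_labels = ["psychology", "psychology", "biology", "math", "math"]

kept = filter_prompts(prompts, cluster_ids, prompt_labels)
print(kept)
```

Voting at the cluster level means a single noisy label does not decide a prompt's fate: in the toy data, the third prompt is kept because its cluster is mostly labeled psychology, even though its own label differs.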


Provider:

maas

Created:

2025-10-14
