Pitt Corpus

Name: Pitt Corpus
Creator: Pitt Corpus
License: No description available

Source: arXiv. Indexed 2025-09-30.

Download link:

https://github.com/davidecolla/semantic_coherence_markers

Resource overview:

The Pitt Corpus contains transcripts of responses to the "Cookie Theft" stimulus picture, collected from patients diagnosed with dementia and from healthy controls. It has been used to analyze language models, with perplexity scores used to classify subjects. The dataset includes transcripts from 194 dementia patients and 99 healthy controls. The research task is to distinguish transcripts of healthy subjects from those of patients with Alzheimer's disease on the basis of linguistic analysis.
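To illustrate how perplexity scores can drive this kind of classification, here is a minimal sketch. It is not the method used in the linked repository: it assumes a simple add-one-smoothed unigram model per group, and it labels a transcript with the group whose model gives it the lower perplexity. The function names and the toy transcripts are hypothetical and are not taken from the Pitt Corpus.

```python
import math
from collections import Counter


def train_unigram(texts):
    """Count word frequencies over a list of transcripts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def perplexity(text, counts, vocab_size):
    """Perplexity of `text` under an add-one-smoothed unigram model."""
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("empty transcript")
    total = sum(counts.values())
    log_prob = 0.0
    for tok in tokens:
        # Add-one smoothing keeps unseen words from getting zero probability.
        log_prob += math.log((counts[tok] + 1) / (total + vocab_size))
    return math.exp(-log_prob / len(tokens))


def classify(text, control_counts, dementia_counts, vocab_size):
    """Assign the label of the model that finds the transcript less surprising."""
    ppl_control = perplexity(text, control_counts, vocab_size)
    ppl_dementia = perplexity(text, dementia_counts, vocab_size)
    return "control" if ppl_control < ppl_dementia else "dementia"


# Hypothetical toy transcripts, invented for illustration only.
control_texts = [
    "the boy is taking cookies from the jar",
    "the mother is washing dishes at the sink",
]
dementia_texts = [
    "the the boy uh cookie uh",
    "she uh water uh the thing",
]

control_counts = train_unigram(control_texts)
dementia_counts = train_unigram(dementia_texts)
# Shared vocabulary size, plus one slot reserved for unknown words.
vocab_size = len(set(control_counts) | set(dementia_counts)) + 1
```

Calling `classify(transcript, control_counts, dementia_counts, vocab_size)` returns either `"control"` or `"dementia"`. In practice the unigram models would be replaced by stronger language models, but the decision rule of comparing perplexities stays the same.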

Provided by:

Pitt Corpus

Collected summary

Dataset introduction

Background and challenges

Background overview

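The Pitt Corpus dataset focuses on the study of semantic coherence markers and comprises three experiments, which handle interviews, rally speeches, and language data from dementia patients, respectively. Users need to download the pre-trained models and organize the data in a specific structure in order to fine-tune and evaluate the models.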

The content above was collected and summarized by 遇见数据集.
