
Felladrin/pretrain-mental-health-counseling-conversations

Source: Hugging Face. Updated: 2024-01-23. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/Felladrin/pretrain-mental-health-counseling-conversations
Resource description:

---
license: openrail
source_datasets:
- Amod/mental_health_counseling_conversations
---

Conversion of the [Amod/mental_health_counseling_conversations](https://huggingface.co/datasets/Amod/mental_health_counseling_conversations) dataset for use in pretraining.

Python code used for conversion:

```python
from datasets import load_dataset
import pandas
import re

dataset = load_dataset("Amod/mental_health_counseling_conversations", split="train")

def format(columns):
    return re.sub(r'\s+', ' ', columns["Response"]).strip()

text = [format(columns) for columns in dataset]

pandas.DataFrame({"text": list(filter(None, text))}).to_csv("train.csv", index=False)
```
Provider:
Felladrin
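Summary of original information

Dataset overview

Dataset source

  • Original dataset: Amod/mental_health_counseling_conversations

Dataset purpose

  • Pretraining

Data conversion

  • A Python script converts the original dataset into a pretraining format: it keeps only the "Response" field of each record, collapses every run of whitespace into a single space, drops empty results, and writes the remaining strings to train.csv under a single "text" column.

Conversion code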

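    from datasets import load_dataset
    import pandas
    import re

    # Load the training split of the source dataset.
    dataset = load_dataset("Amod/mental_health_counseling_conversations", split="train")

    def format(columns):
        # Collapse each run of whitespace in the response to a single space.
        return re.sub(r'\s+', ' ', columns["Response"]).strip()

    text = [format(columns) for columns in dataset]

    # Drop empty strings and write the rest to train.csv as one "text" column.
    pandas.DataFrame({"text": list(filter(None, text))}).to_csv("train.csv", index=False)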
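
The conversion can be tried offline on a few in-memory records. The sketch below is an illustration under stated assumptions, not the author's script: the sample rows and the "Context" key are hypothetical, the helper name `normalize_response` is made up here, and the standard-library `csv` module stands in for pandas so that nothing needs to be downloaded or installed. The whitespace handling and the empty-row filter follow the conversion code above.

```python
import csv
import re
import tempfile
from pathlib import Path


def normalize_response(row):
    """Collapse every whitespace run in the "Response" field to one space."""
    return re.sub(r"\s+", " ", row["Response"]).strip()


# Hypothetical rows standing in for the real dataset (illustration only).
sample_rows = [
    {"Context": "q1", "Response": "  First line.\n\nSecond   line.  "},
    {"Context": "q2", "Response": " \n\t "},  # whitespace only, becomes ""
    {"Context": "q3", "Response": "Already clean."},
]

texts = [normalize_response(row) for row in sample_rows]
kept = list(filter(None, texts))  # drop empty strings, as the original does

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "train.csv"

    # Write a single "text" column (csv module used here instead of pandas).
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["text"])
        writer.writerows([value] for value in kept)

    # Read the file back to confirm what a downstream consumer would see.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        round_trip = [record["text"] for record in csv.DictReader(handle)]

print(round_trip)  # ['First line. Second line.', 'Already clean.']
```

The second sample row shows why the filter matters: a response made only of whitespace normalizes to an empty string and is left out of the CSV.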
