Felladrin/pretrain-mental-health-counseling-conversations
Source: Hugging Face. Updated 2024-01-23. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/Felladrin/pretrain-mental-health-counseling-conversations
Resource description:
---
license: openrail
source_datasets:
- Amod/mental_health_counseling_conversations
---
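Conversion of the [Amod/mental_health_counseling_conversations](https://huggingface.co/datasets/Amod/mental_health_counseling_conversations) dataset for use in pretraining. Only the `Response` column is kept: runs of whitespace are collapsed to a single space and empty entries are dropped.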
Python code used for conversion:
```python
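from datasets import load_dataset
import pandas
import re

# Load the training split of the source dataset.
dataset = load_dataset("Amod/mental_health_counseling_conversations", split="train")

def format(columns):
    # Collapse every run of whitespace into one space and trim both ends.
    return re.sub(r'\s+', ' ', columns["Response"]).strip()

# Normalize each response, drop empty strings, and write a single "text" column.
text = [format(columns) for columns in dataset]
pandas.DataFrame({"text": list(filter(None, text))}).to_csv("train.csv", index=False)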
```
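To make the effect of the conversion concrete, here is a minimal sketch that applies the same whitespace normalization and empty-row filtering to a few in-memory rows. The sample rows and the name `normalize_response` are hypothetical and exist only for illustration. The sketch uses the standard-library `csv` module as a stand-in for `pandas`, so quoting details may differ from the real `train.csv`.

```python
import csv
import io
import re


def normalize_response(response):
    """Collapse every run of whitespace into one space and trim the ends."""
    return re.sub(r"\s+", " ", response).strip()


# Hypothetical rows standing in for the "Response" column of the source dataset.
sample_rows = [
    {"Response": "It is okay\nto feel this way.\n\n  Take your time.  "},
    {"Response": "   \n\t "},  # whitespace only: becomes "" and is dropped below
    {"Response": "Talking to someone\tyou trust can help."},
]

texts = [normalize_response(row["Response"]) for row in sample_rows]
texts = list(filter(None, texts))  # remove empty strings

# Write a single-column CSV in memory, mirroring the one-column layout of train.csv.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["text"])
writer.writerows([t] for t in texts)
csv_text = buffer.getvalue()
```

Each surviving response ends up on a single line, which is a convenient one-document-per-row layout for plain-text pretraining.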
Provided by:
Felladrin
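Summary of original information
Dataset overview
Dataset source
- Original dataset: Amod/mental_health_counseling_conversations
Dataset purpose
- Intended for pretraining
Data conversion
- Python code converts the original dataset into a format suitable for pretraining.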
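One way to check that a converted file really has the intended format is to confirm that every row holds non-empty, whitespace-normalized text. The sketch below is an assumption about how such a check could look, not part of the original conversion. The helper `find_bad_rows` and the sample file contents are hypothetical, and a small temporary file stands in for `train.csv` so nothing needs to be downloaded.

```python
import csv
import re
import tempfile
from pathlib import Path


def find_bad_rows(csv_path):
    """Return indices of rows whose text is empty or not whitespace-normalized."""
    bad = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for index, row in enumerate(csv.DictReader(handle)):
            text = row["text"]
            # A well-formed row is unchanged by the normalization step.
            if not text or text != re.sub(r"\s+", " ", text).strip():
                bad.append(index)
    return bad


# Hypothetical stand-in for train.csv: one clean row and one with a double space.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "train.csv"
    with open(sample_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["text"])
        writer.writerow(["A clean, single-line response."])
        writer.writerow(["A response with  two spaces."])
    bad_rows = find_bad_rows(sample_path)
```

An empty result from `find_bad_rows` would indicate that every row already matches the normalized form.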



