
EsferSami/Bangla-Mental-Health-Dataset-V2

Hugging Face | Updated 2025-12-10 | Indexed 2025-12-20
Download link:
https://hf-mirror.com/datasets/EsferSami/Bangla-Mental-Health-Dataset-V2
Resource description:
---
language: bn
license: cc-by-4.0
tags:
- bangla
- mental-health
- nlp
- synthetic
- alpaca-format
- dataset
dataset_info: 10,000 synthetic Bangla mental health conversations in input-instruction-output format. Generated from books, newspapers, YouTube shows, blogs and articles. Available in JSONL, CSV, Parquet and TOON.
---

# Bangla Mental Health Dataset V2.0

## 📘 Overview

This dataset contains **10,000 synthetic Bangla mental health conversations**, designed for:

- LLM fine-tuning
- Mental health classification
- Mental-health dialogue modeling
- Research on Bangla NLP

The dataset includes **input–instruction–output** style records suitable for:

- Alpaca format
- SFT (supervised fine-tuning)
- QLoRA, LoRA
- Chat-style fine-tuning

The records cover 40 mental health conditions, e.g. মানসিক চাপ (stress), আত্মহত্যা (suicide), আত্মবিশ্বাস (self-confidence), ভালোবাসাহীনতা (lack of love), অতিরিক্ত চিন্তা (overthinking), and others.

---

## 📚 Source Methodology

This dataset is **synthetically generated**, created using:

- Research from multiple mental-health books
- Newspaper topics and psychological reports
- YouTube talk shows about mental health
- Articles, blogs, and clinical guidance content

> **No real user data or personal information is included.**
> Everything is **synthetic** and constructed through text generation + controlled transformations.

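
The input–instruction–output records described in the overview can be turned into Alpaca-style training prompts. The following is a minimal sketch, not the dataset author's preprocessing: it assumes each record is a dict with the keys `instruction`, `input` and `output` (the key names are inferred from the format description, not confirmed by the source), and `format_alpaca` is a hypothetical helper name. The prompt wording follows the commonly used Stanford Alpaca template.

```python
# Assumed record keys: "instruction", "input", "output" (not confirmed by the dataset card).
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{output}"
)


def format_alpaca(record: dict) -> str:
    """Render one record as a single Alpaca-style training string."""
    fields = {
        "instruction": (record.get("instruction") or "").strip(),
        "input": (record.get("input") or "").strip(),
        "output": (record.get("output") or "").strip(),
    }
    # Use the shorter template when the record carries no extra context.
    template = PROMPT_WITH_INPUT if fields["input"] else PROMPT_NO_INPUT
    return template.format(**fields)


# Hypothetical placeholder record, not taken from the dataset.
example = {
    "instruction": "placeholder instruction",
    "input": "placeholder user message",
    "output": "placeholder response",
}
prompt = format_alpaca(example)
```

Keeping the template in module-level constants makes it easy to swap in a chat-style template later without touching the formatting logic.
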
---

## 📂 Provided Formats

This dataset repo provides **four different formats** so users can choose what works best:

| Format | Filename | Purpose |
|--------|----------|----------|
| JSONL | `data/dataset.jsonl` | Best for LLM finetuning (Alpaca, SFT, QLoRA) |
| CSV | `data/dataset.csv` | Good for analysis or simple loading |
| Parquet | `data/dataset.parquet` | Efficient, column-optimized format |
| TOON | `data/dataset.toon` | Token-optimized binary format for fast LLM training |

---

## 🔧 Loading Examples

### Load JSONL, TOON, CSV

```python
from datasets import load_dataset

REPO = "EsferSami/Bangla-Mental-Health-Dataset-V2"

# Load JSONL (load_dataset returns a DatasetDict, not a DataFrame)
ds = load_dataset(REPO, data_files="data/dataset.jsonl")

# Load TOON (as given in the dataset card; this assumes the installed
# `datasets` version can infer a loader for the .toon extension)
ds = load_dataset(REPO, data_files="data/dataset.toon")

# Load CSV
ds = load_dataset(REPO, data_files="data/dataset.csv")
```
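
For a quick look at the JSONL file without the `datasets` library, the records can also be read with the Python standard library. This is a minimal sketch under stated assumptions: the file has already been downloaded to a local path, each non-empty line holds one JSON object, and `read_jsonl` and `key_counts` are hypothetical helper names introduced here for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def read_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    records = []
    # UTF-8 is specified explicitly because the text is Bangla.
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def key_counts(records):
    """Count how many records contain each top-level key."""
    return Counter(key for record in records for key in record)
```

Calling `key_counts(read_jsonl("data/dataset.jsonl"))` on a local copy shows which fields are present in every record, which is a cheap way to confirm the actual field names before fine-tuning.
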
Provider:
EsferSami