suryaadhi/ppmb-qa-dataset
Source: Hugging Face | Updated: 2025-12-13 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/suryaadhi/ppmb-qa-dataset
Description:
The PPMB QA Dataset is a multilingual question-answering dataset specifically designed for university admission helpdesk and chatbot applications. It focuses on the New Student Admission (Penerimaan Mahasiswa Baru / PMB) process at UPN Veteran Jawa Timur, Indonesia, for the academic year 2025/2026. The dataset contains 1,059 QA pairs, with 353 entries each in English, Indonesian, and Javanese. The content covers various aspects of university admission, including admission pathways, program-specific information, tuition and financial aid, special programs, administrative information, and career outcomes. The dataset is structured in JSONL format, with each entry containing a natural language query and a detailed answer. It is suitable for chatbot development, multilingual NLP research, information retrieval, machine translation, helpdesk automation, and fine-tuning LLMs. Unique features include trilingual parallel data, low-resource language (Javanese) inclusion, real-world informal queries, comprehensive domain coverage, up-to-date information, and institutional specificity.
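Since the description says each JSONL entry holds a natural language query and a detailed answer, a minimal reading sketch may help. The field names `query` and `answer` are assumptions (the listing does not state the actual keys), the helper `read_qa_pairs` is hypothetical, and the sample records are invented placeholders rather than rows from the dataset.

```python
import io
import json
from typing import Iterable, Iterator, Tuple


def read_qa_pairs(
    lines: Iterable[str],
    query_key: str = "query",    # assumed field name, not confirmed by the listing
    answer_key: str = "answer",  # assumed field name, not confirmed by the listing
) -> Iterator[Tuple[str, str]]:
    """Yield (query, answer) tuples from JSONL text, one JSON object per line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        yield record[query_key], record[answer_key]


# Invented placeholder records standing in for a real JSONL file.
sample = io.StringIO(
    '{"query": "placeholder question", "answer": "placeholder answer"}\n'
    "\n"
    '{"query": "another placeholder", "answer": "another answer"}\n'
)

pairs = list(read_qa_pairs(sample))
print(len(pairs))
```

To read a downloaded file instead, pass an open file handle (for example `open(path, encoding="utf-8")`) in place of `sample`, and adjust the two key arguments if the real records use different field names.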
Provider:
suryaadhi



