
alaminxpro/university-students-complaints

Source: Hugging Face. Updated 2026-04-07; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/alaminxpro/university-students-complaints
Description:
---
license: cc-by-4.0
language:
- en
pretty_name: University Students Complaints
task_categories:
- text-classification
tags:
- complaints
- university
- students
- nlp
- multi-label-classification
- severity-prediction
- department-routing
- text
- tabular
size_categories:
- n<1K
configs:
- config_name: default
  data_files:
  - split: train
    path: train.csv
  - split: test
    path: test.csv
---

# University Students Complaints Dataset

A labeled dataset of **332 real-world complaints** collected from university students across multiple departments. Each complaint is annotated with category, severity, responsible departments, and fine-grained aspect labels, making it suitable for multi-label text classification, severity prediction, and department routing tasks.

## Dataset Summary

University students regularly raise concerns about infrastructure, academics, technical facilities, administration, and finances. This dataset captures those complaints in English and provides structured labels that can be used to train models for automated complaint triage, severity assessment, and routing to the correct department.
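
To make the annotation scheme concrete, here is a minimal sketch of how one annotated record could be mapped onto targets for the triage, severity, and routing tasks mentioned above. The field names follow the dataset's Data Fields table; the helper names `split_labels` and `record_to_targets`, the example record, and the choice to treat severity as an ordinal rank are assumptions made for illustration, not part of the dataset or its tooling.

```python
# Severity levels in the order the dataset card lists them.
# Treating them as an ordinal scale is an assumption of this sketch.
SEVERITY_ORDER = ["Low", "Medium", "High", "Urgent"]


def split_labels(value: str) -> list[str]:
    """Split a pipe-separated label string such as 'WiFi | Classroom'."""
    return [part.strip() for part in value.split("|") if part.strip()]


def record_to_targets(record: dict) -> dict:
    """Turn one annotated complaint into per-task targets (hypothetical helper)."""
    return {
        "text": record["Complaint_Description"],
        "category": record["Category"],  # single-label target
        "severity_rank": SEVERITY_ORDER.index(record["Severity"]),
        "departments": split_labels(record["Responsible_Departments"]),
        "aspects": split_labels(record["Aspects"]),
    }


# Hypothetical record invented for illustration; not a row from the dataset.
example = {
    "Complaint_Description": "The WiFi in our classroom keeps disconnecting.",
    "Category": "Technical",
    "Severity": "High",
    "Responsible_Departments": "IT | Maintenance",
    "Aspects": "WiFi | Classroom",
}

targets = record_to_targets(example)
print(targets)
```

Splitting on the bare pipe and stripping whitespace keeps the helper tolerant of inconsistent spacing around the separator.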

**Key characteristics:**

- Collected from students across **6 departments** (CSE, EEE, IPE, BBA, LAW, ENG)
- **5 complaint categories**: Infrastructure, Technical, Academic, Administrative, Finance
- **4 severity levels**: Low, Medium, High, Urgent
- **Multi-label annotations** for responsible departments and complaint aspects
- **Paraphrase grouping** via `Complaint_Group_ID`: multiple phrased versions of the same complaint share a group ID, useful for paraphrase detection and data augmentation studies

## Supported Tasks

| Task | Description |
| ------------------------------------- | ----------------------------------------------------------------------------------------------------------- |
| **Complaint Category Classification** | Classify complaints into one of 5 categories (Infrastructure, Technical, Academic, Administrative, Finance) |
| **Severity Prediction** | Predict the urgency level (Low, Medium, High, Urgent) |
| **Department Routing** | Predict which department(s) should handle the complaint |
| **Multi-label Aspect Classification** | Identify which aspects a complaint covers (WiFi, Washroom, Classroom, etc.) |
| **Paraphrase Detection** | Leverage `Complaint_Group_ID` to study paraphrase relationships |

## Data Fields

| Field | Type | Description |
| ------------------------- | ------ | ------------------------------------------------------------------------------------------ |
| `ID` | int | Unique complaint identifier |
| `Timestamp` | string | When the complaint was submitted |
| `Gender` | string | Gender of the student (`Male` / `Female`) |
| `Semester` | float | Academic semester of the student (e.g., `1.2`, `4.1`) |
| `Student_Dept` | string | Department the student belongs to (CSE, EEE, IPE, BBA, LAW, ENG) |
| `Complaint_Description` | string | Free-text complaint written by the student |
| `Category` | string | Complaint category: `Infrastructure`, `Technical`, `Academic`, `Administrative`, `Finance` |
| `Responsible_Departments` | string | Pipe-separated list of departments responsible (e.g., `IT \| Maintenance`) |
| `Primary_Department` | string | The single most relevant department |
| `Aspects` | string | Pipe-separated aspect labels (e.g., `WiFi \| Classroom`) |
| `Severity` | string | Severity level: `Low`, `Medium`, `High`, `Urgent` |
| `Complaint_Group_ID` | int | Groups paraphrased versions of the same complaint |

## Dataset Statistics

### Splits

| Split | Samples |
| --------- | ------- |
| Train | 273 |
| Test | 59 |
| **Total** | **332** |

### Category Distribution

| Category | Count |
| -------------- | ----- |
| Infrastructure | 143 |
| Technical | 65 |
| Academic | 64 |
| Finance | 33 |
| Administrative | 27 |

### Severity Distribution

| Severity | Count |
| -------- | ----- |
| Urgent | 108 |
| Medium | 99 |
| High | 98 |
| Low | 27 |

### Student Department Distribution

| Department | Count |
| ---------- | ----- |
| CSE | 245 |
| EEE | 29 |
| BBA | 15 |
| ENG | 15 |
| IPE | 14 |
| LAW | 14 |

### Top Complaint Aspects

| Aspect | Count |
| ------------- | ----- |
| Electrical/AC | 163 |
| Classroom | 97 |
| Faculty | 37 |
| Washroom | 37 |
| Examination | 35 |
| WiFi | 32 |
| Fee/Payment | 32 |
| Lab Equipment | 28 |
| Projector | 22 |
| Parking | 22 |

### Unique Complaint Groups

There are **186 unique complaint groups**. Many original complaints have paraphrased variants, providing a natural resource for studying textual similarity and paraphrase generation.

## Usage

```python
from datasets import load_dataset

dataset = load_dataset("alaminxpro/university-students-complaints")

# Access splits
train = dataset["train"]
test = dataset["test"]

# Example: view first complaint
print(train[0]["Complaint_Description"])
print(train[0]["Category"], train[0]["Severity"])
```

### Working with Multi-label Fields

The `Responsible_Departments` and `Aspects` fields use `" | "` (pipe with spaces) as a separator:

```python
aspects = train[0]["Aspects"].split(" | ")
departments = train[0]["Responsible_Departments"].split(" | ")
```

## Additional Files

- `university_students_complaints.xlsx`: The original dataset in Excel format

## Author

Created by **Md. Al Amin** ([LinkedIn](https://www.linkedin.com/in/alaminxpro/))

## Citation

If you use this dataset in your research or projects, please cite it as:

```bibtex
@dataset{alamin2026university,
  title={University Students Complaints Dataset},
  author={Md. Al Amin},
  year={2026},
  url={https://huggingface.co/datasets/alaminxpro/university-students-complaints},
  note={ORCID: 0009-0005-5859-3093},
  license={CC-BY-4.0}
}
```

## License

This dataset is released under the [Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/) license.
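
As a supplementary note on `Complaint_Group_ID`: because paraphrased variants of a complaint share a group ID, it can be useful to check whether any group appears in both splits and to enumerate same-group pairs for paraphrase studies. The card does not say how groups were distributed across the train and test splits, so the following is only a sketch under that uncertainty. The helper names `groups_in_both` and `paraphrase_pairs` and the toy rows are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def groups_in_both(train_rows, test_rows, key="Complaint_Group_ID"):
    """Return the sorted group IDs that occur in both splits."""
    train_ids = {row[key] for row in train_rows}
    test_ids = {row[key] for row in test_rows}
    return sorted(train_ids & test_ids)


def paraphrase_pairs(rows, key="Complaint_Group_ID"):
    """Return all pairs of complaint IDs that share a group ID."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[key]].append(row["ID"])
    pairs = []
    for ids in by_group.values():
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)


# Hypothetical rows invented for illustration; not taken from the dataset.
toy_train = [
    {"ID": 1, "Complaint_Group_ID": 10},
    {"ID": 2, "Complaint_Group_ID": 10},
    {"ID": 3, "Complaint_Group_ID": 11},
]
toy_test = [
    {"ID": 4, "Complaint_Group_ID": 11},
    {"ID": 5, "Complaint_Group_ID": 12},
]

shared = groups_in_both(toy_train, toy_test)
pairs = paraphrase_pairs(toy_train + toy_test)
print(shared)
print(pairs)
```

Both helpers only iterate over dict-like rows, so they should also accept the `train` and `test` splits returned by `load_dataset`.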
Provider:
alaminxpro