alaminxpro/university-students-complaints
Source: Hugging Face. Updated 2026-04-07; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/alaminxpro/university-students-complaints
Resource description:
---
license: cc-by-4.0
language:
- en
pretty_name: University Students Complaints
task_categories:
- text-classification
tags:
- complaints
- university
- students
- nlp
- multi-label-classification
- severity-prediction
- department-routing
- text
- tabular
size_categories:
- n<1K
configs:
  - config_name: default
    data_files:
      - split: train
        path: train.csv
      - split: test
        path: test.csv
---
# University Students Complaints Dataset
A labeled dataset of **332 real-world complaints** collected from university students in six departments. Each complaint is annotated with a category, a severity level, the responsible departments, and fine-grained aspect labels, which makes it suitable for multi-label text classification, severity prediction, and department routing.
## Dataset Summary
University students regularly raise concerns about infrastructure, academics, technical facilities, administration, and finances. This dataset captures those complaints in English and provides structured labels that can be used to train models for automated complaint triage, severity assessment, and routing to the correct department.
**Key characteristics:**
- Collected from students across **6 departments** (CSE, EEE, IPE, BBA, LAW, ENG)
- **5 complaint categories**: Infrastructure, Technical, Academic, Administrative, Finance
- **4 severity levels**: Low, Medium, High, Urgent
- **Multi-label annotations** for responsible departments and complaint aspects
- **Paraphrase grouping** via `Complaint_Group_ID`: paraphrased versions of the same complaint share a group ID, which is useful for paraphrase detection and data augmentation studies
## Supported Tasks
| Task | Description |
| ------------------------------------- | ----------------------------------------------------------------------------------------------------------- |
| **Complaint Category Classification** | Classify complaints into one of 5 categories (Infrastructure, Technical, Academic, Administrative, Finance) |
| **Severity Prediction** | Predict the urgency level (Low, Medium, High, Urgent) |
| **Department Routing** | Predict which department(s) should handle the complaint |
| **Multi-label Aspect Classification** | Identify which aspects a complaint covers (WiFi, Washroom, Classroom, etc.) |
| **Paraphrase Detection** | Leverage `Complaint_Group_ID` to study paraphrase relationships |
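
For the paraphrase detection task, one possible way to derive positive pairs is to pair up every two complaints that share a `Complaint_Group_ID`. The sketch below uses only the standard library and a few hypothetical rows (the texts and IDs are invented for illustration, not taken from the dataset); the helper name `paraphrase_pairs` is also an assumption.

```python
from collections import defaultdict
from itertools import combinations


def paraphrase_pairs(rows):
    """Return (text_a, text_b) pairs for complaints sharing a Complaint_Group_ID."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["Complaint_Group_ID"]].append(row["Complaint_Description"])
    pairs = []
    for texts in groups.values():
        # Every unordered pair inside a group is treated as a positive example.
        pairs.extend(combinations(texts, 2))
    return pairs


# Hypothetical rows for illustration only; not real dataset entries.
toy_rows = [
    {"Complaint_Group_ID": 1, "Complaint_Description": "The AC in room 301 is broken."},
    {"Complaint_Group_ID": 1, "Complaint_Description": "Room 301 air conditioning does not work."},
    {"Complaint_Group_ID": 2, "Complaint_Description": "WiFi drops during lab sessions."},
]

pairs = paraphrase_pairs(toy_rows)
print(len(pairs))  # one pair: the two group-1 complaints
```

Negative pairs could be sampled from complaints in different groups; how many to sample is a modeling choice this card does not prescribe.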
## Data Fields
| Field | Type | Description |
| ------------------------- | ------ | ------------------------------------------------------------------------------------------ |
| `ID` | int | Unique complaint identifier |
| `Timestamp` | string | When the complaint was submitted |
| `Gender` | string | Gender of the student (`Male` / `Female`) |
| `Semester` | float | Academic semester of the student (e.g., `1.2`, `4.1`) |
| `Student_Dept` | string | Department the student belongs to (CSE, EEE, IPE, BBA, LAW, ENG) |
| `Complaint_Description` | string | Free-text complaint written by the student |
| `Category` | string | Complaint category: `Infrastructure`, `Technical`, `Academic`, `Administrative`, `Finance` |
| `Responsible_Departments` | string | Pipe-separated list of departments responsible (e.g., `IT \| Maintenance`) |
| `Primary_Department` | string | The single most relevant department |
| `Aspects` | string | Pipe-separated aspect labels (e.g., `WiFi \| Classroom`) |
| `Severity` | string | Severity level: `Low`, `Medium`, `High`, `Urgent` |
| `Complaint_Group_ID` | int | Groups paraphrased versions of the same complaint |
## Dataset Statistics
### Splits
| Split | Samples |
| --------- | ------- |
| Train | 273 |
| Test | 59 |
| **Total** | **332** |
### Category Distribution
| Category | Count |
| -------------- | ----- |
| Infrastructure | 143 |
| Technical | 65 |
| Academic | 64 |
| Finance | 33 |
| Administrative | 27 |
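
The categories are imbalanced (Infrastructure has about five times as many samples as Administrative), so a classifier may benefit from class weighting. A minimal sketch, assuming the common "balanced" heuristic `total / (num_classes * class_count)`; the counts are copied from the table above and cover the whole dataset, so in practice the weights would more properly be computed from the training split alone.

```python
# Counts copied from the Category Distribution table (full dataset).
category_counts = {
    "Infrastructure": 143,
    "Technical": 65,
    "Academic": 64,
    "Finance": 33,
    "Administrative": 27,
}


def balanced_weights(counts):
    """Weight each class by total / (num_classes * class_count)."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


weights = balanced_weights(category_counts)
for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
```

Rare classes receive weights above 1 and frequent classes below 1, so each class contributes roughly equally to a weighted loss.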
### Severity Distribution
| Severity | Count |
| -------- | ----- |
| Urgent | 108 |
| Medium | 99 |
| High | 98 |
| Low | 27 |
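
Severity can be modeled as plain four-way classification, but the levels also have a natural order. The sketch below assumes the ordering `Low < Medium < High < Urgent` (the order in which the card lists them) and treats the levels as equally spaced, which is a simplification; the helper names are hypothetical.

```python
# Assumed ordering, from least to most severe.
SEVERITY_ORDER = ["Low", "Medium", "High", "Urgent"]
SEVERITY_TO_RANK = {label: rank for rank, label in enumerate(SEVERITY_ORDER)}


def encode_severity(label):
    """Map a severity label to its integer rank (Low=0 ... Urgent=3)."""
    return SEVERITY_TO_RANK[label.strip()]


def severity_mae(true_labels, predicted_labels):
    """Mean absolute difference between true and predicted severity ranks."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    errors = [
        abs(encode_severity(t) - encode_severity(p))
        for t, p in zip(true_labels, predicted_labels)
    ]
    return sum(errors) / len(errors)


# Rank errors are 1, 0, and 2, so the mean is 1.0.
print(severity_mae(["Low", "Urgent", "High"], ["Medium", "Urgent", "Low"]))
```

Unlike accuracy, this metric penalizes predicting `Low` for an `Urgent` complaint more heavily than predicting `High`.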
### Student Department Distribution
| Department | Count |
| ---------- | ----- |
| CSE | 245 |
| EEE | 29 |
| BBA | 15 |
| ENG | 15 |
| IPE | 14 |
| LAW | 14 |
### Top Complaint Aspects
| Aspect | Count |
| ------------- | ----- |
| Electrical/AC | 163 |
| Classroom | 97 |
| Faculty | 37 |
| Washroom | 37 |
| Examination | 35 |
| WiFi | 32 |
| Fee/Payment | 32 |
| Lab Equipment | 28 |
| Projector | 22 |
| Parking | 22 |
### Unique Complaint Groups
There are **186 unique complaint groups** — many original complaints have paraphrased variants, providing a natural resource for studying textual similarity and paraphrase generation.
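
With 332 complaints in 186 groups, a group holds about 1.8 complaints on average, so paraphrases of one complaint could land in both the train and test splits. This card does not say whether the provided splits are group-aware, so it may be worth checking before reporting results. A minimal sketch with hypothetical rows; `shared_group_ids` is an illustrative helper, not part of any library.

```python
def shared_group_ids(train_rows, test_rows, key="Complaint_Group_ID"):
    """Return the set of group IDs that occur in both splits."""
    train_groups = {row[key] for row in train_rows}
    test_groups = {row[key] for row in test_rows}
    return train_groups & test_groups


# Hypothetical rows for illustration only; not real dataset entries.
toy_train = [{"Complaint_Group_ID": 1}, {"Complaint_Group_ID": 2}, {"Complaint_Group_ID": 2}]
toy_test = [{"Complaint_Group_ID": 2}, {"Complaint_Group_ID": 3}]

overlap = shared_group_ids(toy_train, toy_test)
print(sorted(overlap))  # group 2 appears in both toy splits
```

The same function should accept the loaded `train` and `test` splits directly, since iterating over them yields one dictionary per row. A non-empty result would suggest re-splitting by group ID for a stricter evaluation.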
## Usage
```python
from datasets import load_dataset
dataset = load_dataset("alaminxpro/university-students-complaints")
# Access splits
train = dataset["train"]
test = dataset["test"]
# Example: view first complaint
print(train[0]["Complaint_Description"])
print(train[0]["Category"], train[0]["Severity"])
```
### Working with Multi-label Fields
The `Responsible_Departments` and `Aspects` fields use `" | "` (pipe with spaces) as a separator:
```python
aspects = train[0]["Aspects"].split(" | ")
departments = train[0]["Responsible_Departments"].split(" | ")
```
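
For multi-label training, the split labels usually need to be turned into 0/1 indicator vectors. The following is a minimal standard-library sketch; it splits on `|` and trims whitespace so that inconsistent spacing is tolerated. The input strings are hypothetical examples that reuse aspect names from this card, and the helper names are assumptions.

```python
def split_labels(value, sep="|"):
    """Split a pipe-separated field and trim whitespace around each label."""
    return [part.strip() for part in value.split(sep) if part.strip()]


def multi_hot(values):
    """Return (sorted vocabulary, one 0/1 vector per input string)."""
    label_lists = [split_labels(v) for v in values]
    vocab = sorted({label for labels in label_lists for label in labels})
    index = {label: i for i, label in enumerate(vocab)}
    vectors = []
    for labels in label_lists:
        vec = [0] * len(vocab)
        for label in labels:
            vec[index[label]] = 1
        vectors.append(vec)
    return vocab, vectors


# Hypothetical field values for illustration only.
aspects_column = ["WiFi | Classroom", "Washroom", "Classroom | Projector"]
vocab, vectors = multi_hot(aspects_column)
print(vocab)    # ['Classroom', 'Projector', 'Washroom', 'WiFi']
print(vectors)  # [[1, 0, 0, 1], [0, 0, 1, 0], [1, 1, 0, 0]]
```

The vocabulary should be built from the training split only and reused for the test split, so that vector positions stay aligned.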
## Additional Files
- `university_students_complaints.xlsx` — The original dataset in Excel format
## Author
Created by **Md. Al Amin** — [LinkedIn](https://www.linkedin.com/in/alaminxpro/)
## Citation
If you use this dataset in your research or projects, please cite it as:
```bibtex
@dataset{alamin2026university,
title={University Students Complaints Dataset},
author={Md. Al Amin},
year={2026},
url={https://huggingface.co/datasets/alaminxpro/university-students-complaints},
note={ORCID: 0009-0005-5859-3093},
license={CC-BY-4.0}
}
```
## License
This dataset is released under the [Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/) license.
Provider:
alaminxpro



