faizmubeen/indian_railways_complaint_tweets
Source: Hugging Face. Updated 2026-03-27; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/faizmubeen/indian_railways_complaint_tweets
Resource description:
---
license: mit
language:
- en
size_categories:
- 1K<n<10K
---
This dataset contains roughly 4500 tweets mentioning the Indian Railways. It was created to help researchers and developers train natural language processing models for classifying passenger feedback, analyzing sentiment, and categorizing railway complaints. The data captures genuine passenger experiences including urgent grievances, infrastructure problems, news sharing, and general appreciation.
## Structure of dataset:
The dataset is structured in a simple tabular format with three primary columns:
- `Tweet`: The raw text of the user's tweet.
- `Label`: A binary classification indicating whether the tweet is a complaint or not.
- `Type`: A more granular, multi-class label indicating the specific domain, issue, or topic of the tweet.
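
As a minimal sketch of the three-column layout, the snippet below parses a small CSV string and checks that the documented columns are present. The CSV format, the exact header spellings (`Tweet`, `Label`, `Type`), and the two sample rows are assumptions made for illustration; the rows are not taken from the dataset.

```python
import csv
import io

# Hypothetical rows written for illustration; they are not taken from the dataset.
SAMPLE_CSV = """Tweet,Label,Type
"Train running 3 hours late, no update at the station",complaint,Punctuality
"Thanks to the staff for the quick help today",non_complaint,Appreciation
"""

# Column names as described in this card (exact spelling is an assumption).
EXPECTED_COLUMNS = ["Tweet", "Label", "Type"]


def read_rows(text):
    """Parse CSV text and check that the three documented columns are present."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


rows = read_rows(SAMPLE_CSV)
for row in rows:
    print(row["Label"], "|", row["Type"], "|", row["Tweet"])
```

The same check can be pointed at the real file once it is downloaded, provided the file is actually distributed as CSV with these headers.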
## Label Descriptions:
1. Primary Intent (Label column)
This column serves as a high-level binary indicator for the user's intent:
🔴 `complaint`: Tweets where passengers are actively expressing dissatisfaction, reporting a problem, or seeking a resolution from railway authorities. Examples include reporting delayed trains, dirty coaches, rude staff, or booking failures.
🟢 `non_complaint`: Tweets that are informational, appreciative, neutral observations, news sharing, policy updates, or general inquiries without any underlying grievance.
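
For a binary classifier, the two label strings usually need to become integers. The sketch below shows one possible encoding; the 0/1 assignment and the tolerance for spelling variants such as `non-complaint` are assumptions of this example, not something the dataset specifies.

```python
# Assumed integer encoding for the two values of the Label column.
LABEL_TO_ID = {"non_complaint": 0, "complaint": 1}


def encode_label(raw):
    """Map a raw Label string to 0 (non_complaint) or 1 (complaint)."""
    # Tolerate stray whitespace, case differences, and hyphen/space variants.
    key = raw.strip().lower().replace("-", "_").replace(" ", "_")
    if key not in LABEL_TO_ID:
        raise ValueError(f"unexpected label: {raw!r}")
    return LABEL_TO_ID[key]


print(encode_label("complaint"), encode_label("non_complaint"))
```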
2. Issue Categorization (Type column)
This column categorizes the tweets into specific operational and service domains. While the dataset contains a "long tail" of highly specific user-generated sub-labels, the vast majority of tweets fall into the following core categories:
### Operations & Infrastructure
- Punctuality: The most common issue; involves delayed trains, excessive waiting times, and schedule disruptions.
- Cleanliness: Issues related to the hygiene of coaches, toilets, platforms, and pest control (e.g., cockroaches, rodents).
- Maintenance: Structural or mechanical issues with the train, such as broken windows, leaking roofs, or noisy/damaged berths.
- Electrical Equipment: Complaints about non-functional fans, Air Conditioning (AC), lights, or charging sockets inside the train.
- Water Availability: Lack of drinking water at stations or empty water tanks in train washrooms.
### Service & Staff
- Staff Behaviour: Feedback (usually negative) regarding the attitude, responsiveness, or professionalism of railway personnel (TTEs, RPF, cleaning staff).
- Corruption / Bribery: Serious allegations of staff demanding bribes, ticketless travel facilitated by examiners, or ticketing scams.
- Service / Facility: General remarks on the quality of service provided, station waiting rooms, platform seating, or the overall journey experience.
### Passenger Experience & Amenities
- Security: Concerns regarding passenger safety, theft, unauthorized personnel in reserved coaches, or requests for the Railway Protection Force (RPF).
- Crowding: Complaints about severe overcrowding, especially un-ticketed passengers forcefully entering reserved or AC compartments.
- Bed Roll: Issues concerning the availability, cleanliness, or missing items in the linen/bedroll provided in AC coaches.
- Catering & Vending Services: Feedback on the quality, overcharging, availability, or hygiene of the food and pantry services on the train.
- Ticket Booking: Problems faced while using the IRCTC website/app, payment gateway failures, or refund delays.
### General & Informational
- Information: Tweets sharing railway news, updates, schedule changes, or users asking basic queries (e.g., PNR status).
- Appreciation: Positive feedback thanking the railways, the ministry, or specific staff for a good initiative, quick resolution, or a pleasant journey.
- Miscellaneous / Others: Tweets that fall outside standard categories, covering highly specific personal issues, unique requests, or general commentary.
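
Because the `Type` column has a long tail of rare sub-labels, it can help to fold them into the core categories before training a multi-class model. The sketch below is one way to do that, assuming the core category strings appear in the data exactly as they are written in this card (up to case and whitespace); anything unrecognized is mapped to `Miscellaneous / Others`. The sample values are made up.

```python
from collections import Counter

# Core categories as named in this card; the exact strings in the data are an assumption.
CORE_TYPES = [
    "Punctuality", "Cleanliness", "Maintenance", "Electrical Equipment",
    "Water Availability", "Staff Behaviour", "Corruption / Bribery",
    "Service / Facility", "Security", "Crowding", "Bed Roll",
    "Catering & Vending Services", "Ticket Booking", "Information",
    "Appreciation", "Miscellaneous / Others",
]
FALLBACK = "Miscellaneous / Others"

# Case-insensitive lookup from a raw value to its canonical spelling.
_CANONICAL = {name.casefold(): name for name in CORE_TYPES}


def normalize_type(raw):
    """Return the matching core category, or the fallback for long-tail labels."""
    return _CANONICAL.get(raw.strip().casefold(), FALLBACK)


def type_counts(types):
    """Count tweets per core category after normalization."""
    return Counter(normalize_type(t) for t in types)


# Made-up Type values, including one long-tail label.
counts = type_counts(["Punctuality", " punctuality ", "Bed Roll", "wifi not working"])
print(counts.most_common())
```

A stricter variant could raise on unknown labels instead of folding them, which makes it easier to inspect what the long tail actually contains.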
## Potential Use Cases
1. Customer Support Automation: Training conversational AI to automatically route specific complaints (e.g., Electrical Equipment vs. Security) to the correct railway department.
2. Grievance Prioritization: Building triage models to instantly flag urgent, life-safety issues (like Security or Medical emergencies) over routine complaints (like Bed Roll).
3. Public Sentiment & Operations Analysis: Tracking public perception of the Indian Railways service quality over time or identifying recurring bottlenecks in specific zones.
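
The routing and prioritization use cases can be sketched with a plain lookup on the two label columns, standing in for the predictions of a trained model. Everything specific here is hypothetical: the queue names, the choice of `Security` as the only urgent type, and the sample tickets are assumptions of this example.

```python
# Hypothetical routing table: queue names are invented for this sketch.
ROUTES = {
    "Electrical Equipment": "electrical-maintenance",
    "Security": "railway-protection-force",
    "Bed Roll": "onboard-housekeeping",
    "Cleanliness": "onboard-housekeeping",
}
DEFAULT_QUEUE = "general-helpdesk"

# Assumption: only Security is treated as urgent in this sketch.
URGENT_TYPES = {"Security"}


def triage(label, issue_type):
    """Return (queue, priority) for one labelled tweet, or None for non-complaints."""
    if label != "complaint":
        return None
    queue = ROUTES.get(issue_type, DEFAULT_QUEUE)
    priority = "urgent" if issue_type in URGENT_TYPES else "routine"
    return queue, priority


# Made-up (Label, Type) pairs.
tickets = [
    ("complaint", "Bed Roll"),
    ("complaint", "Security"),
    ("non_complaint", "Appreciation"),
]

# Drop non-complaints, then sort so that urgent complaints come first.
routed = sorted(
    (t for t in (triage(label, kind) for label, kind in tickets) if t is not None),
    key=lambda item: item[1] != "urgent",
)
print(routed)
```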
Provider:
faizmubeen



