ahnafch01/Bangladesh_Locations

Source: Hugging Face. Updated 2025-12-09. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/ahnafch01/Bangladesh_Locations
Description:
---
license: mit
language:
- en
- bn
tags:
- location
- Postal
- Bangladesh
- Division
- District
- Upazilla
size_categories:
- 1K<n<10K
---

# Bangladesh Postcodes Dataset (Bilingual & Structured)

A comprehensive, cleaned, and bilingual (English & Bangla) database of postal codes in Bangladesh. This dataset covers the full administrative hierarchy: **Division > District > Thana (Upazila) > Post Office**.

## 📂 Files Included

| Filename | Format | Description |
| :--- | :--- | :--- |
| `bangladesh_postcodes_final.csv` | **CSV** | The master dataset with all columns. Best for data analysis or database imports. |
| `bangladesh_postcodes_flat.json` | **JSON** | A flat array of objects. Best for frontend applications (React/Vue). |
| `address_lookup_db.json` | **JSON** | **Optimized Hash Map.** Keys are keywords (PostCode, English Name, Bangla Name) pointing to location data. Best for backend search logic. |

## 📊 Data Schema

The dataset contains the following columns:

| Column Name | Data Type | Description | Example |
| :--- | :--- | :--- | :--- |
| `Division_ID` | Integer | Administrative ID for the Division. | `1` |
| `District_ID` | Integer | Administrative ID for the District. | `1` |
| `District_Name` | String | Name of the District (English). | `Dhaka` |
| `District_Name_BN` | String | Name of the District (Bangla). | `ঢাকা` |
| `Thana_ID` | Integer | Administrative ID for the Police Station/Upazila. | `16` |
| `Thana_Name` | String | Name of the Upazila/Thana (English). | `Banani` |
| `Thana_Name_BN` | String | Name of the Upazila/Thana (Bangla). | `বনানী` |
| `PostOffice_Name` | String | Name of the specific Post Office (English). | `Banani` |
| `PostOffice_Name_BN` | String | Name of the specific Post Office (Bangla). | `বনানী` |
| `Office_Type` | String | Administrative classification (see below). | `TSO` |
| `PostCode` | Integer | The 4-digit postal code. | `1213` |
| `Slug` | String | URL-friendly identifier. | `banani-tso-1213` |

### 🏢 Office Types Meaning

The `Office_Type` column was extracted from the original post office names so that the name columns contain only the place name.

* **GPO:** General Post Office (Head of Region)
* **HO:** Head Office (Head of District)
* **SO:** Sub Office (Upazila Level)
* **TSO:** Town Sub Office (City Area)
* **EDBO/EDSO:** Extra Departmental Branch Office / Sub Office (Rural/Village Agent-run)

## 🛠 Methodology

1. **Scraping:** Data was gathered hierarchically from publicly available postal data sources.
2. **Cleaning:**
    * Removed suffixes (e.g., "Banani TSO" > "Banani").
    * Extracted Administrative Types into a separate column.
3. **Translation (AI):**
    * English to Bangla translation was performed using **OpenAI GPT-4o-mini**.
    * Context-aware translation was applied (e.g., "Hat" > "হাট", not "হ্যাট").
    * Acronyms like "EPZ" and "BAF" were transliterated phonetically.
4. **Validation:**
    * Regex checks were run to ensure no English characters remained in Bangla columns.
    * Logic checks were run to ensure Thanas map correctly to Districts.

## ⚠️ Known Limitations

* **AI Translations:** While 99% accurate, minor phonetic errors may exist in the Bangla spellings.
* **Dynamic Data:** Postal codes change occasionally. This dataset represents a snapshot as of **December 2025**.

## 📄 License

This dataset is provided "as is" for educational and development purposes. The underlying factual data (Post Codes) belongs to the public domain/Bangladesh Post Office.

**Generated by:** Asif Ahnaf Chowdhury

**Tools Used:** Python, Pandas, OpenAI API.
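
The cleaning and validation steps described under Methodology, and the keyword lookup described for `address_lookup_db.json`, can be illustrated with a minimal Python sketch. This is not the author's actual pipeline: the function names (`split_office_type`, `make_slug`, `has_latin`, `build_lookup`), the exact regular expressions, and the shape of the lookup map (each key pointing to a list of matching records) are all assumptions made for illustration. Only the column names, the office-type abbreviations, and the Banani example row come from the dataset card.

```python
import re

# Office-type suffixes listed in the dataset card. The matching rule below
# (suffix is the last whitespace-separated token) is an assumption.
_SUFFIX_RE = re.compile(r"^(?P<name>.+?)\s+(?P<type>GPO|HO|TSO|SO|EDBO|EDSO)$")

# Any basic Latin letter counts as "English" for the validation check.
_LATIN_RE = re.compile(r"[A-Za-z]")


def split_office_type(raw_name):
    """Split 'Banani TSO' into ('Banani', 'TSO'); return (name, None) if no suffix."""
    cleaned = raw_name.strip()
    match = _SUFFIX_RE.match(cleaned)
    if match:
        return match.group("name"), match.group("type")
    return cleaned, None


def make_slug(name, office_type, postcode):
    """Build a URL-friendly identifier such as 'banani-tso-1213' (hypothetical rule)."""
    parts = [name, office_type or "", str(postcode)]
    text = "-".join(p for p in parts if p).lower()
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", text).strip("-")


def has_latin(text):
    """Return True if a Bangla-column value still contains Latin letters."""
    return bool(_LATIN_RE.search(text))


def build_lookup(records):
    """Map postcode, English name, and Bangla name to the matching records.

    The real layout of address_lookup_db.json is not documented in the card;
    a dict of lists is assumed here so that shared keys do not overwrite
    each other.
    """
    lookup = {}
    for rec in records:
        keys = (
            str(rec["PostCode"]),
            rec["PostOffice_Name"].lower(),
            rec["PostOffice_Name_BN"],
        )
        for key in keys:
            lookup.setdefault(key, []).append(rec)
    return lookup


# Example row assembled from the "Example" column of the schema table.
name, office_type = split_office_type("Banani TSO")
sample = {
    "District_Name": "Dhaka",
    "Thana_Name": "Banani",
    "PostOffice_Name": name,
    "PostOffice_Name_BN": "বনানী",
    "Office_Type": office_type,
    "PostCode": 1213,
    "Slug": make_slug(name, office_type, 1213),
}
lookup = build_lookup([sample])
```

With this sketch, `lookup["1213"]`, `lookup["banani"]`, and `lookup["বনানী"]` all return the same record, which mirrors the idea of searching by postcode, English name, or Bangla name. Note that listing `TSO` before `SO` in the suffix pattern is only for readability; the required whitespace before the suffix already prevents a partial match.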
Provider:
ahnafch01