AMBILE_Shah_Jo_Risalo_Labeled
Source: Figshare. Updated 2025-09-08. Indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/AMBILE_Shah_Jo_Risalo_Labeled/30073474
Resource description:
AMBILE Shah Jo Risalo

Developed by: Abdul Majid Bhurgri Institute of Language Engineering (AMBILE), Hyderabad
Under the administrative control of the Culture, Tourism, Antiquities & Archives Department, Government of Sindh

Dataset Overview

The "Shah Jo Risalo" dataset is a linguistic and literary resource containing 4,767 Sindhi poetic verses drawn from the 30 traditional Surs (sections) of the magnum opus of Shah Abdul Latif Bhittai. Each verse, written in the Perso-Arabic Sindhi script, is paired with renderings in Roman and Devanagari script, an explanation in Sindhi, and translations into English (by Amar Fayaz Buriro), Urdu (by Agha Saleem), and Punjabi (by Kartar Singh Arsh). The dataset reflects the philosophical, spiritual, and cultural dimensions of the poetry and is intended for researchers, linguists, educators, and developers working on Sindhi literature and AI/NLP applications.

Dataset Features

Total Verses: 4,767 poetic lines from 30 classical Surs
Language: Clean Sindhi script in Unicode format
File Format: CSV file titled Bhittaipedia Risalo -(25-08-25).csv

CSV Structure

The dataset is organized into the following fields:

Row_ID: Unique identifier for each row
Melody Number: Identifier for the melody associated with the verse
Melody (سر): Name of the Sur
Chapter Number: Number of the chapter (Dastan) within the Sur
Chapter (داستان): Subsection within the Sur
Type: Type or category of the verse
Bait / Vaayi Number: Number associated with the poetic form
Sindhi Arabic Perso: Original Sindhi poetic verse
Roman Script: Sindhi verse in Roman script
Devanagri Script: Sindhi verse in Devanagari script
Explanation: Sindhi interpretation of the verse
English Translation: Translated by Amar Fayaz Buriro
Urdu Translation: Translated by Agha Saleem
Punjabi Translation: Translated by Kartar Singh Arsh
Keywords: Search-optimized terms for easier data retrieval

Applications

This dataset can support a variety of applications, including but not limited to:

Natural Language Processing (NLP) research on the Sindhi language
Development of AI-powered Sindhi chatbots and conversational agents
Creation of educational tools for literature learning
Text-to-Speech (TTS) system training
Verse classification and sentiment analysis projects
Digital preservation and promotion of Sindhi literary heritage

Data Source

The dataset is sourced from the AMBILE Bhittaipedia project, which aims to digitize and preserve the cultural heritage of Sindhi literature.

How to Use

1. Clone the repository or download the CSV file.
2. Open the CSV file using Python or Excel:

    import pandas as pd

    df = pd.read_csv("Bhittaipedia Risalo -(25-08-25).csv")
    print(df.head())

License

This dataset is released under the Creative Commons Attribution-NonCommercial 4.0 License. It is intended for educational and research purposes only.

Acknowledgments

Special thanks to the AMBILE team for their involvement in data compilation and cleaning.

Contact

For any queries, collaboration opportunities, or contributions, please contact:
Email: datasets@sindh.ai
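
Beyond printing the first rows, a common first step is to group the verses by Sur. The following is a minimal sketch using only Python's standard csv module rather than pandas. It assumes the header for the Sur name is spelled exactly "Melody (سر)" as in the field list, and that the file is UTF-8 encoded; neither has been checked against the actual file. The helper names read_rows, verses_per_sur, and rows_for_sur are hypothetical and not part of the dataset.

```python
import csv
from collections import Counter

# Header name taken from the field list in the description; the exact
# spelling in the real file is an assumption.
SUR_COLUMN = "Melody (سر)"


def read_rows(path, encoding="utf-8-sig"):
    """Read the CSV into a list of dicts, one per verse row.

    The encoding is an assumption: utf-8-sig reads plain UTF-8 and also
    drops a leading byte-order mark if one is present.
    """
    with open(path, newline="", encoding=encoding) as handle:
        return list(csv.DictReader(handle))


def verses_per_sur(rows):
    """Count how many rows belong to each Sur name."""
    return Counter(row[SUR_COLUMN] for row in rows)


def rows_for_sur(rows, sur_name):
    """Return the rows whose Sur name matches sur_name exactly."""
    return [row for row in rows if row[SUR_COLUMN] == sur_name]
```

Calling read_rows("Bhittaipedia Risalo -(25-08-25).csv") and passing the result to verses_per_sur would give a per-Sur row count, which can be compared against the stated total of 4,767 verses as a sanity check on the download.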
Created:
2025-09-08



