FeniVerse: A Parallel Corpus of Feni Dialect, Standard Bengali, and English
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/25r923frfj
Description:
FeniVerse is an openly accessible trilingual parallel corpus containing 4,094 aligned sentences each in English, Standard Bangla, and the Feni dialect (12,282 total sentences). Each entry is sentence-aligned across the three languages, enabling machine translation, dialect classification, cross-linguistic analysis, and other NLP research.
The dataset is provided as a ZIP file named “FeniVerse Parallel Corpus”, which contains two main files:
FeniVerse_Dataset.csv – with three columns: English, Standard Bangla, Feni Dialect
FeniVerse_Dataset.xlsx – with the same three columns
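A minimal sketch of reading the CSV into aligned triples using only the Python standard library. The file name and the three column names come from the description above. The UTF-8 encoding, the exact header spelling in the real file, and the helper names `load_feniverse` and `translation_pairs` are assumptions made for illustration.

```python
import csv
from pathlib import Path

# Column names as listed in the dataset description; the exact header
# spelling in the actual file is an assumption.
COLUMNS = ("English", "Standard Bangla", "Feni Dialect")


def load_feniverse(csv_path):
    """Return a list of (english, bangla, feni) tuples, one per aligned row."""
    # utf-8-sig tolerates an optional byte-order mark; the encoding is assumed.
    with Path(csv_path).open(newline="", encoding="utf-8-sig") as handle:
        reader = csv.DictReader(handle)
        missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
        if missing:
            raise ValueError(f"missing expected columns: {missing}")
        return [tuple((row[c] or "").strip() for c in COLUMNS) for row in reader]


def translation_pairs(triples, source, target):
    """Project the triples onto one (source, target) pair, skipping empty cells."""
    src, tgt = COLUMNS.index(source), COLUMNS.index(target)
    return [(t[src], t[tgt]) for t in triples if t[src] and t[tgt]]
```

After extracting the ZIP, `load_feniverse("FeniVerse_Dataset.csv")` should return one tuple per aligned entry, and `translation_pairs(rows, "Feni Dialect", "English")` yields sentence pairs for a single translation direction.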
This is the first publicly available dataset for the Feni dialect, offering an authentic, manually curated, and sentence-aligned resource for linguistic and computational experiments.
Created:
2025-09-11



