FeniVerse: A Parallel Corpus of Feni Dialect, Standard Bengali, and English

NIAID Data Ecosystem, indexed 2026-05-10

Download link:

https://data.mendeley.com/datasets/25r923frfj

Description:

FeniVerse is an openly accessible trilingual parallel corpus containing 4,094 aligned sentences in each of English, Standard Bangla, and the Feni dialect (12,282 sentences in total). Each entry is sentence-aligned across the three languages, enabling machine translation, dialect classification, cross-linguistic analysis, and other NLP research.

The dataset is provided as a ZIP file named “FeniVerse Parallel Corpus”, which contains two main files: FeniVerse_Dataset.csv, with three columns (English, Standard Bangla, Feni Dialect), and FeniVerse_Dataset.xlsx, with the same three columns.

This is the first publicly available dataset for the Feni dialect, offering an authentic, manually curated, and sentence-aligned resource for linguistic and computational experiments.
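As a rough illustration of how the three-column CSV might be consumed, the sketch below reads aligned rows and extracts translation pairs for one language direction. The file name and column names come from the description above. The helper names (`load_rows`, `load_file`, `translation_pairs`), the UTF-8 encoding, and the exact header spelling are assumptions that have not been verified against the published files. The demo rows are placeholders, not corpus content.

```python
import csv
import io

# Column names as given in the dataset description; the exact header
# spelling in the real CSV is an assumption.
COLUMNS = ("English", "Standard Bangla", "Feni Dialect")


def load_rows(handle):
    """Read aligned rows from an open CSV handle.

    Rows with any empty cell are skipped so every returned row is a
    complete three-way alignment.
    """
    reader = csv.DictReader(handle)
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for record in reader:
        cells = {c: (record[c] or "").strip() for c in COLUMNS}
        if all(cells.values()):
            rows.append(cells)
    return rows


def load_file(path):
    """Open a CSV file (assumed UTF-8, optional BOM) and load its rows."""
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return load_rows(handle)


def translation_pairs(rows, source, target):
    """Return (source, target) sentence pairs for one language direction."""
    return [(row[source], row[target]) for row in rows]


if __name__ == "__main__":
    # Placeholder rows standing in for real corpus lines.
    demo = io.StringIO(
        "English,Standard Bangla,Feni Dialect\n"
        "sentence 1 (en),sentence 1 (bn),sentence 1 (feni)\n"
        "sentence 2 (en),,sentence 2 (feni)\n"
    )
    rows = load_rows(demo)
    print(len(rows), "complete row(s)")
    print(translation_pairs(rows, "Standard Bangla", "Feni Dialect"))
```

With the real archive unpacked, `load_file("FeniVerse_Dataset.csv")` would be the entry point. If the description's counts hold and no cells are empty, it should return 4,094 rows.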

Created:

2025-09-11
