
SHSK0118/HFTransformersT5Translation_jp-en-ru

Hugging Face · Updated 2026-03-01 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/SHSK0118/HFTransformersT5Translation_jp-en-ru
Resource description:
---
task_categories:
- translation
language:
- ja
- en
- ru
---
<h1>HFTransformersT5Translation_jp-en-ru</h1>
<p>Multilingual Neural Machine Translation using T5 (Hugging Face Transformers) for Japanese, English, and Russian.</p>
<hr>
<h2>Model Description</h2>
<p>This repository provides a multilingual neural machine translation system built using T5 from <a href="https://huggingface.co/docs/transformers/index">Hugging Face Transformers</a>.</p>
<p>The system supports translation between:</p>
<ul>
<li>Japanese (ja)</li>
<li>English (en)</li>
<li>Russian (ru)</li>
</ul>
<hr>
<h2>Supported Language Pairs</h2>
<ul>
<li>ja ↔ en</li>
<li>en ↔ ru</li>
<li>ja ↔ ru</li>
</ul>
<hr>
<h2>Training Data</h2>
<p>The training corpus is publicly available at:</p>
<p><a href="https://huggingface.co/datasets/SHSK0118/HFTransformersT5Translation_jp-en-ru">HFTransformersT5Translation_jp-en-ru Dataset</a></p>
<p>Each corpus is split into:</p>
<ul>
<li>Train (90%)</li>
<li>Validation (5%)</li>
<li>Test (5%)</li>
</ul>
<hr>
<h2>Training Procedure</h2>
<ul>
<li>Model: T5 (Transformers)</li>
<li>Tokenizer: T5Tokenizer</li>
<li>Loss: Cross-entropy</li>
<li>Evaluation Metrics: BLEU, RIBES</li>
</ul>
<hr>
<h2>Evaluation</h2>
<p>Evaluation is performed using corpus-level:</p>
<ul>
<li>BLEU</li>
<li>RIBES</li>
</ul>
<hr>
<h2>Usage</h2>
<pre>
python src/train.py
python src/t5_test.py
python src/translate.py
</pre>
<hr>
<h2>Limitations</h2>
<ul>
<li>Domain-specific corpus</li>
<li>Performance varies across language pairs</li>
<li>Japanese–Russian pair contains fewer training samples</li>
</ul>
<hr>
<h2>Intended Use</h2>
<p>This implementation is intended for:</p>
<ul>
<li>Educational purposes</li>
<li>Research prototyping</li>
<li>Multilingual translation experiments</li>
</ul>
<hr>
<h2>Author</h2>
<p>Independent implementation by Shota Tokunaga.</p>
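The 90% / 5% / 5% split listed under Training Data can be sketched as below. This is a minimal illustration only and is not taken from the repository: the function name `split_corpus`, the fixed shuffle seed, the placeholder sentence pairs, and the choice to give any rounding remainder to the test set are all assumptions.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_corpus(pairs: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle a parallel corpus and split it 90/5/5 (hypothetical helper).

    Integer arithmetic avoids floating-point rounding surprises; any
    remainder left after train and validation goes to the test set.
    """
    items = list(pairs)
    # A seeded local RNG keeps the split reproducible (assumed behaviour).
    random.Random(seed).shuffle(items)

    n = len(items)
    n_train = (n * 90) // 100
    n_valid = (n * 5) // 100

    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]
    return train, valid, test


if __name__ == "__main__":
    # Placeholder source/target pairs standing in for one language pair.
    pairs = [(f"src-{i}", f"tgt-{i}") for i in range(1000)]
    train, valid, test = split_corpus(pairs)
    print(len(train), len(valid), len(test))  # 900 50 50
```

Since the card says each corpus is split separately, a helper like this would be applied once per language pair, so that smaller pairs such as Japanese–Russian keep the same proportions.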
Provider:
SHSK0118