
Jayveersinh-Raj/Gujarati-correct-incorrect-sent

Source: Hugging Face | Updated: 2023-07-21 | Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/Jayveersinh-Raj/Gujarati-correct-incorrect-sent
Resource description:
---
license: other
task_categories:
- text-generation
- zero-shot-classification
- text2text-generation
language:
- gu
tags:
- medical
- chemistry
- biology
- finance
- legal
- music
- art
- code
- climate
pretty_name: Gujarati grammarly
size_categories:
- 100K<n<1M
---

# Description
This is an artificially generated list of correct-incorrect sentence pairs for Gujarati. It can be used for sentence or spelling correction.

# Use Case
1. Query correction
2. Prompt correction for language models
3. Zero-shot correction for better translation
4. Zero-shot transfer of the same correction tasks to other languages by training on this dataset, as long as the model supports both languages. For example, training an XLM-R shared encoder-decoder model or prompt tuning a language model on this dataset and then running inference in Italian or any other supported language, which removes the need to have or generate such a dataset for the target language.
Provider:
Jayveersinh-Raj
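Summary of original information

Dataset overview

Basic information

  • License: other
  • Task categories:
    • text-generation
    • zero-shot-classification
    • text2text-generation
  • Language: gu
  • Tags:
    • medical
    • chemistry
    • biology
    • finance
    • legal
    • music
    • art
    • code
    • climate
  • Name: Gujarati grammarly
  • Size: 100K<n<1M

Description

This is an artificially generated list of correct-incorrect sentence pairs for Gujarati, intended for sentence or spelling correction.

Use cases

  1. Query correction
  2. Prompt correction for language models
  3. Zero-shot correction for better translation
  4. Zero-shot transfer: training a model on this dataset to enable the same correction tasks in any other language the model supports. For example, training an XLM-R shared encoder-decoder model or prompt tuning a language model on this dataset and then running inference in Italian or another supported language, which removes the need for such a dataset in the target language.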
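
The text2text use cases above imply turning each correct-incorrect pair into a source/target training example. The following is a minimal sketch of that step, not the author's pipeline. The field names `incorrect` and `correct` and the `correct: ` task prefix are assumptions made for illustration, since the card does not document the dataset's schema; the sample rows are English placeholders standing in for Gujarati sentences.

```python
from typing import Dict, Iterable, List

# Hypothetical field names: the dataset card does not document the schema.
INCORRECT_KEY = "incorrect"
CORRECT_KEY = "correct"
# Hypothetical task prefix for a text2text model.
PREFIX = "correct: "


def to_text2text(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Turn correct/incorrect pairs into source/target training examples."""
    examples = []
    for rec in records:
        source = rec[INCORRECT_KEY].strip()
        target = rec[CORRECT_KEY].strip()
        if not source or not target:
            continue  # skip rows where either side is empty
        examples.append({"source": PREFIX + source, "target": target})
    return examples


# Placeholder strings stand in for real Gujarati sentences.
sample = [
    {"incorrect": " sentence with eror ", "correct": "sentence with error"},
    {"incorrect": "", "correct": "dropped because the input is empty"},
]
examples = to_text2text(sample)
```

With the Hugging Face `datasets` library installed, the real records could presumably be obtained with `load_dataset("Jayveersinh-Raj/Gujarati-correct-incorrect-sent")`; the split and column names would need to be checked against the actual files before adapting the keys above.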