
muhammadravi251001/translated-indo-nli

Hugging Face · Updated 2023-02-23 · Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/muhammadravi251001/translated-indo-nli
Description:
---
tags:
- translated-indonli
license: bigscience-openrail-m
datasets:
- indonli
---

In this repository, I downloaded and processed the `translate_train.tar.gz` file found here: `https://github.com/ir-nlp-csui/indonli/tree/main/data`

How to use? As simple as this:

```python
# Notebook shell commands: fetch the dev and train splits.
!wget https://huggingface.co/datasets/muhammadravi251001/translated-indo-nli/raw/main/dev.jsonl
!wget https://huggingface.co/datasets/muhammadravi251001/translated-indo-nli/resolve/main/train.jsonl

import pandas as pd

# Each line of a .jsonl file is one JSON record, hence lines=True.
data_train_translated_indonli = pd.read_json(path_or_buf='train.jsonl', lines=True)
data_dev_translated_indonli = pd.read_json(path_or_buf='dev.jsonl', lines=True)
```

Voila~!

## Reference

The dataset I used is by IndoNLI.

```
@inproceedings{indonli,
    title = "IndoNLI: A Natural Language Inference Dataset for Indonesian",
    author = "Mahendra, Rahmad and Aji, Alham Fikri and Louvan, Samuel and Rahman, Fahrurrozi and Vania, Clara",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    publisher = "Association for Computational Linguistics",
}
```
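The `!wget` lines in the usage snippet only run inside a notebook. Outside one, the same files could be fetched from plain Python, roughly as sketched below. The helper names `file_url` and `download` are hypothetical and not part of the dataset card. The sketch also assumes that the `resolve/main` URL form, which the card uses for `train.jsonl`, works for `dev.jsonl` as well.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Repository id taken from the usage snippet in the dataset card.
REPO_ID = "muhammadravi251001/translated-indo-nli"
BASE_URL = "https://huggingface.co/datasets"


def file_url(filename: str, revision: str = "main") -> str:
    """Build the direct download URL for one file (hypothetical helper)."""
    return f"{BASE_URL}/{REPO_ID}/resolve/{revision}/{filename}"


def download(filename: str, target_dir: str = ".") -> Path:
    """Fetch one file into target_dir, skipping the request if it already exists."""
    target = Path(target_dir) / filename
    if not target.exists():
        urlretrieve(file_url(filename), str(target))
    return target


# Example (needs network access):
#   for name in ("train.jsonl", "dev.jsonl"):
#       download(name)
```

Skipping files that already exist keeps repeated runs from downloading the training split again.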
Provider:
muhammadravi251001
Summary of original information

Dataset overview

Tags

  • Tags:
    • translated-indonli
  • License:
    • bigscience-openrail-m
  • Datasets:
    • indonli

Data source

  • Data file: translate_train.tar.gz
  • Source link: https://github.com/ir-nlp-csui/indonli/tree/main/data

Usage

  • Download the data (notebook shell commands): `!wget https://huggingface.co/datasets/muhammadravi251001/translated-indo-nli/raw/main/dev.jsonl` and `!wget https://huggingface.co/datasets/muhammadravi251001/translated-indo-nli/resolve/main/train.jsonl`

  • Load the data (Python): `import pandas as pd`, then `data_train_translated_indonli = pd.read_json(path_or_buf='train.jsonl', lines=True)` and `data_dev_translated_indonli = pd.read_json(path_or_buf='dev.jsonl', lines=True)`
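The `lines=True` argument tells pandas that each line of the file is a separate JSON object (the JSON Lines format). As a rough illustration of what that means, here is a standard-library sketch that reads such a file and counts label values. The function names `read_jsonl` and `summarize` are hypothetical, and the `label` field name is an assumption; the source does not list the dataset's columns.

```python
import json
from collections import Counter


def read_jsonl(path):
    """Parse a JSON Lines file: one JSON object per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                records.append(json.loads(line))
    return records


def summarize(records, label_key="label"):
    """Count rows and label values; label_key is an assumed field name."""
    labels = Counter(record.get(label_key) for record in records)
    return {"rows": len(records), "labels": dict(labels)}


# Example, once train.jsonl has been downloaded:
#   print(summarize(read_jsonl("train.jsonl")))
```

A quick summary like this can confirm that a download is complete before training on it. Records without the assumed field are counted under `None`.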

References

  • Dataset source: IndoNLI

  • Reference:

    @inproceedings{indonli,
        title = "IndoNLI: A Natural Language Inference Dataset for Indonesian",
        author = "Mahendra, Rahmad and Aji, Alham Fikri and Louvan, Samuel and Rahman, Fahrurrozi and Vania, Clara",
        booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
        month = nov,
        year = "2021",
        publisher = "Association for Computational Linguistics",
    }
