hieunguyen1053/en-to-vi-formal-informal-tranlations
Source: Hugging Face. Updated: 2023-10-30. Indexed: 2024-06-15.
Download link:
https://hf-mirror.com/datasets/hieunguyen1053/en-to-vi-formal-informal-tranlations
Description:
---
dataset_info:
  features:
  - name: en
    dtype: string
  - name: vi
    dtype: string
  - name: fewshot_samples
    list:
    - name: en
      dtype: string
    - name: vi
      dtype: string
  splits:
  - name: val
    num_bytes: 178154
    num_examples: 160
  - name: test
    num_bytes: 175339
    num_examples: 160
  download_size: 124988
  dataset_size: 353493
---
# Few-shot Translation
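The `dataset_info` metadata above declares that each record holds an `en` string, a `vi` string, and a `fewshot_samples` list of `en`/`vi` pairs. As a rough illustration of how such a record could be turned into a few-shot translation prompt, here is a minimal sketch. The function name `build_fewshot_prompt`, the `English:`/`Vietnamese:` prompt layout, and the example sentences are all assumptions made for illustration; the evaluation harness may format its prompts differently.

```python
def build_fewshot_prompt(record: dict) -> str:
    """Format the few-shot pairs first, then the source sentence to translate."""
    lines = []
    for shot in record["fewshot_samples"]:
        lines.append(f"English: {shot['en']}")
        lines.append(f"Vietnamese: {shot['vi']}")
        lines.append("")  # blank line between demonstrations
    lines.append(f"English: {record['en']}")
    lines.append("Vietnamese:")  # the model is expected to continue from here
    return "\n".join(lines)


# Hypothetical record shaped like the declared features (not real dataset content).
example = {
    "en": "Thank you.",
    "vi": "Cảm ơn bạn.",
    "fewshot_samples": [
        {"en": "Hello.", "vi": "Xin chào."},
    ],
}

prompt = build_fewshot_prompt(example)
print(prompt)
```

The reference translation in `record["vi"]` is deliberately left out of the prompt, since it is what the model output would be scored against.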
## Install
To install `lm-eval` from the main branch of the GitHub repository, run:
```bash
git clone https://github.com/hieunguyen1053/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .
```
## Basic Usage
> **Note**: When reporting results from eval harness, please include the task versions (shown in `results["versions"]`) for reproducibility. This allows bug fixes to tasks while also ensuring that previously reported scores are reproducible. See the [Task Versioning](#task-versioning) section for more info.
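
To make the note above concrete, the sketch below pairs each reported score with its task version. Only the `results["versions"]` key is taken from the text; the `results["results"]` layout, the helper name `format_report`, the metric name `bleu`, and every number are placeholder assumptions, not real evaluation output.

```python
def format_report(results: dict) -> list:
    """Return one line per task metric, tagged with the task version."""
    report = []
    for task, metrics in sorted(results["results"].items()):
        version = results["versions"][task]
        for metric, value in sorted(metrics.items()):
            report.append(f"{task} (v{version}) {metric}: {value:.4f}")
    return report


# Hypothetical results dictionary; metric name and values are placeholders.
results = {
    "results": {"translation_vi": {"bleu": 0.0}},
    "versions": {"translation_vi": 0},
}

for line in format_report(results):
    print(line)
```

Keeping the version next to the score means a later bug fix to the task, which would bump the version, cannot be silently confused with an earlier number.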
### Hugging Face `transformers`
To evaluate a model hosted on the [Hugging Face Hub](https://huggingface.co/models) (e.g. `vlsp-2023-vllm/hoa-1b4`) on `translation_vi`, you can use the following command:
```bash
python main.py \
--model hf-causal \
--model_args pretrained=vlsp-2023-vllm/hoa-1b4 \
--tasks translation_vi \
--batch_size auto \
--device cuda:0
```
Additional arguments can be provided to the model constructor using the `--model_args` flag. Most notably, this supports the common practice of using the `revisions` feature on the Hub to store partially trained checkpoints, or to specify the datatype for running a model:
```bash
python main.py \
--model hf-causal \
--model_args pretrained=vlsp-2023-vllm/hoa-1b4,revision=step100000,dtype="float" \
--tasks translation_vi \
--device cuda:0
```
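The `--model_args` value is a comma-separated list of `key=value` pairs. The sketch below shows one plausible way such a string maps to keyword arguments for a model constructor. The function `parse_model_args` is hypothetical and written for illustration only; the harness's own parsing may differ, for example in how it converts value types.

```python
def parse_model_args(arg_string: str) -> dict:
    """Split 'k1=v1,k2=v2' into {'k1': 'v1', 'k2': 'v2'} (all values stay strings)."""
    parsed = {}
    for item in arg_string.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty input or a trailing comma
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        parsed[key] = value
    return parsed


# The shell strips the quotes around "float" before the program sees the string.
args = parse_model_args(
    "pretrained=vlsp-2023-vllm/hoa-1b4,revision=step100000,dtype=float"
)
print(args)
```

Because the pairs are split on commas, a value that itself contains a comma cannot be passed this way in this simplified sketch.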
To evaluate models that are loaded via `AutoModelForSeq2SeqLM` in Hugging Face `transformers`, use `hf-seq2seq` instead. *To evaluate (causal) models across multiple GPUs, use `--model hf-causal-experimental`.*
> **Warning**: Choosing the wrong model type may produce erroneous outputs without raising an error.
Provider:
hieunguyen1053
Summary of original information
Dataset overview
Dataset information
Features
- en: string
- vi: string
- fewshot_samples: list with the following sub-features
  - en: string
  - vi: string
Splits
- val:
  - Bytes: 178154
  - Examples: 160
- test:
  - Bytes: 175339
  - Examples: 160
Size
- Download size: 124988 bytes
- Dataset size: 353493 bytes



