facebook/toolverifier
Hugging Face, updated 2024-03-14, indexed 2024-05-25
Download link:
https://hf-mirror.com/datasets/facebook/toolverifier
Resource overview:
---
dataset_info:
- config_name: train
  features:
  - name: text
    dtype: string
- config_name: tools
  features:
  - name: Name
    dtype: string
  - name: Description
    dtype: string
configs:
- config_name: train
  data_files:
  - split: train
    path: train.csv
- config_name: tools
  data_files:
  - split: tools
    path: tools.csv
---
# TOOLVERIFIER: Generalization to New Tools via Self-Verification
This repository contains the ToolSelect dataset, which was used to fine-tune Llama-2 70B for tool selection.
## Data
**ToolSelect** is synthetic training data generated for the tool selection task using Llama-2 70B and Llama-2-Chat-70B.
It consists of 555 samples corresponding to 173 tools.
Each training sample is composed of a user instruction, a candidate set of tools that includes the
ground truth tool, and a reasoning note elucidating the correct choice of tool.
For example:
```
User: 10 reviews for "Harry Potter and the Philosopher's Stone".
###
Tool Choices: Bank Account Number generator = The Bank Account Number tool generates a random bank account number for a bank.
Train Ticket Purchase = The Train Ticket Purchase tool gets train tickets for a train, provided stations, and date information.
The Cricket Score tool = The Cricket Score tool gets the score for a match that is happening at a given location and date.
Book Review = The Book Review tool gets the top-rated book reviews for a particular book.
Grocery Shopping with discounts = The Grocery Shopping tool calculates the expense of grocery shopping based on the menu.
Stock Price at location and date = The Stock Price tool gets the stock price for a company at a given location and date.
Movie Recommendation = The Movie Recommendation tool recommends movies based on the user's preferences.
News = The News tool gets the top news for a particular topic or query.
###
Thought: I recommended the Book Review tool because the user asked for reviews of a specific book, "Harry Potter and the Philosopher's Stone". The Book Review tool is designed to provide top-rated book reviews for a particular book, making it the most suitable tool for this request. It will retrieve the top 10 reviews for the book, as requested by the user.
Act: CALLTOOL["Book Review"]
```
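The layout shown above (three parts separated by `###`, tool choices written as `Name = Description`, and a final `Act: CALLTOOL["..."]` line) can be split into fields with a few lines of Python. This is only a minimal sketch that assumes every sample follows the layout of this one example; the `ToolSelectSample` class and `parse_sample` function are hypothetical names introduced for illustration and are not part of the dataset or the paper's code.

```python
import re
from dataclasses import dataclass


@dataclass
class ToolSelectSample:
    """Hypothetical container for the parts of one training sample."""
    instruction: str
    tools: dict   # tool name -> tool description
    thought: str
    action: str   # tool name found inside CALLTOOL["..."]


def parse_sample(text: str) -> ToolSelectSample:
    """Split one sample into its parts, assuming the layout of the example above."""
    parts = [p.strip() for p in text.split("###")]
    if len(parts) != 3:
        raise ValueError("expected three parts separated by '###'")
    user_part, tools_part, answer_part = parts

    # Everything after the "User:" label is the instruction.
    instruction = user_part.partition("User:")[2].strip()

    # One tool per line, written as "Name = Description".
    tools = {}
    for line in tools_part.partition("Tool Choices:")[2].splitlines():
        name, sep, description = line.partition(" = ")
        if sep:  # skip blank or malformed lines
            tools[name.strip()] = description.strip()

    thought = re.search(r"Thought:\s*(.*?)\s*Act:", answer_part, re.S)
    action = re.search(r'Act:\s*CALLTOOL\["(.+?)"\]', answer_part)
    if thought is None or action is None:
        raise ValueError("sample does not match the assumed layout")
    return ToolSelectSample(instruction, tools, thought.group(1), action.group(1))
```

A parsed sample makes it easy to check, for instance, that the tool named in `action` is one of the keys of `tools`.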
### Files
The `data/` folder has 2 files:
* `train.csv` - this file contains the training samples.
* `tools.csv` - this file contains names and descriptions of the generated synthetic tools.
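As a rough illustration of working with these files, the sketch below reads a `tools.csv`-style file into a name-to-description mapping using only the Python standard library. The column names `Name` and `Description` come from the dataset metadata; the `read_tools` helper and the one-row in-memory stand-in are assumptions made so the example runs without the real file.

```python
import csv
import io


def read_tools(csv_file) -> dict:
    """Map each tool name to its description from an open tools.csv-style file."""
    reader = csv.DictReader(csv_file)
    return {row["Name"]: row["Description"] for row in reader}


# Small in-memory stand-in for tools.csv (not the real file contents).
demo = io.StringIO(
    "Name,Description\n"
    'Book Review,"The Book Review tool gets the top-rated book reviews for a particular book."\n'
)
tools = read_tools(demo)
print(tools["Book Review"])
```

With a local copy of the real file, the same helper could be called as `read_tools(open("tools.csv", newline="", encoding="utf-8"))`, where the path is whatever location the file was downloaded to.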
For details on the data generation procedure, see Section 2.1 of our paper.
Paper: https://arxiv.org/abs/2402.14158
## Citation
```
@article{mekala2024toolverifier,
title={TOOLVERIFIER: Generalization to New Tools via Self-Verification},
author={Mekala, Dheeraj and Weston, Jason and Lanchantin, Jack and Raileanu, Roberta and Lomeli, Maria and Shang, Jingbo and Dwivedi-Yu, Jane},
journal={arXiv preprint arXiv:2402.14158},
year={2024}
}
```
## Licensing
See our LICENSE file for licensing details.
Provider:
facebook
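Summary of original information
Dataset overview
Dataset name
ToolSelect
Dataset description
ToolSelect is synthetic training data for the tool selection task, generated with Llama-2 70B and Llama-2-Chat-70B. It contains 555 samples covering 173 tools. Each training sample consists of a user instruction, a candidate set of tools that includes the correct tool, and a reasoning note explaining the correct choice.
Dataset structure
Config name: train
- Features:
text: string; contains the user instruction, the candidate tool set, and the reasoning note.
Config name: tools
- Features:
Name: string; the name of the tool.
Description: string; describes what the tool does.
Data files
train.csv: contains the training samples.
tools.csv: contains the names and descriptions of the synthetic tools.
Data example
User: 10 reviews for "Harry Potter and the Philosopher's Stone".
Tool Choices: ...
Thought: ...
Act: CALLTOOL["Book Review"]
Dataset usage
The dataset was used to fine-tune Llama-2 70B for tool selection.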



