fractalego/wafl-functions-dataset
Source: Hugging Face. Updated 2024-05-15. Indexed 2024-06-11.
Download link:
https://hf-mirror.com/datasets/fractalego/wafl-functions-dataset
Resource description:
---
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
  - split: test_with_delayed_generation
    path: data/test_with_delayed_generation-*
dataset_info:
  features:
  - name: memory
    dtype: string
  - name: rules
    dtype: string
  - name: positive_conversation
    dtype: string
  - name: negative_conversation
    dtype: string
  splits:
  - name: train
    num_bytes: 4428113
    num_examples: 981
  - name: test
    num_bytes: 40845
    num_examples: 33
  - name: test_with_delayed_generation
    num_bytes: 49924
    num_examples: 40
  download_size: 2118259
  dataset_size: 4518882
---
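As a minimal sketch of how the configuration above might be consumed, the snippet below loads one split with the Hugging Face `datasets` library and compares its row count with the `num_examples` value declared in the metadata. The helper names (`load_wafl_split`, `has_expected_size`) and the constants are illustrative and not part of the dataset; loading assumes that `datasets` is installed and that the Hub is reachable.

```python
# Row counts and byte sizes copied from the metadata above.
EXPECTED_EXAMPLES = {
    "train": 981,
    "test": 33,
    "test_with_delayed_generation": 40,
}
SPLIT_BYTES = {
    "train": 4428113,
    "test": 40845,
    "test_with_delayed_generation": 49924,
}
DATASET_SIZE = 4518882


def load_wafl_split(split: str):
    """Load one split from the Hub (hypothetical helper, needs network access)."""
    if split not in EXPECTED_EXAMPLES:
        raise ValueError(f"unknown split: {split!r}")
    # Imported lazily so the constants above are usable without the library.
    from datasets import load_dataset

    return load_dataset("fractalego/wafl-functions-dataset", split=split)


def has_expected_size(split: str, num_rows: int) -> bool:
    """Return True if a split has the row count declared in the metadata."""
    return EXPECTED_EXAMPLES[split] == num_rows


# The per-split byte counts add up to the declared dataset_size.
assert sum(SPLIT_BYTES.values()) == DATASET_SIZE
```

A typical call would be `train = load_wafl_split("train")` followed by `has_expected_size("train", len(train))`.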
# Dataset Card for "wafl-functions-dataset"
This is an instruction dataset for fine-tuning with Direct Preference Optimization (DPO).
It consists of 981 training items and two test splits: `test` (33 items) and `test_with_delayed_generation` (40 items).
Each row has four columns: `memory` (facts), `rules`, `positive_conversation` (a dialogue to prefer), and `negative_conversation` (a dialogue to discard).
These components are concatenated into a prompt with the following structure:
```text
Here is a synopsis of the bot's knowledge:
{memory}
The regulations are as follows:
{rules}
The dialogue proceeds as follows:
{conversation}
```
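The template above can be filled with ordinary string formatting. The sketch below is one way to do it; the function name `build_prompt` and the sample values are made up for illustration, and the real formatting of the `rules` and conversation strings in the dataset may differ.

```python
# Prompt template reproduced from the dataset card.
PROMPT_TEMPLATE = (
    "Here is a synopsis of the bot's knowledge:\n"
    "{memory}\n"
    "The regulations are as follows:\n"
    "{rules}\n"
    "The dialogue proceeds as follows:\n"
    "{conversation}"
)


def build_prompt(memory: str, rules: str, conversation: str) -> str:
    """Fill the three template slots with the given strings."""
    return PROMPT_TEMPLATE.format(
        memory=memory, rules=rules, conversation=conversation
    )


# Hypothetical sample values, not taken from the dataset.
prompt = build_prompt(
    memory="The bot's name is Computer.",
    rules="- The user greets the bot\n  - Greet the user back",
    conversation="user: hello\nbot: Hello!",
)
print(prompt)
```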
The *memory* field contains a collection of facts extracted from the knowledge base.
These facts are simple sentences that state, for instance, the assistant's name.
The content of the *memory* portion of the prompt resembles the retrieved context in a typical Retrieval-Augmented Generation (RAG) setup.
The *rules* field comprises a series of nested instructions for the assistant's conduct, sourced from the same knowledge base as the facts.
Lastly, *conversation* denotes a sequence of alternating turns between the assistant and the user, supplied in the dataset as both positive and negative instances (the `positive_conversation` and `negative_conversation` columns).
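
DPO trains on triples of a prompt, a preferred completion, and a rejected completion. The sketch below shows one plausible way to map a row onto such a triple; the `prompt`/`chosen`/`rejected` key names follow the convention used by preference-tuning libraries such as TRL, and the decision to treat the whole conversation as the completion is an assumption, not something the dataset card specifies. The function name `to_preference_example` is hypothetical.

```python
# Everything in the card's template up to the conversation slot.
PROMPT_PREFIX = (
    "Here is a synopsis of the bot's knowledge:\n"
    "{memory}\n"
    "The regulations are as follows:\n"
    "{rules}\n"
    "The dialogue proceeds as follows:\n"
)


def to_preference_example(row: dict) -> dict:
    """Map one dataset row to a prompt/chosen/rejected triple (assumed layout)."""
    prompt = PROMPT_PREFIX.format(memory=row["memory"], rules=row["rules"])
    return {
        "prompt": prompt,
        "chosen": row["positive_conversation"],
        "rejected": row["negative_conversation"],
    }


# Hypothetical row with placeholder strings, not taken from the dataset.
example = to_preference_example(
    {
        "memory": "The bot's name is Computer.",
        "rules": "- The user greets the bot\n  - Greet the user back",
        "positive_conversation": "user: hello\nbot: Hello!",
        "negative_conversation": "user: hello\nbot: I cannot help with that.",
    }
)
print(example["prompt"] + example["chosen"])
```

If the rows are loaded with the `datasets` library, the same function could be applied to every row with `Dataset.map`.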
## Generation of the Dataset
To ensure topical diversity, the generation of each item is conditioned on a randomly chosen excerpt from the [Ultrachat Dataset](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k).
Each candidate was reviewed manually and was corrected or completely rewritten if it did not conform to the format expected by the WAFL assistant.
## Results
These are the results obtained so far, according to the metric defined in https://github.com/fractalego/wafl_llm_eval:
| LLM Name | Precision | Recall | F1 |
|----------------------------------------|-----------|----------|----------|
| Phi-3-mini-4k-instruct (original) | 1 | 0.92 | 0.96 |
| Mistral-7B-Instruct-v0.1 (original) | 1 | 0.47 | 0.64 |
| Meta-Llama-3-8B-Instruct (original) | 1 | 0.76 | 0.87 |
| Phi-3-mini-4k-instruct (after DPO) | 1 | **0.95** | **0.97** |
| Mistral-7B-Instruct-v0.1 (after DPO)   | 0.93      | 0.73     | 0.82     |
| Meta-Llama-3-8B-Instruct (after DPO)   | 0.91      | 0.87     | 0.89     |
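
For readers who want to relate the three columns, the sketch below recomputes F1 from the precision and recall values in the table. It assumes the F1 column is the usual harmonic mean of precision and recall; the linked evaluation repository is the authority on the exact metric. Because the inputs are already rounded to two decimals, a recomputed value can differ from the table in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the results table.
RESULTS = {
    "Phi-3-mini-4k-instruct (original)": (1.0, 0.92),
    "Mistral-7B-Instruct-v0.1 (original)": (1.0, 0.47),
    "Meta-Llama-3-8B-Instruct (original)": (1.0, 0.76),
    "Phi-3-mini-4k-instruct (after DPO)": (1.0, 0.95),
    "Mistral-7B-Instruct-v0.1 (after DPO)": (0.93, 0.73),
    "Meta-Llama-3-8B-Instruct (after DPO)": (0.91, 0.87),
}

for name, (precision, recall) in RESULTS.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.3f}")
```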
Provider:
fractalego
Summary of original information
Dataset overview
Dataset name
"wafl-functions-dataset"
Dataset configuration
- Default configuration
  - Training data: path `data/train-*`
  - Test data: path `data/test-*`
  - Delayed-generation test data: path `data/test_with_delayed_generation-*`
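Dataset features
- memory: string
- rules: string
- positive_conversation: string
- negative_conversation: string
Dataset splits
- Training set
  - Size: 4428113 bytes
  - Examples: 981
- Test set
  - Size: 40845 bytes
  - Examples: 33
- Delayed-generation test set
  - Size: 49924 bytes
  - Examples: 40
Dataset size
- Download size: 2118259 bytes
- Total dataset size: 4518882 bytes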
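Dataset structure
Each record includes the following:
- memory: a collection of facts extracted from the knowledge base
- rules: nested instructions for the assistant's behavior
- conversation: a dialogue between the assistant and the user, provided as both positive and negative instances
Dataset generation
- Each item is generated from a randomly chosen excerpt of the Ultrachat Dataset, then manually reviewed and corrected for format.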
Performance evaluation
According to the wafl_llm_eval evaluation, model performance after DPO is as follows:
| LLM name | Precision | Recall | F1 score |
|---|---|---|---|
| Phi-3-mini-4k-instruct (after DPO) | 1 | 0.95 | 0.97 |
| Mistral-7B-Instruct-v0.1 (after DPO) | 0.93 | 0.73 | 0.82 |
| Meta-Llama-3-8B-Instruct (after DPO) | 0.91 | 0.87 | 0.89 |



