hanhainebula/BGE-Benchmark-Examples

Name: hanhainebula/BGE-Benchmark-Examples
Creator: hanhainebula
Published: 2024-03-19 03:02:39
License: No description available

Source: Hugging Face. Updated 2024-03-19; indexed 2024-06-22.

Download link:

https://hf-mirror.com/datasets/hanhainebula/BGE-Benchmark-Examples

Resource description:

# BGE Benchmark Examples

## Data Format

- `special_examples.jsonl`: each domain (wiki, news) for each language (en, zh) has 3 examples, 12 examples in total. Each example has the following format:

  ```python
  {
      "domain": str,
      "language": str,
      "text": str,
      "characters": List[str],
      "sampled_characters": List[{
          "character": str,
          "scenarios": List[str],
          "sampled_scenarios": List[{
              "scenario": str,
              "result": {
                  "prompt1-1": List[{
                      "query": str,
                      "diversify": {
                          "informal": str,
                          "complicated": str,
                      },
                      "hard_negative": str,
                  }],
                  "prompt2-1": ...,
                  "prompt3-1": ...,
                  "prompt4-1": ...,
                  "prompt1-2": ...,
                  "prompt2-2": ...,
                  "prompt3-2": ...,
                  "prompt4-2": ...,
              },
          }],
      }],
  }
  ```

- `wiki-news_en-zh_200.jsonl`: each domain (wiki, news) for each language (en, zh) has 50 examples, 200 examples in total. Each example has the following format:

  ```python
  {'query': str, 'positive': str, 'hard_negative': str}
  ```

## Method

LLM: `gpt-4-turbo-preview`

For each text → Generate n1 characters → Sample n2 characters

⇒ For each character → Generate n3 scenarios → Sample n4 scenarios

⇒ For each scenario → Generate n5 queries for each prompt (*8 kinds of prompts*)

⇒ For each query → Diversify it in 2 ways & Generate 1 hard negative

⇒ DONE for one example.

Some human-designed characters and scenarios are used when generating queries:

```python
HUMAN_WRITTEN_CHARACTERS = [
    {
        'character': 'Professor',
        'scenarios': [
            'Setupping questions for an upcoming quiz/examination',
            'Preparing for a lecture',
        ],
    },
    {
        'character': 'College student',
        'scenarios': [
            'Preparing for an examination',
            'Writing a research paper',
        ],
    },
    {
        'character': 'High school student',
        'scenarios': [
            'Learning new knowledge',
            'Preparing for a presentation',
        ],
    }
]
```

For examples in `special_examples.jsonl`, we set n1 = 10 + 3, n2 = 2, n3 = 10 ( + 2), n4 = 2, n5 = 2.

For examples in `wiki-news_en-zh_200.jsonl`, we set n1 = 10 + 3, n2 = 1, n3 = 10 ( + 2), n4 = 1, n5 = 3. To generate examples faster, we randomly choose one prompt from the 8 kinds of prompts, instead of generating n5 queries for each prompt (*8 kinds of prompts*).

Provider:

hanhainebula

Summary of original information

BGE Benchmark Examples dataset overview

Data format

`special_examples.jsonl`

Each domain (wiki, news) and each language (en, zh) has 3 examples, 12 examples in total.
Each example has the following format:

```python
{
    "domain": str,
    "language": str,
    "text": str,
    "characters": List[str],
    "sampled_characters": List[{
        "character": str,
        "scenarios": List[str],
        "sampled_scenarios": List[{
            "scenario": str,
            "result": {
                "prompt1-1": List[{
                    "query": str,
                    "diversify": {
                        "informal": str,
                        "complicated": str,
                    },
                    "hard_negative": str,
                }],
                "prompt2-1": ...,
                "prompt3-1": ...,
                "prompt4-1": ...,
                "prompt1-2": ...,
                "prompt2-2": ...,
                "prompt3-2": ...,
                "prompt4-2": ...,
            },
        }],
    }],
}
```
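Because the records are deeply nested, a flattened view is often easier to inspect. The following is a minimal sketch, assuming the elided `"prompt2-1": ...` entries share the list structure shown for `"prompt1-1"`. The helper name `iter_queries` and the toy record are hypothetical and only illustrate the documented schema; they are not taken from the dataset.

```python
from typing import Any, Dict, Iterator


def iter_queries(example: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per generated query in a special_examples record."""
    for char in example["sampled_characters"]:
        for scen in char["sampled_scenarios"]:
            # Assumption: every prompt key maps to a list of query items.
            for prompt_name, items in scen["result"].items():
                for item in items:
                    yield {
                        "domain": example["domain"],
                        "language": example["language"],
                        "character": char["character"],
                        "scenario": scen["scenario"],
                        "prompt": prompt_name,
                        "query": item["query"],
                        "informal": item["diversify"]["informal"],
                        "complicated": item["diversify"]["complicated"],
                        "hard_negative": item["hard_negative"],
                    }


# Made-up record that follows the documented schema (placeholder values only).
toy_example = {
    "domain": "wiki",
    "language": "en",
    "text": "placeholder passage",
    "characters": ["Professor"],
    "sampled_characters": [{
        "character": "Professor",
        "scenarios": ["Preparing for a lecture"],
        "sampled_scenarios": [{
            "scenario": "Preparing for a lecture",
            "result": {
                "prompt1-1": [{
                    "query": "placeholder query",
                    "diversify": {
                        "informal": "placeholder informal query",
                        "complicated": "placeholder complicated query",
                    },
                    "hard_negative": "placeholder hard negative",
                }],
            },
        }],
    }],
}

rows = list(iter_queries(toy_example))
print(len(rows), rows[0]["prompt"])
```

Each row keeps the character, scenario, and prompt name next to the query, so the rows can be grouped or filtered without walking the nesting again.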

`wiki-news_en-zh_200.jsonl`

Each domain (wiki, news) and each language (en, zh) has 50 examples, 200 examples in total.
Each example has the following format:

```python
{'query': str, 'positive': str, 'hard_negative': str}
```
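A short sketch of reading this triplet format with the standard library is shown here. The function name `read_triplets` is hypothetical, and the in-memory sample stands in for the real file so the snippet runs on its own.

```python
import io
import json
from typing import Dict, Iterable, List

REQUIRED_KEYS = {"query", "positive", "hard_negative"}


def read_triplets(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse JSON Lines text and keep only the three documented fields."""
    triplets = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = REQUIRED_KEYS - set(record)
        if missing:
            raise ValueError(f"line {line_no}: missing keys {sorted(missing)}")
        triplets.append({key: record[key] for key in sorted(REQUIRED_KEYS)})
    return triplets


# Placeholder content in the documented shape (not real dataset rows).
sample = io.StringIO(
    '{"query": "q1", "positive": "p1", "hard_negative": "n1"}\n'
    "\n"
    '{"query": "q2", "positive": "p2", "hard_negative": "n2"}\n'
)
triplets = read_triplets(sample)
print(len(triplets))
```

With a local copy of the file, the same function can be called as `read_triplets(open("wiki-news_en-zh_200.jsonl", encoding="utf-8"))`.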

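Generation method

The `gpt-4-turbo-preview` model is used.
The generation process consists of:
1. Generate n1 characters for each text.
2. Sample n2 characters from the n1 characters.
3. Generate n3 scenarios for each character.
4. Sample n4 scenarios from the n3 scenarios.
5. Generate n5 queries for each scenario, for each of the 8 kinds of prompts.
6. Diversify each query in two ways and generate 1 hard negative for it.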
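The six steps above can be sketched as nested loops. This is an illustration only, not the authors' code: `run_pipeline`, the `generate` callback, and `fake_generate` are hypothetical names, and the LLM call is replaced by a stub that returns placeholder strings. The human-written characters and scenarios are left out of this sketch.

```python
import random
from typing import Any, Callable, Dict, List

# Prompt keys in the order they appear in the documented schema.
PROMPT_NAMES = [f"prompt{i}-{j}" for j in (1, 2) for i in (1, 2, 3, 4)]

Generate = Callable[[str, str, int], List[str]]


def run_pipeline(text: str, generate: Generate, rng: random.Random,
                 n1: int, n2: int, n3: int, n4: int, n5: int) -> Dict[str, Any]:
    characters = generate("character", text, n1)              # step 1
    sampled_characters = []
    for character in rng.sample(characters, n2):              # step 2
        scenarios = generate("scenario", character, n3)       # step 3
        sampled_scenarios = []
        for scenario in rng.sample(scenarios, n4):            # step 4
            result = {}
            for prompt in PROMPT_NAMES:                        # step 5
                result[prompt] = [
                    {
                        "query": query,
                        "diversify": {                         # step 6
                            "informal": generate("informal", query, 1)[0],
                            "complicated": generate("complicated", query, 1)[0],
                        },
                        "hard_negative": generate("hard_negative", query, 1)[0],
                    }
                    for query in generate(prompt, scenario, n5)
                ]
            sampled_scenarios.append({"scenario": scenario, "result": result})
        sampled_characters.append({
            "character": character,
            "scenarios": scenarios,
            "sampled_scenarios": sampled_scenarios,
        })
    return {"text": text, "characters": characters,
            "sampled_characters": sampled_characters}


def fake_generate(kind: str, source: str, n: int) -> List[str]:
    """Stub standing in for an LLM call; returns n placeholder strings."""
    return [f"{kind}[{i}] for <{source[:20]}>" for i in range(n)]


example = run_pipeline("Some passage.", fake_generate, random.Random(0),
                       n1=10, n2=2, n3=10, n4=2, n5=2)
print(len(example["sampled_characters"]))
```

Passing the generator as a callback keeps the control flow testable without network access; a real run would swap `fake_generate` for a function that prompts the model.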

Human-designed characters and scenarios

```python
HUMAN_WRITTEN_CHARACTERS = [
    {
        'character': 'Professor',
        'scenarios': [
            'Setupping questions for an upcoming quiz/examination',
            'Preparing for a lecture',
        ],
    },
    {
        'character': 'College student',
        'scenarios': [
            'Preparing for an examination',
            'Writing a research paper',
        ],
    },
    {
        'character': 'High school student',
        'scenarios': [
            'Learning new knowledge',
            'Preparing for a presentation',
        ],
    }
]
```
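One plausible reading of the settings `n1 = 10 + 3` and `n3 = 10 ( + 2)` is that the three human-written characters are appended to ten generated ones, and that the two human-written scenarios are appended only when the character is one of those three. The source does not spell this out, so the sketch below is an assumption, and the helper names `build_character_pool` and `build_scenario_pool` are hypothetical.

```python
from typing import List

HUMAN_WRITTEN_CHARACTERS = [
    {
        "character": "Professor",
        "scenarios": [
            "Setupping questions for an upcoming quiz/examination",
            "Preparing for a lecture",
        ],
    },
    {
        "character": "College student",
        "scenarios": [
            "Preparing for an examination",
            "Writing a research paper",
        ],
    },
    {
        "character": "High school student",
        "scenarios": [
            "Learning new knowledge",
            "Preparing for a presentation",
        ],
    },
]

# Lookup from human-written character name to its scenarios.
HUMAN_SCENARIOS = {c["character"]: c["scenarios"] for c in HUMAN_WRITTEN_CHARACTERS}


def build_character_pool(generated: List[str]) -> List[str]:
    """Assumed behaviour: generated characters plus the 3 human-written ones."""
    return generated + list(HUMAN_SCENARIOS)


def build_scenario_pool(character: str, generated: List[str]) -> List[str]:
    """Assumed behaviour: add human-written scenarios only for human-written characters."""
    return generated + HUMAN_SCENARIOS.get(character, [])


generated_characters = [f"generated character {i}" for i in range(10)]
generated_scenarios = [f"generated scenario {i}" for i in range(10)]

character_pool = build_character_pool(generated_characters)
professor_pool = build_scenario_pool("Professor", generated_scenarios)
other_pool = build_scenario_pool("generated character 0", generated_scenarios)
print(len(character_pool), len(professor_pool), len(other_pool))
```

Under this reading the pools have sizes 13, 12, and 10, which lines up with the "10 + 3" and "10 ( + 2)" notation.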

Parameter settings

For special_examples.jsonl:
- n1 = 10 + 3
- n2 = 2
- n3 = 10 ( + 2)
- n4 = 2
- n5 = 2
For wiki-news_en-zh_200.jsonl:
- n1 = 10 + 3
- n2 = 1
- n3 = 10 ( + 2)
- n4 = 1
- n5 = 3
- To generate examples faster, one of the 8 kinds of prompts is chosen at random, instead of generating n5 queries for each prompt.
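As a quick check on what these settings imply, the number of base queries produced from one source text is the product of the sampled characters, sampled scenarios, prompts used, and queries per prompt. The function `queries_per_text` is a hypothetical helper, and the count follows from the description rather than from inspecting the files.

```python
def queries_per_text(n2: int, n4: int, n5: int, prompts_used: int) -> int:
    """Base queries per text: sampled characters x sampled scenarios x prompts x n5."""
    return n2 * n4 * prompts_used * n5


# special_examples.jsonl: all 8 prompts are used for every scenario.
special = queries_per_text(n2=2, n4=2, n5=2, prompts_used=8)

# wiki-news_en-zh_200.jsonl: one randomly chosen prompt per scenario.
wiki_news = queries_per_text(n2=1, n4=1, n5=3, prompts_used=1)

print(special, wiki_news)  # prints "64 3"
```

Each base query additionally comes with two diversified variants and one hard negative. How the queries generated per text map onto the 50 examples per domain and language is not stated in the description.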
