
hanhainebula/BGE-Benchmark-Examples

Source: Hugging Face · Updated 2024-03-19 · Indexed 2024-06-22
Download link:
https://hf-mirror.com/datasets/hanhainebula/BGE-Benchmark-Examples
Resource description:
# BGE Benchmark Examples

## Data Format

- `special_examples.jsonl`: each domain (wiki, news) for each language (en, zh) has 3 examples, 12 examples in total. Each example has the following format:

```python
{
    "domain": str,
    "language": str,
    "text": str,
    "characters": List[str],
    "sampled_characters": List[{
        "character": str,
        "scenarios": List[str],
        "sampled_scenarios": List[{
            "scenario": str,
            "result": {
                "prompt1-1": List[{
                    "query": str,
                    "diversify": {
                        "informal": str,
                        "complicated": str,
                    },
                    "hard_negative": str,
                }],
                "prompt2-1": ...,
                "prompt3-1": ...,
                "prompt4-1": ...,
                "prompt1-2": ...,
                "prompt2-2": ...,
                "prompt3-2": ...,
                "prompt4-2": ...,
            },
        }],
    }],
}
```

- `wiki-news_en-zh_200.jsonl`: each domain (wiki, news) for each language (en, zh) has 50 examples, 200 examples in total. Each example has the following format:

```python
{'query': str, 'positive': str, 'hard_negative': str}
```

## Method

LLM: `gpt-4-turbo-preview`

For each text
→ Generate n1 characters → Sample n2 characters
⇒ For each character
→ Generate n3 scenarios → Sample n4 scenarios
⇒ For each scenario
→ Generate n5 queries for each prompt (*8 kinds of prompts*)
⇒ For each query
→ Diversify it in 2 methods & Generate 1 hard negative
⇒ DONE for one example.

There are some human-designed characters and scenarios used when generating queries:

```python
HUMAN_WRITTEN_CHARACTERS = [
    {
        'character': 'Professor',
        'scenarios': [
            'Setupping questions for an upcoming quiz/examination',
            'Preparing for a lecture',
        ],
    },
    {
        'character': 'College student',
        'scenarios': [
            'Preparing for an examination',
            'Writing a research paper',
        ],
    },
    {
        'character': 'High school student',
        'scenarios': [
            'Learning new knowledge',
            'Preparing for a presentation',
        ],
    }
]
```

For examples in `special_examples.jsonl`, we set n1 = 10 + 3, n2 = 2, n3 = 10 ( + 2), n4 = 2, n5 = 2.

For examples in `wiki-news_en-zh_200.jsonl`, we set n1 = 10 + 3, n2 = 1, n3 = 10 ( + 2), n4 = 1, n5 = 3.
To generate these examples faster, we randomly choose one of the 8 kinds of prompts instead of generating n5 queries for each of them.
Provider:
hanhainebula
Summary of the original information

BGE Benchmark Examples: dataset overview

Data format

special_examples.jsonl

  • Each domain (wiki, news) and each language (en, zh) has 3 examples, 12 examples in total.
  • Each example has the following format (Python-style schema):

        {
            "domain": str,
            "language": str,
            "text": str,
            "characters": List[str],
            "sampled_characters": List[{
                "character": str,
                "scenarios": List[str],
                "sampled_scenarios": List[{
                    "scenario": str,
                    "result": {
                        "prompt1-1": List[{
                            "query": str,
                            "diversify": {
                                "informal": str,
                                "complicated": str,
                            },
                            "hard_negative": str,
                        }],
                        "prompt2-1": ...,
                        "prompt3-1": ...,
                        "prompt4-1": ...,
                        "prompt1-2": ...,
                        "prompt2-2": ...,
                        "prompt3-2": ...,
                        "prompt4-2": ...,
                    },
                }],
            }],
        }
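
Because each `special_examples.jsonl` record nests queries four levels deep (character, scenario, prompt, query), it can help to flatten a record into one row per query before inspecting it. The following is a minimal sketch that assumes only the schema shown above; the helper name `iter_queries` and the toy record are illustrative and are not part of the dataset.

```python
from typing import Any, Dict, Iterator


def iter_queries(example: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per generated query in a special_examples record."""
    for char in example["sampled_characters"]:
        for scen in char["sampled_scenarios"]:
            # "result" maps a prompt name (e.g. "prompt1-1") to a list of queries.
            for prompt_name, items in scen["result"].items():
                for item in items:
                    yield {
                        "domain": example["domain"],
                        "language": example["language"],
                        "character": char["character"],
                        "scenario": scen["scenario"],
                        "prompt": prompt_name,
                        "query": item["query"],
                        "informal": item["diversify"]["informal"],
                        "complicated": item["diversify"]["complicated"],
                        "hard_negative": item["hard_negative"],
                    }


# Synthetic record that follows the schema (NOT taken from the dataset).
toy_example = {
    "domain": "wiki",
    "language": "en",
    "text": "A short passage.",
    "characters": ["Professor", "Journalist"],
    "sampled_characters": [
        {
            "character": "Professor",
            "scenarios": ["Preparing for a lecture"],
            "sampled_scenarios": [
                {
                    "scenario": "Preparing for a lecture",
                    "result": {
                        "prompt1-1": [
                            {
                                "query": "q-a",
                                "diversify": {"informal": "q-a (informal)", "complicated": "q-a (complicated)"},
                                "hard_negative": "neg-a",
                            }
                        ],
                        "prompt2-1": [
                            {
                                "query": "q-b",
                                "diversify": {"informal": "q-b (informal)", "complicated": "q-b (complicated)"},
                                "hard_negative": "neg-b",
                            }
                        ],
                    },
                }
            ],
        }
    ],
}

rows = list(iter_queries(toy_example))
```

Each row then carries its full provenance (character, scenario, prompt), so the rows can be grouped or filtered without walking the nested structure again.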

wiki-news_en-zh_200.jsonl

  • Each domain (wiki, news) and each language (en, zh) has 50 examples, 200 examples in total.
  • Each example has the following format: {'query': str, 'positive': str, 'hard_negative': str}
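
As a rough illustration of how the flat (query, positive, hard_negative) records could be read and checked, the sketch below parses JSON Lines text and verifies that the three keys are present. It assumes the file has been downloaded locally; the function name `read_triples` and the in-memory sample are hypothetical stand-ins, not dataset content.

```python
import io
import json
from typing import Dict, Iterable, List

REQUIRED_KEYS = {"query", "positive", "hard_negative"}


def read_triples(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse JSON Lines text and check that every record has the three keys."""
    triples = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = REQUIRED_KEYS - set(record)
        if missing:
            raise ValueError(f"line {line_no}: missing keys {sorted(missing)}")
        triples.append(record)
    return triples


# Synthetic stand-in for the file. With a local copy, one would instead pass
# open("wiki-news_en-zh_200.jsonl", encoding="utf-8").
sample = io.StringIO(
    '{"query": "q1", "positive": "p1", "hard_negative": "n1"}\n'
    "\n"
    '{"query": "q2", "positive": "p2", "hard_negative": "n2"}\n'
)
triples = read_triples(sample)
```

Failing early on a missing key keeps a malformed line from silently propagating into training or evaluation code.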

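Generation method

  • Uses the gpt-4-turbo-preview model.
  • The generation process consists of:
    1. For each text, generate n1 characters.
    2. Sample n2 characters from the n1 characters.
    3. For each character, generate n3 scenarios.
    4. Sample n4 scenarios from the n3 scenarios.
    5. For each scenario, generate n5 queries for each of the 8 kinds of prompts.
    6. For each query, diversify it in two ways and generate 1 hard negative.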
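
The six steps above can be sketched as a nested loop. This is only an outline under stated assumptions: the `generate(kind, context, n)` callable is a hypothetical stand-in for the actual LLM calls, which the source does not show, and `fake_generate` returns placeholder strings so the sketch runs without any API. The prompt names are the eight keys listed in the `special_examples.jsonl` schema, and the output omits the `domain` and `language` fields.

```python
import random
from typing import Any, Callable, Dict, List

# The 8 prompt names, in the order they appear in the special_examples schema.
PROMPTS = [f"prompt{i}-{j}" for j in (1, 2) for i in (1, 2, 3, 4)]

# Hypothetical interface: generate(kind, context, n) returns n strings.
Generate = Callable[[str, str, int], List[str]]


def run_pipeline(text: str, generate: Generate, n1: int, n2: int, n3: int,
                 n4: int, n5: int, rng: random.Random) -> Dict[str, Any]:
    characters = generate("character", text, n1)              # step 1
    sampled_characters = []
    for character in rng.sample(characters, n2):              # step 2
        scenarios = generate("scenario", character, n3)       # step 3
        sampled_scenarios = []
        for scenario in rng.sample(scenarios, n4):            # step 4
            result = {}
            for prompt in PROMPTS:                            # step 5
                result[prompt] = [
                    {
                        "query": query,
                        "diversify": {                        # step 6
                            "informal": generate("informal", query, 1)[0],
                            "complicated": generate("complicated", query, 1)[0],
                        },
                        "hard_negative": generate("hard_negative", query, 1)[0],
                    }
                    for query in generate(prompt, scenario, n5)
                ]
            sampled_scenarios.append({"scenario": scenario, "result": result})
        sampled_characters.append({
            "character": character,
            "scenarios": scenarios,
            "sampled_scenarios": sampled_scenarios,
        })
    return {"text": text, "characters": characters,
            "sampled_characters": sampled_characters}


def fake_generate(kind: str, context: str, n: int) -> List[str]:
    """Placeholder generator; a real run would call the LLM here."""
    return [f"{kind}-{i}" for i in range(n)]


example = run_pipeline("some passage", fake_generate,
                       n1=10, n2=2, n3=10, n4=2, n5=2, rng=random.Random(0))
```

Passing the generator and the random source in as arguments keeps the control flow testable without network access and makes the sampling reproducible.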

Human-designed characters and scenarios

    HUMAN_WRITTEN_CHARACTERS = [
        {
            'character': 'Professor',
            'scenarios': [
                'Setupping questions for an upcoming quiz/examination',
                'Preparing for a lecture',
            ],
        },
        {
            'character': 'College student',
            'scenarios': [
                'Preparing for an examination',
                'Writing a research paper',
            ],
        },
        {
            'character': 'High school student',
            'scenarios': [
                'Learning new knowledge',
                'Preparing for a presentation',
            ],
        }
    ]
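
The source does not spell out how these human-written entries are combined with the LLM output. One plausible reading of the settings "n1 = 10 + 3" and "n3 = 10 ( + 2)" is that the 3 human-written characters are appended to 10 generated ones, and that the 2 human-written scenarios are added only when the character is one of those three. The sketch below illustrates that reading; it is an assumption, and the helper names `candidate_characters` and `candidate_scenarios` are hypothetical.

```python
from typing import List

HUMAN_WRITTEN_CHARACTERS = [
    {
        'character': 'Professor',
        'scenarios': [
            'Setupping questions for an upcoming quiz/examination',
            'Preparing for a lecture',
        ],
    },
    {
        'character': 'College student',
        'scenarios': [
            'Preparing for an examination',
            'Writing a research paper',
        ],
    },
    {
        'character': 'High school student',
        'scenarios': [
            'Learning new knowledge',
            'Preparing for a presentation',
        ],
    },
]


def candidate_characters(generated: List[str]) -> List[str]:
    """Assumed reading of "10 + 3": generated characters plus the human-written ones."""
    return generated + [entry['character'] for entry in HUMAN_WRITTEN_CHARACTERS]


def candidate_scenarios(character: str, generated: List[str]) -> List[str]:
    """Assumed reading of "10 ( + 2)": extra scenarios only for human-written characters."""
    for entry in HUMAN_WRITTEN_CHARACTERS:
        if entry['character'] == character:
            return generated + entry['scenarios']
    return generated


generated_characters = [f"generated character {i}" for i in range(10)]
generated_scenarios = [f"generated scenario {i}" for i in range(10)]

pool = candidate_characters(generated_characters)
professor_scenarios = candidate_scenarios('Professor', generated_scenarios)
other_scenarios = candidate_scenarios('generated character 0', generated_scenarios)
```

Under this reading, the parenthesized "( + 2)" reflects that the extra scenarios apply only to some characters, whereas the "+ 3" applies to every text.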

Parameter settings

  • For special_examples.jsonl

    • n1 = 10 + 3
    • n2 = 2
    • n3 = 10 ( + 2)
    • n4 = 2
    • n5 = 2
  • For wiki-news_en-zh_200.jsonl

    • n1 = 10 + 3
    • n2 = 1
    • n3 = 10 ( + 2)
    • n4 = 1
    • n5 = 3
    • To generate examples faster, one of the 8 kinds of prompts is chosen at random instead of generating n5 queries for each prompt.
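
These settings determine how many queries a single source text can yield. Only the sampled characters and scenarios are expanded, so the count depends on n2, n4, n5 and the number of prompts used, not on n1 or n3. The short calculation below follows from the procedure as described and assumes the LLM returns exactly n5 queries per prompt; the function name is illustrative.

```python
def queries_per_text(n2: int, n4: int, n5: int, num_prompts: int) -> int:
    """Sampled characters x sampled scenarios x prompts x queries per prompt."""
    return n2 * n4 * num_prompts * n5


# special_examples.jsonl: all 8 kinds of prompts are used.
special = queries_per_text(n2=2, n4=2, n5=2, num_prompts=8)

# wiki-news_en-zh_200.jsonl: one randomly chosen prompt.
wiki_news = queries_per_text(n2=1, n4=1, n5=3, num_prompts=1)

# Each query also gets 2 diversified variants and 1 hard negative.
special_extra_strings = special * 3

print(special, wiki_news, special_extra_strings)  # 64 3 192
```

So, under these assumptions, a `special_examples.jsonl` text expands to up to 64 queries, while a text processed with the `wiki-news_en-zh_200.jsonl` settings yields up to 3.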