revflask/blockchain-benchmark

Name: revflask/blockchain-benchmark
Creator: revflask
Published: 2024-05-16 19:17:56
License: No description provided

Hugging Face | Updated 2024-05-16 | Indexed 2024-06-12

Download link:

https://hf-mirror.com/datasets/revflask/blockchain-benchmark

Resource description:

---
license: mit
annotations_creators:
- no-annotation
language_creators:
- expert-generated
task_categories:
- question-answering
task_ids:
- multiple-choice-qa
language:
- en
tags:
- blockchain
- code
- benchmark
pretty_name: LLM Blockchain Benchmark
size_categories:
- 10K<n<100K
dataset_info:
- config_name: math
  features:
  - name: question
    dtype: string
  - name: choices
    sequence: string
  - name: answer
    dtype:
      class_label:
        names:
          '0': A
          '1': B
          '2': C
          '3': D
- config_name: general-reasoning
  features:
  - name: question
    dtype: string
  - name: choices
    sequence: string
  - name: answer
    dtype:
      class_label:
        names:
          '0': A
          '1': B
          '2': C
          '3': D
- config_name: code
  features:
  - name: question
    dtype: string
  - name: choices
    sequence: string
  - name: answer
    dtype:
      class_label:
        names:
          '0': A
          '1': B
          '2': C
          '3': D
configs:
- config_name: math
  data_files:
  - split: test
    path: Math*
- config_name: general-reasoning
  data_files:
  - split: test
    path: General*
- config_name: code
  data_files:
  - split: test
    path: Coding*
---

# Dataset Card for LLM Blockchain Benchmark

## Table of Contents

- [Table of Contents](#table-of-contents)
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)

## Dataset Description

- **Repository**: [github repo]

### Dataset Summary

The Blockchain Benchmark Dataset is a collection of multiple-choice questions curated for benchmarking Language Models (LMs) in the domain of blockchain technology. It is designed to support research and development in natural language understanding within that domain.

A complete list of tasks: ['general-reasoning', 'code', 'math']

### Supported Tasks and Leaderboards

| Model | Authors | Humanities | Social Science | STEM | Other | Average |
|------------------------------------|----------|:-------:|:-------:|:-------:|:-------:|:-------:|

[add tested models here]

### Languages

English

## Dataset Structure

### Data Instances

An example from the `code` subtask looks as follows:

```
{
  "question": "The defining idea of Uniswap v3 token is",
  "choices": ["Concentrated Liquidity", "Diluted Liquidity", "Concentrated Programming", "Optimized price ranges"],
  "answer": "A"
}
```

### Data Fields

- `question`: a string feature
- `choices`: a list of 4 string features
- `answer`: a ClassLabel feature

### Data Splits

- `test`: all data is placed in the test split, for benchmarking

## Dataset Creation

### Curation Rationale

This dataset addresses the scarcity of benchmarks designed specifically for Language Models (LMs) in blockchain technology. As the intersection of blockchain and LM technologies gains traction, a focused dataset becomes necessary for measuring progress and for research in this area.

### Source Data

#### Initial Data Collection and Normalization

[More Information Needed]

#### Who are the source language producers?

[More Information Needed]

### Annotations

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

### Personal and Sensitive Information

[More Information Needed]

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed]

### Discussion of Biases

[More Information Needed]

### Other Known Limitations

[More Information Needed]

## Additional Information

### Dataset Curators

[More Information Needed]

### Licensing Information

MIT License

### Citation Information

If you find this useful in your research, please consider supporting Tensorplex [link]

### Contributions

Thanks to Tensorplex for adding this dataset.
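The record layout described under Data Fields (`question`, `choices`, `answer`) can be exercised with a short sketch. The `format_prompt` helper and its prompt wording are illustrative assumptions, not part of the dataset. `load_config` assumes the Hugging Face `datasets` package is installed and that the repository id and config names match the card's metadata; it is defined but not called here, so the rest of the block runs offline.

```python
from typing import Sequence

LETTERS = ("A", "B", "C", "D")  # label names declared in the card's class_label


def load_config(config_name: str = "code"):
    """Load one config's test split (assumes the `datasets` package is installed)."""
    from datasets import load_dataset  # lazy import: the rest of the sketch works without it

    return load_dataset("revflask/blockchain-benchmark", config_name, split="test")


def format_prompt(question: str, choices: Sequence[str]) -> str:
    """Render a question and its four choices as a lettered prompt (hypothetical layout)."""
    if len(choices) != len(LETTERS):
        raise ValueError(f"expected {len(LETTERS)} choices, got {len(choices)}")
    lines = [question]
    lines += [f"{letter}. {choice}" for letter, choice in zip(LETTERS, choices)]
    lines.append("Answer:")
    return "\n".join(lines)


# The example record shown in the card's Data Instances section.
example = {
    "question": "The defining idea of Uniswap v3 token is",
    "choices": [
        "Concentrated Liquidity",
        "Diluted Liquidity",
        "Concentrated Programming",
        "Optimized price ranges",
    ],
    "answer": "A",
}

prompt = format_prompt(example["question"], example["choices"])
print(prompt)
```

Keeping the prompt layout in one function makes it easy to swap in a different template when comparing models.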

Provided by:

revflask

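Summary of original information

Dataset overview

Dataset name

LLM Blockchain Benchmark

Dataset size

10K<n<100K

Dataset language

English (en)

Dataset tags

Blockchain
Code
Benchmark

Task categories

Question answering
Multiple-choice QA

Dataset features

Question (question): string
Choices (choices): sequence of strings
Answer (answer): class label, one of A, B, C, D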
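Because the answer is a class label, a loader may return it as an integer index (0 to 3) rather than a letter. The sketch below normalizes both forms and computes accuracy. The names `answer_letter` and `accuracy` are hypothetical helpers, and the sample predictions are made up for illustration.

```python
LABEL_NAMES = ["A", "B", "C", "D"]  # from the card metadata: '0': A, '1': B, '2': C, '3': D


def answer_letter(answer) -> str:
    """Return the answer as a letter, accepting either a class index or a letter."""
    if isinstance(answer, int):
        if not 0 <= answer < len(LABEL_NAMES):
            raise ValueError(f"class index out of range: {answer}")
        return LABEL_NAMES[answer]
    letter = str(answer).strip().upper()
    if letter not in LABEL_NAMES:
        raise ValueError(f"unknown answer label: {answer!r}")
    return letter


def accuracy(predictions, answers) -> float:
    """Fraction of predictions that match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    correct = sum(answer_letter(p) == answer_letter(a) for p, a in zip(predictions, answers))
    return correct / len(answers)


# Made-up predictions and references mixing letters and class indices.
score = accuracy(["A", "c", 1, "D"], [0, 2, 3, "D"])
print(score)
```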

Dataset configurations

Math (math)
- Test data path: Math*
General reasoning (general-reasoning)
- Test data path: General*
Code (code)
- Test data path: Coding*
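Each configuration selects its test files with a glob pattern. The sketch below approximates that matching with Python's `fnmatch`. The file names are hypothetical, since the listing does not show the repository contents, and the hosting platform's own glob rules may differ in detail.

```python
from fnmatch import fnmatchcase

# Config name -> test-split path pattern, as listed in the configuration section.
CONFIG_PATTERNS = {
    "math": "Math*",
    "general-reasoning": "General*",
    "code": "Coding*",
}


def config_for_file(filename: str):
    """Return the config whose pattern matches the file name, or None if none does."""
    for config, pattern in CONFIG_PATTERNS.items():
        if fnmatchcase(filename, pattern):  # case-sensitive on every platform
            return config
    return None


# Hypothetical file names, used only to illustrate the matching.
files = ["Math.csv", "General_Reasoning.csv", "Coding.csv", "README.md"]
assignment = {name: config_for_file(name) for name in files}
print(assignment)
```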

License

MIT License
