
Jcrandall541/ethereum-arbitrage

Source: Hugging Face. Updated 2025-11-23; indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/Jcrandall541/ethereum-arbitrage
Description:
---
license: mit
task_categories:
- text-generation
- question-answering
language:
- en
tags:
- crypto
- defi
- ethereum
- solidity
- trading
- documentation
- code
size_categories:
- 100K<n<1M
---

# Crypto & DeFi Documentation Dataset

A comprehensive dataset of cryptocurrency, DeFi, and blockchain documentation and code suitable for LLM training.

## Dataset Description

This dataset contains scraped and processed documentation from various crypto/DeFi sources including:

- **Rust Ethereum libraries** (ethers-rs, etc.)
- **Solidity documentation** (official Solidity language docs)
- **Smart contracts** (Uniswap, Aave, Balancer, SushiSwap, etc.)
- **Trading bots** (MEV, flashloans, arbitrage)
- **Protocol documentation** (Tenderly, Alchemy, etc.)

## Dataset Statistics

- **Total Records**: 794,655
- **Estimated Tokens**: 75,890,740
- **Created**: 2025-11-23T02:36:21.391841

### By Category

| Category | Count |
|----------|-------|
| code | 9,153 |
| data | 885 |
| documentation | 698,443 |
| infrastructure | 5,954 |
| smart_contract | 76,307 |
| trading_bot | 3,913 |

### By Language

| Language | Count |
|----------|-------|
| rust | 483,803 |
| unknown | 177,872 |
| javascript | 71,476 |
| solidity | 47,370 |
| typescript | 9,912 |
| python | 2,871 |
| markdown | 1,235 |
| toml | 76 |
| console | 22 |
| ts14 | 5 |
| json | 3 |
| b | 3 |
| md | 1 |
| ts90 | 1 |
| ts304 | 1 |

## Data Format

Each record is a JSON object with the following fields:

```json
{
  "id": "unique_hash_id",
  "source": "https://github.com/...",
  "file": "original_filename.sol",
  "chunk_id": 0,
  "category": "smart_contract",
  "language": "solidity",
  "content": "// SPDX-License-Identifier...",
  "token_estimate": 150
}
```

## Usage

```python
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("your-username/crypto-defi-docs", split="train")

# Filter by category
contracts = dataset.filter(lambda x: x['category'] == 'smart_contract')

# Filter by language
solidity = dataset.filter(lambda x: x['language'] == 'solidity')
```

## Sources

- docs.rs (Rust crate documentation)
- docs.soliditylang.org (Solidity official docs)
- GitHub repositories (Uniswap, Flashbots, etc.)
- Protocol documentation (Tenderly, Alchemy, Balancer, etc.)

## License

This dataset is provided for educational and research purposes. Individual components may have their own licenses. Please check the original sources for licensing information.
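
## Example: summarizing records offline

The record format and the category/language filters described above can be tried without downloading anything. The sketch below is not part of the dataset's tooling: `SAMPLE_RECORDS`, `normalize_language`, `summarize`, and `select` are hypothetical names, and the sample records are invented to match the documented schema. The label mapping (`md` to `markdown`, `ts` followed by digits to `typescript`) is an assumption about the noisy labels in the language table; the source does not say where those labels come from.

```python
import re
from collections import Counter

# Hypothetical sample records that follow the documented schema.
SAMPLE_RECORDS = [
    {"id": "a1", "source": "https://example.com/repo-a", "file": "Pool.sol",
     "chunk_id": 0, "category": "smart_contract", "language": "solidity",
     "content": "// SPDX-License-Identifier: MIT", "token_estimate": 150},
    {"id": "a2", "source": "https://example.com/repo-a", "file": "Pool.sol",
     "chunk_id": 1, "category": "smart_contract", "language": "solidity",
     "content": "contract Pool {}", "token_estimate": 90},
    {"id": "b1", "source": "https://example.com/docs", "file": "intro.md",
     "chunk_id": 0, "category": "documentation", "language": "md",
     "content": "# Introduction", "token_estimate": 40},
    {"id": "c1", "source": "https://example.com/repo-b", "file": "bot.ts",
     "chunk_id": 0, "category": "code", "language": "ts14",
     "content": "export const x = 1;", "token_estimate": 60},
]


def normalize_language(label):
    """Map a noisy language label to a canonical name (assumed mapping)."""
    label = label.strip().lower()
    if label == "md":
        return "markdown"
    # Assumption: labels such as "ts14" or "ts90" are mis-parsed TypeScript tags.
    if re.fullmatch(r"ts\d+", label):
        return "typescript"
    return label


def summarize(records):
    """Return (counts by category, counts by normalized language, total tokens)."""
    by_category = Counter(r["category"] for r in records)
    by_language = Counter(normalize_language(r["language"]) for r in records)
    total_tokens = sum(r["token_estimate"] for r in records)
    return by_category, by_language, total_tokens


def select(records, category=None, language=None):
    """Keep records matching the given category and/or normalized language."""
    return [
        r for r in records
        if (category is None or r["category"] == category)
        and (language is None or normalize_language(r["language"]) == language)
    ]


if __name__ == "__main__":
    categories, languages, tokens = summarize(SAMPLE_RECORDS)
    print(dict(categories))
    print(dict(languages))
    print(tokens)
```

The same functions should work on any iterable of dictionaries with these fields, so they could be applied to rows of the loaded dataset as well, at the cost of iterating it in plain Python.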
Provider:
Jcrandall541