Ethereum_blockchain_parquet
Dataset Overview
Basic Information
- License: GPL-3.0
- Language: English
- Creation method: machine-generated
- Tags:
  - ethereum-blockchain
  - parquet
- Download size: 20.9 GB
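Data Source
- The blockchain data was extracted to Parquet files using cryo, a Rust-based tool, together with the web3 API provided by Ankr.
- An Ankr account is required, even when using the free plan.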
Time Range
- Minimum timestamp: 2016-11-17 00:40:08
- Maximum timestamp: 2025-03-25 21:17:35
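The `timestamp` column in the blocks table is a UInt32 that the sample queries treat as Unix epoch seconds, so the bounds above may need converting before they can be used in a filter. A minimal sketch, assuming the timestamps are UTC (this card does not state a time zone); `to_epoch_seconds` is a hypothetical helper name, not part of the dataset or of cryo:

```python
from datetime import datetime, timezone

# Bounds copied from the time range above. Treating them as UTC is an assumption.
MIN_TS = "2016-11-17 00:40:08"
MAX_TS = "2025-03-25 21:17:35"


def to_epoch_seconds(text: str) -> int:
    """Parse 'YYYY-MM-DD HH:MM:SS' as UTC and return Unix epoch seconds."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


min_epoch = to_epoch_seconds(MIN_TS)
max_epoch = to_epoch_seconds(MAX_TS)
print(min_epoch, max_epoch)
```

Both values fit in a UInt32, so they can be compared directly against the `timestamp` column without casting.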
Table Structure
Blocks table (blocks/*.parquet)
```plaintext
Schema([
    (block_hash, Binary),
    (author, Binary),
    (block_number, UInt32),
    (gas_used, UInt64),
    (extra_data, Binary),
    (timestamp, UInt32),
    (base_fee_per_gas, UInt64),
    (chain_id, UInt64)
])
```
Transactions table (transactions/*.parquet)
```plaintext
Schema([
    (block_number, UInt32),
    (transaction_index, UInt64),
    (transaction_hash, Binary),
    (nonce, UInt64),
    (from_address, Binary),
    (to_address, Binary),
    (value_binary, Binary),
    (value_string, String),
    (value_f64, Float64),
    (input, Binary),
    (gas_limit, UInt64),
    (gas_used, UInt64),
    (gas_price, UInt64),
    (transaction_type, UInt32),
    (max_priority_fee_per_gas, UInt64),
    (max_fee_per_gas, UInt64),
    (success, Boolean),
    (n_input_bytes, UInt32),
    (n_input_zero_bytes, UInt32),
    (n_input_nonzero_bytes, UInt32),
    (chain_id, UInt64)
])
```
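The transactions table carries the transferred amount in three columns: `value_binary`, `value_string`, and `value_f64`. A Float64 cannot represent every wei amount exactly, so exact arithmetic would normally start from one of the other two. The sketch below assumes that `value_binary` holds the amount in wei as a big-endian unsigned integer; this card does not document the encoding, so it is worth checking a few rows against `value_string` first. `wei_from_binary` and `wei_to_ether` are hypothetical helper names.

```python
from decimal import Decimal

WEI_PER_ETHER = 10**18


def wei_from_binary(raw: bytes) -> int:
    """Decode bytes as a big-endian unsigned integer (assumed encoding)."""
    return int.from_bytes(raw, byteorder="big", signed=False)


def wei_to_ether(wei: int) -> Decimal:
    """Convert wei to ether using exact decimal arithmetic."""
    return Decimal(wei) / Decimal(WEI_PER_ETHER)


# Synthetic 32-byte value standing in for one `value_binary` cell (1.5 ETH).
raw = (15 * 10**17).to_bytes(32, byteorder="big")
wei = wei_from_binary(raw)
print(wei, wei_to_ether(wei))
```

Python integers are arbitrary precision, so the full 256-bit range decodes without overflow.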
Table Joins
- The blocks table and the transactions table can be joined on the block_number column.
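Sample Queries
Polars LazyFrame Examples
Blocks query
```python
import polars as pl


def sample_query_blocks(folder):
    # Lazily scan the blocks Parquet files; `folder` may be a glob pattern.
    q1 = (
        pl.scan_parquet(folder, glob=True)
        .with_columns([
            # Hex-encode the binary columns so they are human-readable.
            pl.col("block_hash").bin.encode("hex").alias("block_hash_encode"),
            pl.col("author").bin.encode("hex").alias("author_encode"),
            pl.col("extra_data").bin.encode("hex").alias("extra_data_encode"),
            # Convert the epoch-seconds timestamp to a datetime.
            pl.from_epoch(pl.col("timestamp"), time_unit="s").alias("timestamp"),
        ])
        # Drop the raw binary columns now that encoded copies exist.
        .drop("block_hash", "author", "extra_data")
        .limit(5)
    )
    return q1.collect()
```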
Transactions query
```python
import polars as pl


def sample_query_tx(folder):
    # Lazily scan the transactions Parquet files; `folder` may be a glob pattern.
    q1 = (
        pl.scan_parquet(folder, glob=True)
        .with_columns([
            # Hex-encode the binary address and hash columns.
            pl.col("from_address").bin.encode("hex").alias("from_address_encode"),
            pl.col("to_address").bin.encode("hex").alias("to_address_encode"),
            pl.col("transaction_hash").bin.encode("hex").alias("transaction_hash_encode"),
        ])
        # Keep only the columns of interest.
        .select(
            "block_number",
            "from_address_encode",
            "to_address_encode",
            "transaction_hash_encode",
            "value_f64",
            "gas_limit",
            "gas_used",
            "gas_price",
        )
        .limit(5)
    )
    return q1.collect()
```




