Jagged arrays in ROOT TTree, Parquet, and Avro

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14538339
Description:
This is a synthetic dataset of random numbers in variable-length, nested data structures in three file formats: ROOT TTree, Parquet, and Avro. There are four levels of depth:

jagged0: not nested; just a flat array of numbers
jagged1: an array of lists of numbers
jagged2: an array of lists of lists of numbers
jagged3: an array of lists of lists of lists of numbers

The TBasket sizes of the TTree files and the row group sizes of the Parquet files were made to be identical, so that performance can be meaningfully compared. All of the files are compressed with ZLIB level 9.

This dataset was first used in a performance study at CHEP 2019 (presentation page, published proceedings), but it has since been used in other studies, such as one at CHEP 2021 (presentation page, published proceedings) and one at ACAT 2022 (presentation page, preprint; will be published). It has become a standard performance benchmark.

The scripts that were used to create this synthetic dataset are in this repository directory, PR #19.

Just one file, zlib9-jagged0.avro, had to be excluded to fit in this Zenodo record, but it is the easiest one to reconstruct from the others.
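To make the four nesting depths concrete, here is a minimal Python sketch that uses plain lists. It is not the script that generated the dataset: the literal values, the helper names `nesting_depth`, `count_numbers`, and `make_jagged`, and the choice of list lengths between 0 and 3 are all assumptions made for illustration only.

```python
import random

# Hand-written examples of the four depths (illustrative values, not real data).
jagged0 = [1.1, 2.2, 3.3]                            # flat array of numbers
jagged1 = [[1.1, 2.2], [], [3.3]]                    # array of lists
jagged2 = [[[1.1], [2.2, 3.3]], [], [[]]]            # array of lists of lists
jagged3 = [[[[1.1, 2.2]], []], [[[3.3]]], []]        # array of lists of lists of lists


def nesting_depth(value):
    """Return the deepest level of list nesting; a bare number has depth 0."""
    if not isinstance(value, list):
        return 0
    return 1 + max((nesting_depth(item) for item in value), default=0)


def count_numbers(value):
    """Count the numbers at the leaves, however deeply they are nested."""
    if not isinstance(value, list):
        return 1
    return sum(count_numbers(item) for item in value)


def make_jagged(depth, n_outer, rng, max_len=3):
    """Hypothetical generator: n_outer entries, each nested `depth` levels deep."""
    def make(level):
        if level == 0:
            return rng.random()
        # Each list gets a random length, including possibly zero.
        return [make(level - 1) for _ in range(rng.randint(0, max_len))]
    return [make(depth) for _ in range(n_outer)]


# The outer array itself counts as one level, so "jaggedN" has depth N + 1 here.
depths = [nesting_depth(a) for a in (jagged0, jagged1, jagged2, jagged3)]
totals = [count_numbers(a) for a in (jagged0, jagged1, jagged2, jagged3)]
random_sample = make_jagged(2, 5, random.Random(0))
```

Each of the four hand-written examples holds the same three numbers; only the amount of list structure wrapped around them changes, and that structure is what the three file formats have to encode.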
Created:
2024-12-21