
zbst/pplx-e2b-re-data

Hugging Face | Updated 2026-03-21 | Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/zbst/pplx-e2b-re-data
Resource description:
---
license: mit
task_categories:
- text-classification
- feature-extraction
tags:
- security
- reverse-engineering
- e2b
- sandbox
- ebpf
- ndpi
pretty_name: "pplx-e2b-re Research Data"
size_categories:
- 1K<n<10K
---

# pplx-e2b-re Research Platform: Dataset

Research data from the Perplexity E2B sandbox reverse engineering project. Contains structured findings, probe results, embeddings, and configuration data collected across 12+ sessions.

## Dataset Structure

### Categories

| Category | Files | Rows | Description |
|----------|-------|------|-------------|
| **probes** | 18 | 931 | Sandbox environment probes: connectors, models, features, cookies, embeddings |
| **rundeck_probes** | 16 | 308 | Automated probe results: GitHub, Linear, HF, envd metrics, network peers |
| **overview** | 6 | 243 | Session overviews: master catalog, inventories, connectors list |
| **rundeck_state** | 7 | 163 | Rundeck operational state: manifests, compacts, toolkit, services |
| **rundeck_data** | 1 | 20 | Probe data: LLM API/SDK configurations |

### Key Datasets

- **findings_embeddings.parquet**: 162 research findings with 1536-dim HuggingFace embeddings for semantic search
- **eppo_idb_flags.parquet**: 395 feature flags from Perplexity's Eppo integration
- **models_config_v1.parquet**: 65 LLM model configurations (including unreleased models)
- **copilot_models.parquet**: 42 GitHub Copilot model definitions with capabilities
- **rootfs_scan.parquet**: Full sandbox root filesystem metadata (130 columns)

### Sandbox Context

| Field | Value |
|-------|-------|
| Platform | E2B Firecracker MicroVM |
| Kernel | 6.1.158 |
| OS | Debian 13 (trixie) |
| Template | `ij1plp1090o3fuzyuac0` |

## Usage

```python
import pyarrow.parquet as pq
import duckdb

# Load a parquet file
findings = pq.read_table("data/probes/findings_embeddings.parquet")

# Query with DuckDB
con = duckdb.connect()
result = con.sql("""
    SELECT category, count(*) as n
    FROM 'data/probes/findings_all.parquet'
    GROUP BY category
    ORDER BY n DESC
""").fetchall()
```

## License

MIT; see [repository](https://github.com/pv-udpv/pplx-e2b-re) for full details.
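The card describes `findings_embeddings.parquet` as intended for semantic search but does not show how a lookup might work. The following is a minimal sketch of cosine-similarity ranking, not the project's own method. It operates on plain Python lists because the card does not document the name of the embedding column; the helper names `cosine_similarity` and `top_k_similar` and the 3-dimensional toy vectors are hypothetical and used only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 if either vector has zero length (norm), so callers
    never hit a division by zero.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: Sequence[float],
    embeddings: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (row_index, score) pairs for the k rows most similar to query."""
    scored = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(embeddings)]
    # Highest similarity first; ties keep their original row order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 3-dimensional stand-ins for the 1536-dimensional vectors in the dataset.
    toy_embeddings = [
        [1.0, 0.0, 0.0],
        [0.9, 0.1, 0.0],
        [0.0, 1.0, 0.0],
    ]
    for row, score in top_k_similar([1.0, 0.0, 0.0], toy_embeddings, k=2):
        print(row, round(score, 3))
```

With the real file, the embedding vectors would first need to be read from whichever column holds them (for example via `pq.read_table(...).to_pylist()`), and the query would have to be embedded with the same model that produced the stored vectors. For larger collections a vectorized library would be a more practical choice than this pure-Python loop.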
Provider:
zbst