LHRS-UM-FERI/MENTHOS-dataset-rootcause

Source: Hugging Face. Updated: 2026-04-05. Indexed: 2026-04-12.
Download link:
https://hf-mirror.com/datasets/LHRS-UM-FERI/MENTHOS-dataset-rootcause
Resource description:
---
language:
- en
- sl
tags:
- menthos
- root-cause
- logs
- binary-classification
size_categories:
- 1K<n<10K
---

# MENTHOS-dataset-rootcause

## English

### About

MENTHOS-dataset-rootcause is a binary log classification dataset built for root-cause identification experiments.

### Source Data

- https://github.com/nv-morpheus/Morpheus/raw/refs/heads/branch-25.10/models/datasets/training-data/root-cause-training-data.csv
- https://github.com/nv-morpheus/Morpheus/raw/refs/heads/branch-25.10/models/datasets/validation-data/root-cause-validation-data-input.jsonlines
- https://github.com/nv-morpheus/Morpheus/raw/refs/heads/branch-25.10/models/datasets/training-data/root-cause-unseen-errors.csv

### Processing and Balancing

- Input sources are merged into a single dataframe (`label`, `log`).
- Split into 70% train, 15% validation, 15% test.
- Balancing: each split is balanced by downsampling both classes to the same count.

### Splits and Class Distribution

The prepared train, validation, and test splits are included with the dataset release.

| split      | rows | label 0 | label 1 |
| ---------- | ---: | ------: | ------: |
| train      | 1266 |     633 |     633 |
| validation |  296 |     148 |     148 |
| test       |  296 |     148 |     148 |

### Citation

```
@misc{borovic_li-dobnik_kranjec_ferme_2026,
  title = {MENTHOS-dataset-rootcause},
  author = {Borovic, Li Dobnik, Kranjec, Ferme},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/datasets/LHRS-UM-FERI/MENTHOS-dataset-rootcause}}
}
```

---

## Slovenščina

### O datasetu

MENTHOS-dataset-rootcause je binarni dataset zapisov za eksperimente zaznavanja koreninskih vzrokov.

### Izvorni podatki

- https://github.com/nv-morpheus/Morpheus/raw/refs/heads/branch-25.10/models/datasets/training-data/root-cause-training-data.csv
- https://github.com/nv-morpheus/Morpheus/raw/refs/heads/branch-25.10/models/datasets/validation-data/root-cause-validation-data-input.jsonlines
- https://github.com/nv-morpheus/Morpheus/raw/refs/heads/branch-25.10/models/datasets/training-data/root-cause-unseen-errors.csv

### Obdelava in uravnoteženje

- Vhodni viri se združijo v enoten format (`label`, `log`).
- Razdelitev 70% train, 15% validation, 15% test.
- Uravnoteženje: vsak split je uravnotežen z downsamplingom obeh razredov.

### Delitve in porazdelitev razredov

Pripravljene train, validation in test delitve so vključene v izdajo nabora podatkov.

| split      | vrstic | label 0 | label 1 |
| ---------- | -----: | ------: | ------: |
| train      |   1266 |     633 |     633 |
| validation |    296 |     148 |     148 |
| test       |    296 |     148 |     148 |

### Citiranje

```
@misc{borovic_li-dobnik_kranjec_ferme_2026,
  title = {MENTHOS-dataset-rootcause},
  author = {Borovic, Li Dobnik, Kranjec, Ferme},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/datasets/LHRS-UM-FERI/MENTHOS-dataset-rootcause}}
}
```
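
The card's "Processing and Balancing" steps (merge into a `label`/`log` dataframe, split 70/15/15, then downsample each split to equal class counts) could be sketched roughly as follows. This is an illustrative sketch, not the authors' script: the random, non-stratified shuffle, the seed, the helper names `balance_by_downsampling` and `split_and_balance`, and the toy dataframe are all assumptions, since the card does not say how the split was drawn.

```python
import pandas as pd


def balance_by_downsampling(df: pd.DataFrame, seed: int = 0) -> pd.DataFrame:
    """Downsample every class in `df` to the size of the smallest class."""
    n = df["label"].value_counts().min()
    parts = [group.sample(n=n, random_state=seed) for _, group in df.groupby("label")]
    # Shuffle again so the classes are interleaved in the result.
    return pd.concat(parts).sample(frac=1.0, random_state=seed).reset_index(drop=True)


def split_and_balance(df: pd.DataFrame, seed: int = 0) -> dict:
    """Shuffle, split 70/15/15, then balance each split independently.

    Assumption: the split is a plain random split done before balancing,
    which matches "each split is balanced" in the card but is not confirmed.
    """
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    n_train = int(0.70 * len(shuffled))
    n_val = int(0.15 * len(shuffled))
    raw = {
        "train": shuffled.iloc[:n_train],
        "validation": shuffled.iloc[n_train:n_train + n_val],
        "test": shuffled.iloc[n_train + n_val:],
    }
    return {name: balance_by_downsampling(part, seed) for name, part in raw.items()}


# Toy stand-in for the merged (`label`, `log`) dataframe; not the real data.
toy = pd.DataFrame({
    "label": [0] * 120 + [1] * 80,
    "log": [f"log line {i}" for i in range(200)],
})
splits = split_and_balance(toy)
for name, part in splits.items():
    print(name, len(part), part["label"].value_counts().to_dict())
```

Because balancing happens after the split, the final split sizes depend on the minority-class count within each split and need not match the nominal 70/15/15 ratio exactly, which is consistent with the row counts in the table above (1266 / 296 / 296).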

Provider:
LHRS-UM-FERI