
maximuspowers/muat-mean-std-large

Source: Hugging Face. Updated 2025-12-06. Indexed 2025-12-20.
Download link:
https://hf-mirror.com/datasets/maximuspowers/muat-mean-std-large
Description:
---
language: en
task_categories:
- text-generation
---

# Subject Models for Interpretability Training

These examples are intended for training an interpreter to:
- Identify what patterns a model classifies as positive based on an activation signature, with examples of: trained model + signature → pattern identification.

| Signature Extraction | |
|----------------------|---|
| Neuron Profile Methods | mean, std |
| Prompt Format | separate |
| Signature Dataset | configs/dataset_gen/signature_dataset.json |

| Model Architecture | |
|----------------------|---|
| Number of Layers | 8 to 10 |
| Neurons per Layer | 10 to 15 |
| Activation Types | relu, gelu |
| Pattern Vocab Size | 10 |
| Pattern Sequence Len | 5 |

| Training Datasets | |
|----------------------|---|
| Enabled Patterns | palindrome, sorted_ascending, sorted_descending, alternating, contains_abc, starts_with, ends_with, no_repeats, has_majority, increasing_pairs, decreasing_pairs, vowel_consonant, first_last_match, mountain_pattern |
| Patterns per Batch | 1-1 |
| Pos/Neg Ratio | 1:1 |
| Target Total Examples per Subject Model | 250 |

| Staged Training | |
|----------------------|---|
| Min Improvement Threshold | 0.05 (5.0%) |
| Corruption Rate | 0.15 (15.0%) |

## Token Count Statistics

| Task Type | Min Tokens | Max Tokens | Avg Tokens |
|-----------|------------|------------|------------|
| Classification | 7699 | 18864 | 12619.8 |

## Dataset Fields

| Field | Description |
|----------------------|---|
| example_id | Unique identifier for each example |
| metadata | JSON string containing: |
| | - `target_pattern`: The pattern that was corrupted during training |
| | - `degraded_accuracy`: Accuracy of the model trained on corrupted data |
| | - `improved_accuracy`: Accuracy of the model after training on clean data |
| | - `improvement`: Delta between degraded and improved accuracy |
| | - `model_config`: Subject model architecture and hyperparameters |
| | - `corruption_stats`: Details about label corruption |
| | - `selected_patterns`: All patterns in the subject model's training dataset |
| | - `precision`: Model weight precision |
| | - `quantization`: Quantization type applied to weights |
| | - `config_signature`: Hash of critical config fields for validation |
| classification_prompt | Input prompt with improved model weights and signature |
| classification_completion | Target completion identifying the pattern |
| classification_text | Full concatenated text (prompt + completion) |
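
Because `metadata` is stored as a JSON string rather than a nested object, it has to be decoded before its keys can be read. The following is a minimal sketch of that step, not the dataset author's tooling. The record is hypothetical: the key names come from the Dataset Fields table, but every value is made up for illustration. The sketch also assumes that `improvement` is computed as `improved_accuracy - degraded_accuracy`; the card only describes it as the delta between the two.

```python
import json

# Hypothetical record: key names follow the Dataset Fields table,
# but all values below are invented for illustration only.
record = {
    "example_id": "example-0",
    "metadata": json.dumps({
        "target_pattern": "palindrome",
        "degraded_accuracy": 0.62,
        "improved_accuracy": 0.91,
        "improvement": 0.29,
        "selected_patterns": ["palindrome"],
    }),
}


def parse_metadata(rec):
    """Decode the JSON-string `metadata` field into a dict."""
    return json.loads(rec["metadata"])


def improvement_is_consistent(meta, tol=1e-6):
    """Check the assumed relation: improvement == improved - degraded."""
    delta = meta["improved_accuracy"] - meta["degraded_accuracy"]
    return abs(delta - meta["improvement"]) <= tol


meta = parse_metadata(record)
print(meta["target_pattern"], improvement_is_consistent(meta))
```

The same two helpers could be applied to each row after loading the dataset with whatever loader is in use; a row that fails the consistency check would suggest the sign or definition of `improvement` differs from the assumption made here.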
Provider:
maximuspowers