Innovative-Process-Applications/roller-compaction-ribbon-density
Source: Hugging Face. Updated 2026-04-10; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/Innovative-Process-Applications/roller-compaction-ribbon-density
Resource description:
---
license: cc-by-4.0
language:
- en
pretty_name: "Roller Compaction: Ribbon Density vs. Process Parameters (Synthetic)"
size_categories:
- n<1K
task_categories:
- tabular-regression
tags:
- roller-compaction
- pharmaceutical-manufacturing
- design-of-experiments
- response-surface-methodology
- quality-by-design
- synthetic-data
- powder-processing
- heckel-equation
- johanson-model
- process-engineering
- education
configs:
- config_name: default
  data_files:
  - split: train
    path: "ribbon_density_v1.0.csv"
---
# Roller Compaction: Ribbon Density vs. Process Parameters (Synthetic)
**Version:** 1.0
**Publisher:** [Innovative Process Applications (IPA)](https://www.innovativeprocess.com)
**License:** Creative Commons Attribution 4.0 International (CC BY 4.0)
**Contact:** Crestwood, IL, USA
> ⚠️ **This dataset is 100% synthetic and intended for educational use only.**
> It was generated from a published physical model (Johanson rolling theory + Heckel densification) — not measured on any real equipment, customer, or production batch. Do not use it for regulatory submissions, equipment validation, or commercial process design.
---
## What's in this dataset
600 simulated roller compaction runs on a representative pharmaceutical excipient blend (microcrystalline cellulose / lactose), spanning the operating envelope of a lab-to-pilot scale twin-feed-screw roller compactor.
| Column | Units | Description |
|---|---|---|
| `run_id` | — | Unique run identifier |
| `roll_force_kN_per_cm` | kN/cm | Specific roll force (normalized by roll width) |
| `roll_speed_rpm` | rpm | Roll rotation speed |
| `feed_screw_rpm` | rpm | Twin feed screw rotation speed |
| `roll_gap_mm` | mm | Target ribbon thickness / roll gap |
| `peak_pressure_MPa` | MPa | Computed peak nip pressure (Johanson) |
| `ribbon_rel_density` | — | Relative density (fraction of true density) |
| `ribbon_density_g_cc` | g/cc | Absolute ribbon density |
| `ribbon_porosity` | — | 1 − relative density |
| `density_CV_percent` | % | Across-width density coefficient of variation (uniformity) |
| `throughput_kg_hr` | kg/hr | Mass throughput |
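
The column definitions above imply two simple identities that can be checked after loading: porosity should equal one minus relative density, and absolute density should equal relative density times the true density (1.55 g/cc, as quoted in the material properties of this card). The following is a minimal sketch of that check using only the standard library. The two sample rows, the `RC-001` style identifiers, the helper names, and the tolerance are all illustrative assumptions, not values or code from the dataset.

```python
import csv
import io
import math

# Two made-up rows in the documented column layout (NOT real dataset values).
SAMPLE_CSV = """run_id,roll_force_kN_per_cm,roll_speed_rpm,feed_screw_rpm,roll_gap_mm,peak_pressure_MPa,ribbon_rel_density,ribbon_density_g_cc,ribbon_porosity,density_CV_percent,throughput_kg_hr
RC-001,6.0,4.0,44.0,2.0,90.0,0.70,1.085,0.30,2.5,20.0
RC-002,10.0,6.0,60.0,2.5,140.0,0.76,1.178,0.24,2.1,35.0
"""

TRUE_DENSITY_G_CC = 1.55  # true density quoted in this card


def load_runs(handle):
    """Read runs from a CSV file handle; every column except run_id becomes a float."""
    rows = []
    for raw in csv.DictReader(handle):
        rows.append({k: (v if k == "run_id" else float(v)) for k, v in raw.items()})
    return rows


def inconsistent_runs(rows, tol=0.02):
    """Return run_ids whose derived columns disagree with the documented definitions.

    The tolerance is an assumption; the real file may need a different value.
    """
    bad = []
    for r in rows:
        porosity_ok = math.isclose(
            r["ribbon_porosity"], 1.0 - r["ribbon_rel_density"], abs_tol=tol
        )
        density_ok = math.isclose(
            r["ribbon_density_g_cc"],
            r["ribbon_rel_density"] * TRUE_DENSITY_G_CC,
            abs_tol=tol,
        )
        if not (porosity_ok and density_ok):
            bad.append(r["run_id"])
    return bad


# For the real file, replace the StringIO with open("ribbon_density_v1.0.csv", newline="").
runs = load_runs(io.StringIO(SAMPLE_CSV))
flagged = inconsistent_runs(runs)
```

Whether the published file satisfies both identities within a given tolerance depends on how the generator applies noise, so a non-empty result is something to investigate rather than proof of an error.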
## Realistic ranges (grounded in the literature)
The generator uses ranges consistent with published roller compaction studies on MCC/lactose blends:
- **Roll force:** 2–14 kN/cm (typical lab/pilot range)
- **Roll speed:** 1–12 rpm
- **Feed-to-roll speed ratio:** 3–30 (optimum ≈ 11 for twin feed screws on MCC)
- **Peak nip pressure:** ~30–200 MPa
- **Relative density:** 0.55–0.80 (ribbons below ~0.55 tend to crumble; above ~0.85 you over-compact and lose downstream granulation behavior)
- **Material:** true density 1.55 g/cc, bulk density 0.45 g/cc, Heckel K ≈ 0.018 MPa⁻¹
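
The ranges above can serve as a quick sanity filter on individual runs. The sketch below assumes that the feed-to-roll speed ratio is simply `feed_screw_rpm / roll_speed_rpm`, which the card does not state explicitly. The helper names and the `ENVELOPE` dictionary are hypothetical. Peak pressure and relative density are left out because their ranges are quoted as approximate and the density carries added noise.

```python
# Ranges quoted in this card; the dictionary layout and key names are illustrative.
ENVELOPE = {
    "roll_force_kN_per_cm": (2.0, 14.0),
    "roll_speed_rpm": (1.0, 12.0),
    "feed_to_roll_ratio": (3.0, 30.0),
}


def feed_to_roll_ratio(run):
    """ASSUMPTION: the ratio is feed screw speed divided by roll speed."""
    return run["feed_screw_rpm"] / run["roll_speed_rpm"]


def out_of_envelope(run):
    """Return the names of quantities that fall outside the quoted ranges."""
    values = dict(run, feed_to_roll_ratio=feed_to_roll_ratio(run))
    return [
        name
        for name, (low, high) in ENVELOPE.items()
        if not low <= values[name] <= high
    ]


# Made-up run near the quoted optimum ratio of about 11.
example_run = {"roll_force_kN_per_cm": 6.0, "roll_speed_rpm": 4.0, "feed_screw_rpm": 44.0}
violations = out_of_envelope(example_run)
```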
## Physical model
The synthetic data is generated from:
1. **Johanson rolling theory (1965)** — peak nip pressure as a function of roll force, gap, and roll geometry.
2. **Heckel equation** — relative density as a function of applied pressure: `ln(1/(1−D)) = K·P + A`.
3. **Twin feed screw effect** — a Gaussian optimality response centered on feed/roll ratio ≈ 11, penalizing both starved and over-fed nip conditions. This reflects IPA's twin-feed-screw design advantage for maintaining uniform nip feeding.
4. **Realistic measurement noise** (~1.5% on density, proportional noise on uniformity and throughput).
Full generator source is in `generate_dataset.py` — reproducible with seed 42.
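
To make steps 2 and 3 concrete, here is a rough sketch of the Heckel inversion and a Gaussian feed-ratio response. It is not the code in `generate_dataset.py`. The intercept `A` is assumed to come from the initial relative density (bulk density over true density), the Gaussian width is an arbitrary placeholder, and the card does not say how the feed response enters the density calculation, so the two functions are kept separate.

```python
import math

HECKEL_K = 0.018       # MPa^-1, quoted in this card
TRUE_DENSITY = 1.55    # g/cc, quoted in this card
BULK_DENSITY = 0.45    # g/cc, quoted in this card
OPTIMUM_RATIO = 11.0   # feed/roll ratio optimum, quoted in this card

# ASSUMPTION: A is set so that zero pressure gives the bulk relative density.
HECKEL_A = math.log(1.0 / (1.0 - BULK_DENSITY / TRUE_DENSITY))
# ASSUMPTION: the width of the Gaussian response is not given in the card.
RATIO_WIDTH = 6.0


def heckel_relative_density(pressure_mpa, k=HECKEL_K, a=HECKEL_A):
    """Solve ln(1/(1-D)) = K*P + A for the relative density D."""
    return 1.0 - math.exp(-(k * pressure_mpa + a))


def feed_efficiency(ratio, optimum=OPTIMUM_RATIO, width=RATIO_WIDTH):
    """Gaussian response: 1 at the optimum ratio, decreasing on either side."""
    return math.exp(-0.5 * ((ratio - optimum) / width) ** 2)


density_at_100_mpa = heckel_relative_density(100.0)
penalty_when_starved = feed_efficiency(4.0)
```

With these assumed constants, the plain Heckel form gives relative densities above the 0.55–0.80 range at the upper end of the quoted pressure range, so the actual generator presumably uses a different intercept, an effective pressure, or a cap. Treat the sketch as an illustration of the equations, not a way to reproduce the CSV.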
## What you can teach with it
- **DOE / Response Surface Methodology:** fit a quadratic model to ribbon density as a function of roll force, roll speed, and feed screw speed.
- **Process optimization:** find the operating window that maximizes density while keeping CV% under a target (e.g., < 3%).
- **Regression & ML:** compare linear regression, random forests, and Gaussian processes on a small-but-physical dataset.
- **Quality-by-Design (QbD):** illustrate design space, critical process parameters (CPPs), and critical quality attributes (CQAs).
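
As a starting point for the first two exercises, the sketch below fits a full second-order response surface by ordinary least squares with NumPy and picks the densest run under a CV limit. It runs on made-up toy data, not the dataset. The function names, the feed screw range of 10 to 120 rpm, and the toy coefficients are hypothetical. Factors are coded to [-1, 1] first, which is common DOE practice and keeps the least-squares problem well conditioned.

```python
import numpy as np


def code_factors(X, lows, highs):
    """Scale each factor so that its low/high levels map to -1/+1."""
    X = np.asarray(X, dtype=float)
    lows = np.asarray(lows, dtype=float)
    highs = np.asarray(highs, dtype=float)
    return (X - (highs + lows) / 2.0) / ((highs - lows) / 2.0)


def quadratic_design_matrix(X):
    """Columns: intercept, linear terms, two-factor interactions, pure squares."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i in range(k) for j in range(i + 1, k)]
    cols += [X[:, i] ** 2 for i in range(k)]
    return np.column_stack(cols)


def fit_quadratic(X, y):
    """Least-squares coefficients of the second-order model."""
    coef, *_ = np.linalg.lstsq(
        quadratic_design_matrix(X), np.asarray(y, dtype=float), rcond=None
    )
    return coef


def best_under_cv_limit(density, cv_percent, cv_limit=3.0):
    """Index of the densest run with CV below the limit, or None if there is none."""
    density = np.asarray(density, dtype=float)
    cv_percent = np.asarray(cv_percent, dtype=float)
    ok = np.flatnonzero(cv_percent < cv_limit)
    if ok.size == 0:
        return None
    return int(ok[np.argmax(density[ok])])


# Toy demonstration: noiseless response from known (made-up) coefficients.
rng = np.random.default_rng(0)
lows, highs = [2.0, 1.0, 10.0], [14.0, 12.0, 120.0]  # force, roll rpm, feed rpm
X = rng.uniform(lows, highs, size=(60, 3))
coded = code_factors(X, lows, highs)
true_coef = np.array([0.70, 0.06, -0.01, 0.02, 0.0, 0.01, 0.0, -0.02, 0.0, -0.03])
y = quadratic_design_matrix(coded) @ true_coef
fitted = fit_quadratic(coded, y)
```

On the real data, `X` would hold `roll_force_kN_per_cm`, `roll_speed_rpm`, and `feed_screw_rpm`, and `y` would be `ribbon_rel_density`. The fitted coefficients will not match exactly because the response includes noise and is not truly quadratic.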
## Cross-links (other places to find this dataset)
- **Kaggle:** https://www.kaggle.com/innovativeprocapps
- **Hugging Face Datasets:** [link after publication]
- **Zenodo (DOI):** https://zenodo.org/records/19500776
- **GitHub:** https://github.com/Innovative-Process-Applications/ipa-datasets/tree/main/ipa-rc-dataset-v1.0
- **IPA website:** https://www.innovativeprocess.com
## About IPA
Innovative Process Applications designs and manufactures twin-feed-screw roller compactors, mills, and size-reduction equipment for the pharmaceutical, nutraceutical, chemical, and food industries. Based in Crestwood, Illinois, IPA is a direct OEM alternative to legacy Fitzpatrick Chilsonator and FitzMill systems, with American manufacturing and direct engineer access. Learn more at [innovativeprocess.com](https://www.innovativeprocess.com).
## Citation
If you use this dataset in teaching, a notebook, a paper, or a blog post, please cite:
> Innovative Process Applications (2026). *Roller Compaction: Ribbon Density vs. Process Parameters (Synthetic), v1.0*. CC BY 4.0. https://www.innovativeprocess.com
## Version history
- **v1.0** (April 2026) — Initial release. 600 runs, 4 process parameters, 6 response variables.
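Provider:
Innovative-Process-Applications
Collected summary
Dataset introduction

Construction
In pharmaceutical engineering, precise control of process parameters is critical to product quality. This dataset was generated synthetically from Johanson rolling theory and the Heckel densification equation, simulating 600 runs of a lab-to-pilot scale twin-feed-screw roller compactor. The generator combines key process parameters such as roll force, roll speed, and feed screw speed, applies a Gaussian optimality response to reflect the uniform-feeding advantage of the twin-feed-screw design, and adds roughly 1.5% measurement noise on density, so the data stays physically grounded, realistic, and reproducible.
Features
The dataset focuses on roller compaction in pharmaceutical manufacturing. It is entirely synthetic and designed for educational use. It covers a representative operating range for a microcrystalline cellulose / lactose blend, with four process parameters (including roll force, roll speed, and feed screw speed) and six response variables (including peak pressure, ribbon relative density, and the density coefficient of variation). The ranges follow typical literature values, such as a roll force of 2–14 kN/cm and a relative density of 0.55–0.80, which keeps the data physically plausible and useful for teaching statistical methods such as design of experiments and response surface methodology.
Usage
The dataset suits teaching and research in pharmaceutical engineering and process optimization. It can be used for design of experiments and response surface methodology, fitting a quadratic model of ribbon density against the process parameters. It can also be used for process optimization, finding the operating window that maximizes ribbon density while keeping the density coefficient of variation below a target (for example, 3%). It further supports comparing regression and machine learning models such as linear regression, random forests, and Gaussian processes, and helps illustrate design space and critical process parameter analysis in Quality-by-Design.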
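Background and challenges
Background overview
In continuous pharmaceutical manufacturing, roller compaction is the core step of dry granulation, and optimizing it is critical to drug product quality. This synthetic dataset was released by Innovative Process Applications in 2026. It is built on Johanson rolling theory and the Heckel densification model and simulates the compaction behavior of a microcrystalline cellulose and lactose blend. It focuses on how key process parameters such as roll force, roll speed, and feed screw speed affect ribbon density and uniformity, and it is intended to provide data for Quality-by-Design teaching and for process modeling and optimization work in pharmaceutical engineering.
Current challenges
The dataset addresses ribbon density prediction and process parameter optimization in pharmaceutical roller compaction. The core difficulty is building an accurate model of a multivariate, nonlinear relationship in the face of complex material properties and a narrow process window. In constructing the data, the main challenge was combining the physical model with realistic noise, so that the synthetic data follows the theoretical framework while resembling actual measurement error. The simplifications needed for educational use also had to be balanced against the physical realism of industrial settings, so that the data is not so idealized that it loses its value for teaching and model validation.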
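Common scenarios
Classic use case
Roller compaction is a key process in producing solid dosage forms, and this dataset uses synthetic data to model the relationship between process parameters and ribbon density. Its classic use is teaching design of experiments and response surface methodology: users can fit a quadratic model to the data, analyze the nonlinear effects of roll force, roll speed, and feed screw speed on ribbon density, and then use the model to locate an operating window that meets a density specification.
Practical applications
Although synthetic, the dataset is generated over realistic process parameter ranges and can be used safely for training and simulation in the pharmaceutical industry. Engineers can use it to explore the process window and identify critical process parameters such as roll force and feed ratio, maximizing ribbon density while controlling the density coefficient of variation. This lets them practice developing operating strategies for lab-to-pilot scale equipment without the resource cost of trial and error on a real line.
Derived and related work
Work derived from this dataset sits mainly at the intersection of process engineering and data science. Examples include a Gaussian-process-based process optimization framework that predicts ribbon density and quantifies uncertainty, and teaching cases on quality attribute control, such as visualizing a design space in a statistical process control course and comparing the predictive accuracy of random forests and linear regression on pharmaceutical data. These uses promote the application of synthetic data in engineering education.
The above content was collected and summarized by 遇见数据集.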



