Palmer Penguins 100k
Source: ieee-dataport.org (indexed 2025-03-26)
Download link:
https://ieee-dataport.org/documents/palmer-penguins-100k-0
Resource description:
To give machine learning and data science experts a larger dataset for model training, the well-known Palmer Penguins dataset has been expanded from its original 344 rows to 100,000 rows. The additional rows are synthetic, generated with an adversarial random forest technique that aims to preserve the key patterns and features of the original data. The method is reported to achieve an accuracy of 88%, which is offered as evidence that the expanded dataset remains realistic and suitable for classification tasks. Users can explore more complex models, develop more nuanced classifiers, and run broader experiments with penguin data than the small original dataset allowed, including model performance testing, more detailed training procedures, and wider feature exploration. The stated aim of the expansion is to foster innovation and deeper insight within the machine learning community.
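As a rough illustration of the classification use mentioned in the description, the sketch below fits a nearest-centroid baseline using only the Python standard library. It makes several assumptions that the listing does not confirm: that the download is a CSV file, and that its columns keep the names used in the original Palmer Penguins data (`species`, `bill_length_mm`, `bill_depth_mm`, `flipper_length_mm`, `body_mass_g`). The nearest-centroid classifier is a stand-in chosen for brevity; it is not the adversarial random forest method used to generate the data, and its accuracy is unrelated to the 88% figure quoted above.

```python
import csv
from collections import defaultdict

# Assumed column names, borrowed from the original Palmer Penguins data;
# the 100k file may name them differently.
FEATURES = ["bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"]
LABEL = "species"


def load_rows(handle):
    """Read (features, label) pairs from a CSV handle, skipping incomplete rows."""
    rows = []
    for record in csv.DictReader(handle):
        try:
            features = [float(record[name]) for name in FEATURES]
        except (KeyError, ValueError, TypeError):
            continue  # missing column, "NA", or empty cell
        if any(value != value for value in features):
            continue  # drop NaN values
        label = record.get(LABEL)
        if label:
            rows.append((features, label))
    return rows


def fit_centroids(rows):
    """Return per-feature scales and one scaled mean vector per species."""
    n = len(FEATURES)
    lows = [min(f[i] for f, _ in rows) for i in range(n)]
    highs = [max(f[i] for f, _ in rows) for i in range(n)]
    # Range scaling keeps body mass (grams) from dominating the distance.
    scales = [(hi - lo) or 1.0 for lo, hi in zip(lows, highs)]
    sums = defaultdict(lambda: [0.0] * n)
    counts = defaultdict(int)
    for features, label in rows:
        counts[label] += 1
        for i, value in enumerate(features):
            sums[label][i] += value / scales[i]
    centroids = {
        label: [total / counts[label] for total in sums[label]] for label in sums
    }
    return scales, centroids


def predict(features, scales, centroids):
    """Return the species whose centroid is closest in scaled feature space."""
    scaled = [value / scale for value, scale in zip(features, scales)]
    return min(
        centroids,
        key=lambda label: sum(
            (a - b) ** 2 for a, b in zip(scaled, centroids[label])
        ),
    )


def accuracy(rows, scales, centroids):
    """Fraction of rows whose predicted species matches the recorded one."""
    hits = sum(predict(f, scales, centroids) == label for f, label in rows)
    return hits / len(rows)


# Usage, with a hypothetical file name for the downloaded CSV:
#     with open("penguins_100k.csv", newline="") as handle:
#         rows = load_rows(handle)
#     train, test = rows[::2], rows[1::2]
#     scales, centroids = fit_centroids(train)
#     print(accuracy(test, scales, centroids))
```

One way to probe how realistic the synthetic rows are would be to fit on the 100k file and score on the 344 original rows, since a large gap between the two accuracies would suggest the synthetic data drifts from the real measurements.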
Provided by:
IEEE Dataport



