Open-Source ML Products
Source: arXiv | Updated: 2023-08-08 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2308.04328v1
Resource description:
This dataset, created by researchers from Carnegie Mellon University and other institutions, comprises 262 open-source machine learning products curated from GitHub. The products span machine learning models of various types and purposes, and the dataset is intended as a resource for academic research and education. The products exhibit diverse development practices and architectural decisions, which gives future research ample material to work with. The dataset also shows that industry best practices such as model testing and pipeline automation are lacking in open-source machine learning products, leaving open how these practices affect product development and end-user experience.
Provided by:
Carnegie Mellon University
Created:
2023-08-08



