
Finance Starter Pack

Snowflake | Updated 2024-03-12 | Indexed 2024-05-01
Download link:
https://app.snowflake.com/marketplace/listing/GZ2FQZS74R6
Resource description:
The Gradient Finance Starter Pack can be used to support foundational LLM development, adding insight into financial concepts, regulations, investments, and more. The training data is pre-cleaned and designed to accelerate your foundational model on domain-specific knowledge in finance, supporting investors, insurers, and banking institutions. Use the training data to help build:

- Finance LLMs From Scratch
- Finance Co-Pilots
- AI Workflows Designed for Finance
- Etc.

Gradient is a leading innovator in building financial models, including 1) Albatross, Gradient's proprietary domain-specific model for financial services, and 2) Alphatross, an earlier version of Gradient's Albatross model with limited capabilities, made available on Hugging Face.

Dataset examples:

```python
# Here we plot a reliability diagram with custom bins (10 equally spaced).
# Since we have only 100 data points it is hard to see much of a pattern
# except that most points seem above the line.
plt.figure(figsize=(10,4))
mli.plot_reliability_diagram(yvec,xvec, bins=np.linspace(0,1,11), show_histogram=True);
```

![png](output_35_0.png)

```python
# Try SplineCalib with the default settings
sc4 = mli.SplineCalib()
sc4.fit(xvec, yvec)
```

```python
# Here is the resulting calibration curve
sc4.show_calibration_curve()
```

![png](output_37_0.png)

```python
# The curve does not go to [0,0], let's investigate...
sc4.calibrate(np.array([0,1e-16,1e-10,1e-9,1e-8,1e-7,1e-6,1e-5,1e-4,1e-3,1e-2]))
```

    array([0.11966725, 0.11966725, 0.11966725, 0.11966725, 0.11966725,
           0.11966725, 0.11966725, 0.11966725, 0.11966725, 0.11966725,
           0.20690302])

```python
sc4.logodds_eps, np.min(xvec)
```

    (0.001, 0.004695476192547066)

Dataset: (text)
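In the last dataset example, every input at or below 1e-3 calibrates to the same value (0.11966725) and `logodds_eps` is reported as 0.001. That pattern is consistent with probabilities being clipped before a log-odds transform. The sketch below illustrates the idea only; it assumes this is what `logodds_eps` controls, the helper name `clipped_logodds` is hypothetical, and none of it is taken from the `ml_insights` source.

```python
import numpy as np

def clipped_logodds(p, eps=0.001):
    """Hypothetical helper: clip probabilities to [eps, 1 - eps], then take log-odds.

    Assumption: this mirrors the role of `logodds_eps` seen in the excerpt.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return np.log(p / (1.0 - p))

# A subset of the inputs used in the excerpt
inputs = np.array([0, 1e-16, 1e-10, 1e-6, 1e-3, 1e-2])
scores = clipped_logodds(inputs)

# Inputs at or below eps collapse to one log-odds value, so any curve
# fitted on that scale returns the same output for all of them.
print(scores)
```

Under this assumption, the curve cannot reach [0, 0] because no input is ever evaluated below the clipping threshold.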
Provider:
Gradient
Created:
2024-03-07
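Curated Summary
Dataset Introduction
Background and Challenges
Background Overview
This dataset provides pre-cleaned training data for LLM development in the finance domain, covering financial concepts, regulations, investments, and related knowledge. It supports building finance-specific models and AI workflows, and includes resources on specialized finance-domain models developed by Gradient, such as Albatross.
The above content was collected and summarized by 遇见数据集.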