
Synthetic Spanish Translator Invoices Datapack

Source: Snowflake. Updated: 2024-09-13. Indexed: 2024-09-14.
Download link:
https://app.snowflake.com/marketplace/listing/GZ1MOZ7BX2P
Description:
This datapack consists of synthetic invoices from Spanish-language translation services, crafted for machine learning model training in invoice automation and document processing. The invoices include detailed service descriptions, hourly rates, and total costs, all presented in Spanish. Created in a 3D environment, these documents replicate real-world conditions such as document wear and tear, ensuring models can handle diverse scenarios. Field-level annotations provide exact details, making this dataset ideal for those building automated financial systems for the translation and service industries.

This datapack includes three tables: ANNOTATION_VIEW, IMAGE_VIEW, and ZIP_VIEW.

**ANNOTATION_VIEW** contains information for each annotation field including the name of the field, the text within the field, 4 corner coordinates of the field in clockwise order, and the name of the image this annotation belongs to.

**IMAGE_VIEW** contains information for each image including its name, its size, its URL, and the coordinates of the document corners in the image.

**ZIP_VIEW** contains the URL to download the zip file containing all images and annotations in the format of Mindtech, ICDAR2015 and Wildreceipt.

Please contact Mindtech for the full datapack.
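As a rough illustration of how one ANNOTATION_VIEW row might be handled in code, the sketch below models a single annotation and converts it to a line in the commonly used ICDAR2015 ground-truth layout (`x1,y1,x2,y2,x3,y3,x4,y4,transcription`). The listing does not publish column names or coordinate types, so the `Annotation` class, its field names, the sample values, and the integer rounding are all assumptions made for illustration, not the datapack's actual schema.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Annotation:
    # Hypothetical field names; the real ANNOTATION_VIEW columns may differ.
    field_name: str        # name of the annotated field
    text: str              # text inside the field
    corners: List[Point]   # four (x, y) corners, clockwise, in pixels
    image_name: str        # image this annotation belongs to


def is_clockwise(corners: List[Point]) -> bool:
    """Check clockwise order in image coordinates (y axis pointing down).

    Uses the shoelace sum; with y pointing down, a visually clockwise
    polygon yields a positive signed area.
    """
    total = 0.0
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total > 0


def to_icdar2015_line(ann: Annotation) -> str:
    """Format an annotation as 'x1,y1,x2,y2,x3,y3,x4,y4,transcription'."""
    if len(ann.corners) != 4:
        raise ValueError("expected exactly four corner points")
    # Rounding to integers is an assumption; keep floats if your tooling allows.
    coords = ",".join(str(int(round(v))) for pt in ann.corners for v in pt)
    return f"{coords},{ann.text}"


# Made-up sample row, not taken from the datapack.
sample = Annotation(
    field_name="total",
    text="Total",
    corners=[(10, 20), (110, 20), (110, 50), (10, 50)],
    image_name="invoice_0001.png",
)
line = to_icdar2015_line(sample)
```

Checking the corner order before export is a cheap way to catch rows whose coordinates were loaded in the wrong column order, since the description states that corners are listed clockwise.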

Provider:
Mindtech Global
Created:
2024-09-13