Service-Oriented Architecture for automatic markup of documents. An use case for legal documents

Name: Service-Oriented Architecture for automatic markup of documents. An use case for legal documents
Creator: International Federation of Library Associations and Institutions
Published: 2025-11-19 21:39:43
License: No description available

IFLA Repository. Updated 2025-11-19. Indexed 2026-05-16.

Download link:

https://repository.ifla.org/items/d5c654a8-bded-4783-8bec-c8913ce931dd

Resource description:

The problem of information extraction and automatic markup of plain text into XML has been partially solved in the specific domain of legal documents. Techniques such as named entity recognition and hierarchy detection of text sections, among others, have made it possible to partially identify and retrieve different kinds of information inside unstructured documents. In this paper we introduce the different interconnected components, the NLP techniques used in each component, and the workflow needed to process a plain text document and generate a new, fully marked-up XML version of it. The generated XML complies with the Akoma Ntoso legal schema standard and is highly enriched with named entities, semantic URIs, structural sections, lists, and element sequences, among others. As a use case we analyze the experience of the Library of Congress of Chile in the context of the 'History of Law' project and Parliamentary Labor, where this architecture played a key role in achieving the final product and the results of processing and marking up the different types or models of documents used in the legislative process.
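
The workflow described in the abstract (detect section hierarchy, recognize named entities, emit enriched XML) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the `Article N.` heading pattern, the one-entry gazetteer that stands in for a real named entity recognizer, the `#bcn` identifier, and the function names. The element names are loosely modelled on Akoma Ntoso, and the output is not claimed to validate against the schema.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical gazetteer standing in for a real named entity recognizer.
# The "#bcn" identifier is invented for this example.
ENTITIES = {"Library of Congress of Chile": "#bcn"}
ENTITY_RE = re.compile("|".join(re.escape(name) for name in ENTITIES))

# Hypothetical heading pattern: a line such as "Article 1." opens a section.
ARTICLE_RE = re.compile(r"^Article\s+(\d+)\.?\s*(.*)$")


def mark_entities(parent, text):
    """Add text to parent, wrapping gazetteer matches in <organization>."""
    pos = 0
    last = None  # most recent child; text after it belongs in its tail
    for match in ENTITY_RE.finditer(text):
        before = text[pos:match.start()]
        if last is None:
            parent.text = (parent.text or "") + before
        else:
            last.tail = (last.tail or "") + before
        last = ET.SubElement(
            parent, "organization", refersTo=ENTITIES[match.group()]
        )
        last.text = match.group()
        pos = match.end()
    rest = text[pos:]
    if last is None:
        parent.text = (parent.text or "") + rest
    else:
        last.tail = (last.tail or "") + rest


def markup(plain_text):
    """Turn plain text into a small Akoma Ntoso-like element tree."""
    root = ET.Element("akomaNtoso")
    body = ET.SubElement(ET.SubElement(root, "act"), "body")
    content = None
    for line in plain_text.splitlines():
        line = line.strip()
        if not line:
            continue
        heading = ARTICLE_RE.match(line)
        if heading:
            # Hierarchy detection: each heading starts a new <article>.
            number = heading.group(1)
            article = ET.SubElement(body, "article", eId=f"art_{number}")
            ET.SubElement(article, "num").text = f"Article {number}"
            content = ET.SubElement(article, "content")
            line = heading.group(2)
            if not line:
                continue
        # Text before the first heading is attached directly to <body>.
        container = content if content is not None else body
        mark_entities(ET.SubElement(container, "p"), line)
    return root


sample = (
    "Article 1. The Library of Congress of Chile publishes the record.\n"
    "Article 2.\n"
    "It enters into force today."
)
print(ET.tostring(markup(sample), encoding="unicode"))
```

A production pipeline of the kind the abstract describes would replace the gazetteer with a trained recognizer and the single regular expression with rules for each document model; the sketch only shows how the structural and entity passes can feed one XML tree.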

Provider:

International Federation of Library Associations and Institutions

Created:

2025-09-24
