Dataset and additional files/softwares required for the paper "LeSICiN: A Heterogeneous Graph-based Approach for Automatic Legal Statute Identification from Indian Legal Documents"

Zenodo, updated 2021-12-28, indexed 2026-04-07
Download link:
https://zenodo.org/record/5806910
Description:
This dump contains all the files and software required to run the code for the paper "LeSICiN: A Heterogeneous Graph-based Approach for Automatic Legal Statute Identification from Indian Legal Documents". The code itself is available at https://github.com/Law-AI/LeSICiN. LeSICiN is a deep neural network for the task of Legal Statute Identification that also uses graph properties of the document-statute citation network for training and prediction. There are three datasets: train, dev and test. Each is a .jsonl file with one instance dict per line; each instance dict contains the unique id, the list of sentences and the cited labels of that instance. A fourth file, secs.jsonl, stores the text of all the statutes in a similar format. schemas.json lists the metapath schemas for fact-type and section-type nodes, while type_map.json maps the id of each node to its type (Act/Chapter/Topic/Section/Fact). label_tree.json and citation_network.json list the edges of the two parts of the network, each edge being a 3-tuple ('source id', 'relationship type', 'target id'). "ils2v.bin" is the pretrained sent2vec vectorizer, which generates a 200-dimensional vector for each sentence.
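The file layout described above can be read with a few lines of Python. The following is a minimal sketch, not the authors' loading code: the function names are hypothetical, and it assumes that the edge files are JSON arrays of 3-element arrays and that type_map.json is a flat object from node id to type. The key names inside each instance dict are not stated here, so the sketch does not rely on them.

```python
import json
from collections import Counter
from pathlib import Path


def read_jsonl(path):
    """Read a .jsonl file: one JSON object (instance dict) per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                records.append(json.loads(line))
    return records


def read_edges(path):
    """Read edges as (source id, relationship type, target id) tuples.

    ASSUMPTION: the file holds a JSON array of 3-element arrays.
    """
    with Path(path).open(encoding="utf-8") as handle:
        raw = json.load(handle)
    edges = []
    for item in raw:
        if len(item) != 3:
            raise ValueError(f"expected a 3-tuple, got {item!r}")
        source, relation, target = item
        edges.append((source, relation, target))
    return edges


def count_node_types(path):
    """Count nodes per type (Act/Chapter/Topic/Section/Fact).

    ASSUMPTION: the file is a flat JSON object {node id: type}.
    """
    with Path(path).open(encoding="utf-8") as handle:
        type_map = json.load(handle)
    return Counter(type_map.values())
```

Assuming the dataset files are named after their splits (for example train.jsonl), calling read_jsonl on one of them and inspecting the keys of the first record is a quick way to confirm the actual field names before writing further processing code.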
Provided by:
Goyal, Pawan; Ghosh, Saptarshi; Paul, Shounak
Created:
2021-12-28