
nlp-waseda/e_gov

Source: Hugging Face. Updated 2025-01-14; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/nlp-waseda/e_gov
Description:
This is a dataset of Japanese laws obtained from e-Gov. Each entry has two fields: text, which holds the legal text, and metadata, which holds additional information in nine subfields. The subfields named here are Era (the Japanese era in which the law was promulgated), Lang (the language of the text, always Japanese), LawType (the type of law: Constitution, Act, CabinetOrder, ImperialOrder, MinisterialOrdinance, Rule, or Misc), Year (the year of promulgation), PromulgateMonth/Day (the month and day of promulgation), LawNum (the numeric name of the law), and category_id (an integer identifying the law's category). The dataset is randomly split into train, validation, and test sets at a ratio of 80%:10%:10%, preserving the original distribution of categories.
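As a rough illustration of what an 80%:10%:10% split that preserves the category distribution can look like, here is a minimal Python sketch. It is not the maintainers' actual procedure, which the description does not give. The function name stratified_split, the record values, and the choice of five categories are all hypothetical placeholders; only the field names text, metadata, LawType, and category_id come from the description above.

```python
import random
from collections import Counter, defaultdict


def stratified_split(records, ratios=(8, 1, 1), seed=0):
    """Split records into train/validation/test, one category at a time.

    Hypothetical helper: shuffles each category_id group separately so that
    every split keeps roughly the same category proportions as the input.
    """
    rng = random.Random(seed)
    total = sum(ratios)

    # Group records by the category stored in the metadata field.
    by_category = defaultdict(list)
    for record in records:
        by_category[record["metadata"]["category_id"]].append(record)

    splits = {"train": [], "validation": [], "test": []}
    for category_id in sorted(by_category):
        group = by_category[category_id]
        rng.shuffle(group)
        # Integer arithmetic avoids floating-point rounding surprises.
        n_train = len(group) * ratios[0] // total
        n_valid = len(group) * ratios[1] // total
        splits["train"].extend(group[:n_train])
        splits["validation"].extend(group[n_train:n_train + n_valid])
        splits["test"].extend(group[n_train + n_valid:])
    return splits


# Synthetic placeholder records shaped like the description (not real data).
records = [
    {
        "text": f"placeholder law text {i}",
        "metadata": {"LawType": "Act", "category_id": i % 5},
    }
    for i in range(1000)
]

splits = stratified_split(records)
train_counts = Counter(r["metadata"]["category_id"] for r in splits["train"])
print({name: len(rows) for name, rows in splits.items()})
print(sorted(train_counts.items()))
```

Splitting each category separately is what keeps the proportions stable; shuffling the whole dataset once and slicing it would only preserve them approximately.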
Provider:
nlp-waseda