
Mr-Dintov/Legal-Contract-Clause-Risk-Corpus

Source: Hugging Face. Updated 2026-04-07. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/Mr-Dintov/Legal-Contract-Clause-Risk-Corpus
Description:
---
license: cc-by-4.0
task_categories:
- text-classification
- token-classification
- question-answering
language:
- en
tags:
- legal
pretty_name: Legal Contract Clause Risk Corpus (Sample)
size_categories:
- n<1K
---

# Legal Contract Clause Risk Corpus (Sample)

This repository contains a representative sample of the Legal Contract Clause Risk Corpus. The full dataset is available upon request.

Synthetic legal dataset engineered for clause-level contract risk classification, fine-tuning, and legal NLP research. Built around US Delaware, UK English Law, and ICC International jurisdictions.

## What this sample covers

Each entry is built around a single contract clause type and contains two structured versions: a dangerous version and a balanced market-standard version. Annotation operates at the phrase level, not the clause level.

Each entry includes:

- Clause text in dangerous and safe variants
- Phrase-level danger annotations with severity classification
- Risk score and asymmetry score
- Financial exposure estimates across contract size tiers
- Dispute probability
- Cross-jurisdictional analysis (US Delaware, UK UCTA, ICC UNIDROIT)
- Training signals with confidence scores
- False positive guards
- Embedding guidance for vector clustering
- Regulatory horizon notes including EU AI Act implications

## Clause types

Indemnification, Limitation of Liability, IP Assignment, Data Processing, Termination, Auto-Renewal

## Intended use

Fine-tuning language models for contract intelligence, clause-level risk classification, legal reasoning benchmarks, NLP research on contractual asymmetry and financial exposure modeling.

## Dataset structure

Format: JSON

Each record contains:

- `clause_id`
- `classification`
- `risk_assessment`
- `dangerous_version`
- `balanced_version`
- `training_signals`
- `cross_jurisdictional_notes`
- `regulatory_and_compliance`
- `metadata`

## Full dataset access

The full dataset covering all clause types, jurisdictions, and risk levels is available for research, licensing, and collaboration upon request.

Contact: mr-dintov@protonmail.com

## Important note

This dataset is fully synthetic. It does not contain or derive from any real contracts, client documents, or proprietary legal materials.
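
## Example: checking record fields

The top-level fields listed under "Dataset structure" can be sanity-checked with a short script. The following is a minimal sketch, not part of the dataset's own tooling. It assumes the sample is distributed as a JSON file holding either one record or a list of records. The card does not describe the nested layout of each field, so only top-level keys are checked. The helper names `missing_fields` and `load_records`, and the placeholder record, are hypothetical.

```python
import json

# Top-level field names exactly as listed in the dataset card.
EXPECTED_FIELDS = {
    "clause_id",
    "classification",
    "risk_assessment",
    "dangerous_version",
    "balanced_version",
    "training_signals",
    "cross_jurisdictional_notes",
    "regulatory_and_compliance",
    "metadata",
}


def missing_fields(record: dict) -> set:
    """Return the expected top-level fields that are absent from a record."""
    return EXPECTED_FIELDS - set(record)


def load_records(path: str) -> list:
    """Load records from a JSON file.

    Assumption: the file holds either a single object or a list of objects.
    """
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    return data if isinstance(data, list) else [data]


# Hypothetical placeholder record, for illustration only (not real data).
example = {name: None for name in EXPECTED_FIELDS}
example["clause_id"] = "EXAMPLE-001"
print(missing_fields(example))  # set()

del example["metadata"]
print(missing_fields(example))  # {'metadata'}
```

Running `missing_fields` over every record returned by `load_records` gives a quick way to spot incomplete entries before fine-tuning or evaluation.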
Provider:
Mr-Dintov