Mr-Dintov/Legal-Contract-Clause-Risk-Corpus
Source: Hugging Face. Updated 2026-04-07; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/Mr-Dintov/Legal-Contract-Clause-Risk-Corpus
Resource description:
---
license: cc-by-4.0
task_categories:
- text-classification
- token-classification
- question-answering
language:
- en
tags:
- legal
pretty_name: Legal Contract Clause Risk Corpus (Sample)
size_categories:
- n<1K
---
# Legal Contract Clause Risk Corpus — Sample
This repository contains a representative sample of the Legal Contract Clause Risk Corpus. The full dataset is available upon request.
The corpus is a synthetic legal dataset designed for clause-level contract risk
classification, model fine-tuning, and legal NLP research. It is built around
three jurisdictions: US (Delaware), UK (English law), and ICC international.
## What this sample covers
Each entry is built around a single contract clause type and contains two
structured versions: a dangerous version and a balanced market-standard version.
Annotation operates at the phrase level, not the clause level.
Each entry includes:
- Clause text in dangerous and safe variants
- Phrase-level danger annotations with severity classification
- Risk score and asymmetry score
- Financial exposure estimates across contract size tiers
- Dispute probability
- Cross-jurisdictional analysis (US Delaware, UK UCTA, ICC UNIDROIT)
- Training signals with confidence scores
- False positive guards
- Embedding guidance for vector clustering
- Regulatory horizon notes including EU AI Act implications
## Clause types
Indemnification, Limitation of Liability, IP Assignment, Data Processing,
Termination, Auto-Renewal
## Intended use
Fine-tuning language models for contract intelligence, clause-level risk
classification, legal reasoning benchmarks, NLP research on contractual
asymmetry and financial exposure modeling.
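As a rough illustration of the classification use case, the sketch below turns each record into two labeled examples, one from `dangerous_version` and one from `balanced_version` (both field names come from this card). The card does not document the nested layout of those fields, so the code assumes each is either a plain string or an object holding the text under a hypothetical `clause_text` key. The integer labels 1 and 0 are likewise an illustrative choice, not part of the dataset.

```python
from typing import Any, Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical nested key: the card does not document where the clause text
# lives inside each version object, so adjust this to the real schema.
TEXT_KEY = "clause_text"


def clause_text(version: Any) -> Optional[str]:
    """Extract clause text from a version field, or None if it is not found."""
    if isinstance(version, str):
        return version
    if isinstance(version, dict):
        value = version.get(TEXT_KEY)
        return value if isinstance(value, str) else None
    return None


def labeled_examples(
    records: Iterable[Dict[str, Any]],
) -> Iterator[Tuple[str, int]]:
    """Yield (text, label) pairs: 1 for the dangerous variant, 0 for balanced."""
    for record in records:
        for field, label in (("dangerous_version", 1), ("balanced_version", 0)):
            text = clause_text(record.get(field))
            if text:
                yield text, label
```

Records whose version fields are missing or laid out differently are skipped rather than raising, which keeps the sketch usable while the real schema is being inspected.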
## Dataset structure
Format: JSON
Each record contains: clause_id, classification, risk_assessment,
dangerous_version, balanced_version, training_signals,
cross_jurisdictional_notes, regulatory_and_compliance, metadata
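A minimal sketch of how the records might be parsed and checked against the field list above. The card only says "Format: JSON", so the code assumes the payload is either a single JSON object or a JSON array of objects. The helper names `parse_records` and `missing_fields` are illustrative and not part of the dataset.

```python
import json
from typing import Any, Dict, List

# Top-level field names exactly as listed in the dataset card.
EXPECTED_FIELDS = (
    "clause_id",
    "classification",
    "risk_assessment",
    "dangerous_version",
    "balanced_version",
    "training_signals",
    "cross_jurisdictional_notes",
    "regulatory_and_compliance",
    "metadata",
)


def parse_records(raw: str) -> List[Dict[str, Any]]:
    """Parse JSON text into a list of records.

    Assumption: the payload is a single object or an array of objects.
    """
    data = json.loads(raw)
    if isinstance(data, dict):
        return [data]
    if isinstance(data, list):
        return data
    raise ValueError("expected a JSON object or an array of objects")


def missing_fields(record: Dict[str, Any]) -> List[str]:
    """Return the expected top-level fields that are absent from a record."""
    return [name for name in EXPECTED_FIELDS if name not in record]
```

To use it, read the file's text and pass it to `parse_records`, then call `missing_fields` on each record to flag entries that deviate from the documented structure before training on them.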
## Full dataset access
The full dataset covering all clause types, jurisdictions, and risk levels
is available for research, licensing, and collaboration upon request.
Contact: mr-dintov@protonmail.com
## Important note
This dataset is fully synthetic. It does not contain or derive from any real
contracts, client documents, or proprietary legal materials.
Provider:
Mr-Dintov



