---
language: pl
multilinguality: monolingual
size_categories: 1M<n<10M
source_datasets:
- JuDDGES/pl-nsa
pretty_name: Polish NSA Judgments (Enriched)
tags:
- legal
- polish
- enriched
- gemini
- factual-state
- legal-state
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
dataset_info:
  features:
  - name: country
    dtype: string
  - name: court_type
    dtype: string
  - name: source
    dtype: string
  - name: judgment_id
    dtype: string
  - name: docket_number
    dtype: string
  - name: judgment_type
    dtype: string
  - name: finality
    dtype: string
  - name: judgment_date
    dtype: timestamp[ms, tz=Europe/Warsaw]
  - name: submission_date
    dtype: timestamp[ms, tz=Europe/Warsaw]
  - name: court_name
    dtype: string
  - name: judges
    sequence: string
  - name: presiding_judge
    dtype: string
  - name: judge_rapporteur
    dtype: string
  - name: case_type_description
    sequence: string
  - name: keywords
    sequence: string
  - name: related_docket_numbers
    list:
    - name: judgment_id
      dtype: string
    - name: docket_number
      dtype: string
    - name: judgment_date
      dtype: timestamp[ms, tz=Europe/Warsaw]
    - name: judgment_type
      dtype: string
  - name: challenged_authority
    dtype: string
  - name: decision
    sequence: string
  - name: extracted_legal_bases
    list:
    - name: link
      dtype: string
    - name: article
      dtype: string
    - name: journal
      dtype: string
    - name: law
      dtype: string
  - name: official_collection
    sequence: string
  - name: glosa_information
    sequence: string
  - name: thesis
    dtype: large_string
  - name: sentence
    dtype: large_string
  - name: reasons_for_judgment
    dtype: large_string
  - name: dissenting_opinion
    dtype: large_string
  - name: full_text
    dtype: large_string
  - name: extracted_summary
    dtype: string
  - name: extracted_thesis
    dtype: string
  - name: extracted_factual_state
    dtype: string
  - name: extracted_legal_state
    dtype: string
  - name: extracted_keywords
    sequence: string
  - name: extracted_outcome
    dtype: string
  - name: extracted_legal_references
    dtype: string
  - name: extracted_legal_concepts
    dtype: string
  - name: extracted_parties
    dtype: string
  - name: extracted_legal_analysis
    dtype: string
  - name: extracted_judgment_specific
    dtype: string
  - name: extracted_tax_specific
    dtype: string
  splits:
  - name: train
    num_bytes: 68704911717
    num_examples: 2254392
  download_size: 32369902394
  dataset_size: 68704911717
---
# Polish NSA Judgments (Enriched)
Judgments of the Polish Supreme Administrative Court (Naczelny Sąd Administracyjny, NSA), enriched with Gemini-extracted fields, including `extracted_factual_state` and `extracted_legal_state`.
## Dataset Description
This dataset is an enriched version of [JuDDGES/pl-nsa](https://huggingface.co/datasets/JuDDGES/pl-nsa) with additional fields extracted using Google Gemini 2.5 Pro.
### New Fields
#### Core Extracted Fields
| Field | Type | Description |
|-------|------|-------------|
| `extracted_factual_state` | string | Objective narrative of facts (stan faktyczny) - the factual circumstances forming the basis for the case |
| `extracted_legal_state` | string | Legal framework and provisions (stan prawny) - applicable laws and legal provisions used in reasoning |
| `extracted_title` | string | Extracted document title |
| `extracted_date_issued` | string | Extracted issue date (YYYY-MM-DD format) |
| `extracted_summary` | string | Brief summary of the document |
| `extracted_thesis` | string | Legal thesis or principle established by the document |
| `extracted_keywords` | list of strings | List of keywords extracted from the document |
#### Structured Legal Data
| Field | Type | Description |
|-------|------|-------------|
| `extracted_outcome` | JSON string | Decision outcome with decision_type and decision_summary |
| `extracted_legal_references` | JSON string | List of cited laws, regulations, and legal acts |
| `extracted_legal_concepts` | JSON string | Legal concepts mentioned with definitions and context |
| `extracted_parties` | JSON string | Parties involved in the case with roles and representation |
| `extracted_legal_analysis` | JSON string | Detailed legal reasoning analysis |
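
As a minimal sketch of working with one of these JSON-string fields, the snippet below decodes `extracted_outcome` into a small typed object. The keys `decision_type` and `decision_summary` come from the table above; the `Outcome` class, the `parse_outcome` helper, and the sample payload are hypothetical and only illustrate the shape, so real records may carry additional keys or different values.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    """Typed view of the `extracted_outcome` JSON string (hypothetical helper)."""
    decision_type: Optional[str]
    decision_summary: Optional[str]


def parse_outcome(raw: Optional[str]) -> Optional[Outcome]:
    """Decode the JSON string; return None for empty or non-object payloads."""
    if not raw:
        return None
    data = json.loads(raw)
    if not isinstance(data, dict):
        return None
    return Outcome(data.get("decision_type"), data.get("decision_summary"))


# Hypothetical payload, for illustration only (not taken from the dataset).
sample = '{"decision_type": "dismissed", "decision_summary": "Cassation appeal dismissed."}'
outcome = parse_outcome(sample)
print(outcome.decision_type, "-", outcome.decision_summary)
```

The other JSON-string fields in this table can be decoded the same way, although their internal structure is not documented here and should be inspected before relying on specific keys.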
#### Document-Type Specific Fields
| Field | Type | Description |
|-------|------|-------------|
| `extracted_judgment_specific` | JSON string | Fields specific to court judgments |
| `extracted_tax_specific` | JSON string | Fields specific to tax interpretations |
### Data Processing
- **Extraction Model**: Google Gemini 2.5 Pro
- **Extraction Method**: Structured output extraction with a Polish legal schema
- **Join Strategy**: Primary join on `document_id`, fallback on `document_number`
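
The enrichment pipeline itself is not shown here, so the following is only a sketch of what a primary join with a fallback key could look like. It assumes both sides are lists of plain dictionaries and uses the key names `document_id` and `document_number` from the list above; the function names are hypothetical, and how these keys map to `judgment_id` and `docket_number` in the published schema is not stated.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def build_index(rows: Iterable[Row], key: str) -> Dict[Any, Row]:
    """Map each non-empty key value to the first row that carries it."""
    index: Dict[Any, Row] = {}
    for row in rows:
        value = row.get(key)
        if value:
            index.setdefault(value, row)
    return index


def join_with_fallback(base_rows: List[Row], extracted_rows: List[Row]) -> List[Row]:
    """Left-join extracted fields onto base rows.

    Match on `document_id` first; if that finds nothing, fall back to
    `document_number`. Base rows without any match are kept unchanged.
    """
    by_id = build_index(extracted_rows, "document_id")
    by_number = build_index(extracted_rows, "document_number")

    merged: List[Row] = []
    for row in base_rows:
        match = by_id.get(row.get("document_id")) or by_number.get(row.get("document_number"))
        combined = dict(row)
        for name, value in (match or {}).items():
            # Values already present in the base row take precedence.
            combined.setdefault(name, value)
        merged.append(combined)
    return merged
```

Keeping the base row's values on conflict is a design choice of this sketch: it prevents a fallback match from overwriting the original identifiers.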
## Usage
```python
import json

from datasets import load_dataset

# Note: the full download is roughly 32 GB
dataset = load_dataset("JuDDGES/pl-nsa-enriched")
record = dataset["train"][0]

# Access text fields directly
print(record["extracted_factual_state"])

# Parse JSON fields
legal_refs = json.loads(record["extracted_legal_references"])
```
```
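
Because the download is large, it can help to stream records instead of fetching everything up front. The sketch below uses the `streaming=True` option of `load_dataset` and decodes the JSON-string fields defensively. The field names follow the `dataset_info` features in the card metadata; the helper names are hypothetical, and treating missing, empty, or malformed values as `None` is an assumption of this sketch, since the card does not say how such cases are stored.

```python
import json
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, Sequence

# String-typed fields that the card describes as JSON strings.
JSON_FIELDS = (
    "extracted_outcome",
    "extracted_legal_references",
    "extracted_legal_concepts",
    "extracted_parties",
    "extracted_legal_analysis",
    "extracted_judgment_specific",
    "extracted_tax_specific",
)


def decode_json_fields(record: Dict[str, Any], fields: Sequence[str] = JSON_FIELDS) -> Dict[str, Any]:
    """Return a copy of the record with JSON-string fields decoded.

    Missing, empty, or malformed values become None (an assumption of this sketch).
    """
    decoded = dict(record)
    for name in fields:
        raw = record.get(name)
        if not isinstance(raw, str) or not raw.strip():
            decoded[name] = None
            continue
        try:
            decoded[name] = json.loads(raw)
        except json.JSONDecodeError:
            decoded[name] = None
    return decoded


def first_decoded(records: Iterable[Dict[str, Any]], limit: int) -> Iterator[Dict[str, Any]]:
    """Lazily decode at most `limit` records from any iterable of records."""
    return (decode_json_fields(record) for record in islice(records, limit))


def stream_train_split():
    """Open the train split in streaming mode (requires network access)."""
    from datasets import load_dataset  # imported lazily so the helpers above work without it

    return load_dataset("JuDDGES/pl-nsa-enriched", split="train", streaming=True)
```

A typical call would be `for record in first_decoded(stream_train_split(), 5): ...`, which reads only the first few records rather than the whole dataset.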
## Citation
If you use this dataset, please cite the original dataset and the JuDDGES project.
## License
Same as the original dataset: [JuDDGES/pl-nsa](https://huggingface.co/datasets/JuDDGES/pl-nsa)
---
*Generated on 2026-01-14 by JuDDGES enrichment pipeline*