
IsmatS/azerbaijan-court-data

Hugging Face · Updated 2026-04-10 · Indexed 2026-04-12
Download link:
https://hf-mirror.com/datasets/IsmatS/azerbaijan-court-data
Description:
---
language:
- az
- en
license: cc-by-4.0
task_categories:
- text-classification
- text-generation
- question-answering
- token-classification
- feature-extraction
- summarization
tags:
- legal
- law
- court
- azerbaijan
- azerbaijani
- nlp
- court-decisions
- judicial
- case-law
- lawyers
- graph-rag
- rag
- knowledge-graph
- pdf
- tabular
- text
- ocr
- document-ai
- fine-tuning
- embeddings
pretty_name: "Azerbaijan Court System Dataset"
size_categories:
- 1M<n<10M
---

# Azerbaijan Court System Dataset

**The most comprehensive open dataset of Azerbaijan's judicial system**: 1.64 million structured records and 1.54 million court decision PDFs (~160 GB) covering court decisions, active cases, scheduled hearings, court registries, judges, lawyers, and mediator organizations.

Built for AI engineers, legal tech startups, and researchers who need real-world legal data at scale.

---

## Quick Start

### Load with Hugging Face `datasets`

```python
from datasets import load_dataset

# Load any CSV by specifying the data file
ds = load_dataset("IsmatS/azerbaijan-court-data", data_files="data/court_acts.csv")
print(ds["train"][0])
```

### Load with pandas

```python
import pandas as pd

# All CSVs use UTF-8 with BOM encoding.
# Reading hf:// paths requires the huggingface_hub package to be installed.
acts = pd.read_csv("hf://datasets/IsmatS/azerbaijan-court-data/data/court_acts.csv", encoding="utf-8-sig")
courts = pd.read_csv("hf://datasets/IsmatS/azerbaijan-court-data/data/courts.csv", encoding="utf-8-sig")

print(f"Court acts: {len(acts):,} rows")
print(f"Courts: {len(courts):,} rows")
```

### Download a specific PDF

```python
import tarfile

from huggingface_hub import hf_hub_download

decision_id = 12345678
shard = str(decision_id % 1000).zfill(3)  # "678"

# Download the shard tar file (about 160 MB per shard)
tar_path = hf_hub_download(
    repo_id="IsmatS/azerbaijan-court-data",
    filename=f"pdfs/{shard}.tar",
    repo_type="dataset"
)

# Extract the specific PDF
with tarfile.open(tar_path, "r") as tar:
    pdf_bytes = tar.extractfile(f"{decision_id}.pdf").read()

print(f"PDF size: {len(pdf_bytes):,} bytes")
```

---

## Purpose

This dataset is released to **democratize access to Azerbaijan's legal data** for:

- **Training and fine-tuning LLMs** on Azerbaijani legal text: court decisions, case outcomes, and legal terminology in both structured CSV and raw PDF format
- **Building legal AI startups**: automated legal research, case outcome prediction, lawyer-case matching, document analysis, OCR pipelines
- **Enabling RAG and Graph RAG applications**: the interconnected courts, judges, cases, and decisions suit retrieval-augmented generation and knowledge graph construction
- **Academic research**: judicial analytics, legal system efficiency studies, comparative law research
- **Legal tech innovation**: automating routine legal work, building intelligent case management systems, creating legal chatbots for Azerbaijani law
- **Document AI**: 1.54M court decision PDFs for training document understanding, legal OCR, and PDF extraction models

---

## Dataset Contents

### Structured Data (CSVs)

| File | Records | Size | Description |
|------|---------|------|-------------|
| `data/court_acts.csv` | 1,541,289 | ~250 MB | Court decisions with outcomes, case types, judges, dates (2016–2026) |
| `data/court_cases.csv` | 67,877 | ~15 MB | Active/pending court cases (live docket snapshot) |
| `data/court_meetings.csv` | 29,921 | ~6 MB | Scheduled court hearings (Apr–Sep 2026) |
| `data/courts.csv` | 116 | ~20 KB | Court registry with types, regions, and hierarchy |
| `data/judges.csv` | 709 | ~160 KB | Judge registry with court assignments, bios, and demographics |
| `data/lawyers.csv` | 2,232 | ~350 KB | Licensed lawyers with practice areas and experience |
| `data/organizations.csv` | 70 | ~15 KB | Mediator organizations by region |

**Total: 1,642,214 structured records across 7 datasets (~494 MB CSV)**

### Court Decision PDFs

| Directory | Files | Total Size | Description |
|-----------|-------|------------|-------------|
| `pdfs/` | 1,541,218 | ~160 GB | Full-text court decision documents as PDF files |

PDFs are stored as **tar archives by shard** (`000.tar` through `999.tar`). Each tar contains ~1,500 PDFs named by `decisionId` (e.g., `12345678.pdf`) and is approximately 160 MB.

### Analysis Charts

The `charts/` directory holds 30 business analysis charts (PNG, 150+ DPI) covering volume, trends, outcomes, regional analysis, and cross-dataset relationships.

---

## Entity Relationships & Schema

The 7 datasets are interconnected. Understanding these relationships is critical for building AI applications over this data.

```
┌──────────────┐     ┌──────────────┐     ┌───────────────┐
│    courts    │     │  court_acts  │     │     PDFs      │
│ (116 courts) │────▶│ (1.54M acts) │────▶│ (1.54M files) │
│              │     │              │     │               │
│ id           │     │ decisionId ──┼────▶│ {id}.pdf      │
│ title ───────┼──┐  │ caseId       │     │ in shard      │
│ type_title   │  │  │ caseNo       │     │ {id%1000}.tar │
│ region_title │  │  │ caseType     │     └───────────────┘
│ parent_court │  │  │ decisionType │
└──────────────┘  │  │ decisionDate │
       ▲          │  │ court ───────┼── matches courts.title (after normalization)
       │          │  │ judge ───────┼── matches judges.full_name (strip oğlu/qızı)
       │          │  │ caseResult   │
       │          │  └──────────────┘
       │          │
┌──────┴───────┐  │  ┌──────────────┐     ┌────────────────┐
│    judges    │  └─▶│ court_cases  │     │ court_meetings │
│ (709 judges) │     │ (67K cases)  │     │ (30K meetings) │
│              │     │              │     │                │
│ id           │     │ id ──────────┼──┐  │ meetingId      │
│ full_name    │     │ caseNo       │  └─▶│ caseId         │
│ work ────────┼──── │ caseType     │     │ caseType       │
│ birthday     │     │ caseStatus   │     │ meetingType    │
│ description  │     │ court ───────┼──┐  │ meetingDate    │
│ organization │     │ judge        │  │  │ court          │
│ experiences  │     │ enterDate    │  │  │ judge          │
│ educations   │     └──────────────┘  │  │ meetingStatus  │
└──────────────┘                       │  └────────────────┘
                                       │
┌──────────────┐     ┌──────────────┐  │
│   lawyers    │     │ organizations│  │
│   (2,232)    │     │  (70 orgs)   │  │
│              │     │              │  │
│ id           │     │ id           │  │
│ full_name    │     │ company      │  │
│ areas        │     │ region_title─┼──┘  (same regions as courts)
│ languages    │     │ mediator_cnt │
│ duration     │     └──────────────┘
│ institution  │
└──────────────┘
```

### Join Keys

| From | To | Join Strategy |
|------|----|---------------|
| `court_acts.decisionId` | PDF file | `{decisionId}.pdf` inside `pdfs/{decisionId % 1000}.tar` (shard zero-padded to 3 digits) |
| `court_acts.court` | `courts.title` | Normalize both: strip diacritics (ə→e, ı→i, ö→o, ü→u, ş→s, ç→c, ğ→g), lowercase, collapse whitespace |
| `court_acts.judge` | `judges.full_name` | Normalize + strip patronymic suffix (oğlu/qızı): `"Abasov Qürur Bəybala oğlu"` → `"abasov qurur beybala"` matches `"Abasov Qürur Bəybala"` |
| `judges.work` | `courts.title` | Same court name normalization as above |
| `court_cases.court` | `courts.title` | Same normalization as above |
| `court_meetings.court` | `courts.title` | Same normalization as above |
| `court_cases.caseNo` | `court_acts.caseNo` | Direct string match; links active cases to their historical decisions |
| `court_meetings.caseId` | `court_cases.id` | Direct integer match; links scheduled hearings to cases |
| `courts.region_title` | `organizations.region_title` | Direct string match; links courts to mediator organizations in the same region |

### Court Name Normalization

Court names differ across datasets (ASCII transliteration vs full Unicode Azerbaijani). **You must normalize before joining:**

```python
import re
import unicodedata

# ə and ı have no Unicode decomposition, so they are mapped explicitly
AZ_MAP = str.maketrans("əıöüşçğƏIÖÜŞÇĞ", "eiouscgEIOUSCG")

def normalize_court_name(name: str) -> str:
    if not isinstance(name, str):
        return ""  # covers NaN / missing values from pandas
    name = name.translate(AZ_MAP)
    # Decompose any remaining accented characters, then drop non-ASCII marks
    name = unicodedata.normalize("NFKD", name)
    name = name.encode("ascii", "ignore").decode()
    name = re.sub(r"\s+", " ", name).strip().lower()
    return name

# Example:
# "Bakı Şəhəri Binəqədi Rayon Məhkəməsi" → "baki seheri bineqedi rayon mehkemesi"
# "Baki Bineqedi Rayon Mehkemesi" → "baki bineqedi rayon mehkemesi"
```

---

## Linking PDFs to CSV Records

Each row in `court_acts.csv` has a `decisionId` that maps directly to a PDF file.
The PDFs are sharded into 1,000 tar archives using the formula:

```
shard = decisionId % 1000  →  zero-padded to 3 digits (e.g., 007, 042, 999)
tar file = pdfs/{shard}.tar
PDF filename inside tar = {decisionId}.pdf
```

### Full Example: Load a court act row and its PDF

```python
import pandas as pd
import tarfile
from huggingface_hub import hf_hub_download

# 1. Load structured data
acts = pd.read_csv(
    "hf://datasets/IsmatS/azerbaijan-court-data/data/court_acts.csv",
    encoding="utf-8-sig"
)

# 2. Pick a decision
row = acts.iloc[0]
decision_id = int(row["decisionId"])
print(f"Decision: {decision_id}")
print(f"Case: {row['caseNo']} | Type: {row['caseType']}")
print(f"Court: {row['court']} | Judge: {row['judge']}")
print(f"Result: {row['caseResult']}")

# 3. Compute shard and download the tar
shard = str(decision_id % 1000).zfill(3)
tar_path = hf_hub_download(
    repo_id="IsmatS/azerbaijan-court-data",
    filename=f"pdfs/{shard}.tar",
    repo_type="dataset"
)

# 4. Extract the PDF
with tarfile.open(tar_path, "r") as tar:
    pdf_bytes = tar.extractfile(f"{decision_id}.pdf").read()

print(f"PDF: {len(pdf_bytes):,} bytes")
```

### Batch PDF Processing

```python
import tarfile
from pathlib import Path
from huggingface_hub import hf_hub_download

def iter_pdfs_from_shard(repo_id: str, shard: int):
    """Yield (decision_id, pdf_bytes) for all PDFs in a shard."""
    shard_str = str(shard).zfill(3)
    tar_path = hf_hub_download(
        repo_id=repo_id,
        filename=f"pdfs/{shard_str}.tar",
        repo_type="dataset"
    )
    with tarfile.open(tar_path, "r") as tar:
        for member in tar.getmembers():
            if member.name.endswith(".pdf"):
                decision_id = int(Path(member.name).stem)
                pdf_bytes = tar.extractfile(member).read()
                yield decision_id, pdf_bytes

# Process all PDFs in shard 42
for did, pdf_data in iter_pdfs_from_shard("IsmatS/azerbaijan-court-data", 42):
    print(f"  Decision {did}: {len(pdf_data):,} bytes")
```

---

## Key Fields Reference

### court_acts.csv (1,541,289 rows, the core dataset)

| Column | Type | Description | Example |
|--------|------|-------------|---------|
| `decisionId` | int | Unique decision ID; **links to PDF** | `5432109` |
| `caseId` | int | Case ID | `1234567` |
| `caseNo` | str | Human-readable case number | `2(2)-1234/2024` |
| `caseType` | str | Case category | `Mülki işlər` (Civil) |
| `decisionType` | str | Decision category | `Qətnamə` (Judgment) |
| `decisionDate` | str | Date of decision (ISO format) | `2024-03-15` |
| `court` | str | Court name (Azerbaijani) | `Bakı Şəhəri Xətai Rayon Məhkəməsi` |
| `judge` | str | Judge name | `Mehdiyev Nəriman Hüseynqulu` |
| `caseResult` | str | Outcome text (Azerbaijani) | `İddia təmin edildi` (Claim granted) |
| `categoryName` | str | Subcategory (47.7% populated) | |
| `caseCodes` | str | Case codes (99.9% empty; ignore) | |

### court_cases.csv (67,877 rows, live snapshot of open docket)

| Column | Type | Description |
|--------|------|-------------|
| `id` | int | Case ID |
| `caseNo` | str | Case number (matches `court_acts.caseNo`) |
| `caseType` | str | Case category |
| `caseStatus` | str | One of: `İcraatda` (In Proceedings, 84.9%), `Dayandırılıb` (Suspended, 13.4%), `Hakim təyin edilib` (Judge Assigned, 1.7%) |
| `court` | str | Court name |
| `judge` | str | Judge name |
| `enterDate` | str | Filing date |

### court_meetings.csv (29,921 rows, future schedule Apr–Sep 2026)

| Column | Type | Description |
|--------|------|-------------|
| `meetingId` | int | Meeting ID |
| `caseId` | int | Linked case ID (matches `court_cases.id`) |
| `caseType` | str | Case category |
| `meetingType` | str | Hearing type (preparatory, oral, review, etc.) |
| `meetingDate` | str | Scheduled datetime |
| `court` | str | Court name |
| `judge` | str | Judge name |
| `meetingStatus` | str | `Təyin edilib` (Scheduled, 98.4%), `Keçirilməyib` (Not Held, 0.9%), `Ləğv edilib` (Cancelled, 0.7%) |

### courts.csv (116 rows, court registry)

| Column | Type | Description |
|--------|------|-------------|
| `id` | int | Court ID |
| `title` | str | Court name (canonical Azerbaijani) |
| `type_title` | str | One of 7 types: Rayon (85), Heavy Crimes (6), Appeal (6), Military (6), Administrative (6), Commercial (6), Supreme (1) |
| `region_title` | str | Geographic region |
| `parent_court_title` | str | Appellate parent court |

### judges.csv (709 rows, judge registry)

| Column | Type | Description |
|--------|------|-------------|
| `id` | int | Judge ID |
| `full_name` | str | Full name with patronymic (e.g., `Abasov Qürur Bəybala oğlu`) |
| `work` | str | Assigned court name; **key field for linking to courts** |
| `description` | str | Role description (48.5% populated) |
| `organization` | str | Organization affiliation (34.7% populated) |
| `experiences` | str | Career experience, pipe-separated (43.9% populated) |
| `educations` | str | Education history (14.5% populated) |
| `birthday` | str | Date of birth (49.8% populated) |
| `photo` | str | Photo URL |
| `cover` | str | Cover text / title |

**Note:** Judge names in `court_acts.csv` omit the patronymic suffix (`oğlu`/`qızı`). Strip this suffix from `judges.full_name` when matching: `"Abasov Qürur Bəybala oğlu"` → `"Abasov Qürur Bəybala"`. After normalization, 636 of 709 registered judges (90%) match to court acts.

### lawyers.csv (2,232 rows)

| Column | Type | Description |
|--------|------|-------------|
| `id` | int | Lawyer ID |
| `full_name` | str | Full name |
| `areas` | str | Practice areas, semicolon-separated (43.1% populated) |
| `languages` | str | Languages spoken |
| `duration` | str | Experience as string, e.g., `16 il` (16 years). Extract the number with regex `(\d+)` |
| `institution_title` | str | Bar association |

### organizations.csv (70 rows, mediator organizations)

| Column | Type | Description |
|--------|------|-------------|
| `id` | int | Organization ID |
| `company` | str | Organization name |
| `region_title` | str | Region |
| `mediator_count` | int | Number of mediators |
| `voen` | str | Tax ID |

---

## Use Cases for AI Engineers

### 1. Retrieval-Augmented Generation (RAG) over Court Decisions

Build a legal question-answering system that retrieves relevant court decisions and generates answers grounded in actual case law.

```python
# Step 1: Extract text from PDFs
import fitz  # PyMuPDF

def extract_text(pdf_bytes: bytes) -> str:
    doc = fitz.open(stream=pdf_bytes, filetype="pdf")
    return "\n".join(page.get_text() for page in doc)

# Step 2: Chunk the text
from langchain_text_splitters import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

# Step 3: Build an index with metadata from the CSV
import pandas as pd
acts = pd.read_csv("data/court_acts.csv", encoding="utf-8-sig")
# Index by decisionId so each lookup avoids a full-table scan
acts_by_id = acts.drop_duplicates("decisionId").set_index("decisionId")

documents = []
# iter_pdfs_from_shard is the helper defined under "Batch PDF Processing"
for decision_id, pdf_bytes in iter_pdfs_from_shard("IsmatS/azerbaijan-court-data", 0):
    if decision_id not in acts_by_id.index:
        continue  # skip PDFs with no matching CSV row
    row = acts_by_id.loc[decision_id]
    text = extract_text(pdf_bytes)
    chunks = splitter.split_text(text)
    for chunk in chunks:
        documents.append({
            "text": chunk,
            "metadata": {
                "decisionId": int(decision_id),
                "caseNo": row["caseNo"],
                "caseType": row["caseType"],
                "court": row["court"],
                "judge": row["judge"],
                "decisionDate": row["decisionDate"],
                "caseResult": row["caseResult"],
            }
        })

# Step 4: Embed and store in a vector database
# Use any embedding model, e.g., sentence-transformers, OpenAI, Cohere
# Store in ChromaDB, Pinecone, Weaviate, Qdrant, etc.
```

### 2. Knowledge Graph Construction (Graph RAG)

The dataset has natural graph structure with rich interconnections:

```
Courts ──has_judge──▶ Judges ──decided──▶ Decisions ──has_pdf──▶ PDFs
  │                     │                    │
  │                     │                    ├── caseType
  ├── type (Rayon,      ├── assigned_to      ├── decisionType
  │   Appeal, etc.)     │   (court_cases)    ├── caseResult
  │                     │                    └── decisionDate
  ├── region            └── scheduled_for
  │                         (court_meetings)
  └── parent_court
      (appellate hierarchy)
```

**Key graph facts:**

- 709 registered judges + 1,040 unique judges in court decisions (636 overlap)
- 192 judges (18.5%) serve multiple courts; these are key bridge nodes
- 7 court types form a hierarchical structure (Rayon → Appeal → Supreme)
- Each decision links to exactly one court, one judge, one case type, and (for all but 71 decisions) one PDF

```python
# Example: Build a NetworkX graph from court_acts
import pandas as pd
import networkx as nx

acts = pd.read_csv("data/court_acts.csv", encoding="utf-8-sig",
                   usecols=["decisionId", "court", "judge", "caseType", "caseResult"])
# Drop rows with missing keys so NaN never becomes a graph node
acts = acts.dropna(subset=["court", "judge", "caseType"])

G = nx.Graph()
for _, row in acts.iterrows():
    G.add_edge(row["court"], row["judge"], relation="has_judge")
    G.add_edge(row["judge"], row["decisionId"], relation="decided")
    G.add_edge(row["decisionId"], row["caseType"], relation="case_type")

print(f"Nodes: {G.number_of_nodes():,} | Edges: {G.number_of_edges():,}")
```

### 3. Case Outcome Prediction (Classification)

With 1.54M labeled decisions, train models to predict outcomes:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

acts = pd.read_csv("data/court_acts.csv", encoding="utf-8-sig")

# Filter to rows with known results
labeled = acts[acts["caseResult"].notna()].copy()

# Encode features
le_type = LabelEncoder()
le_court = LabelEncoder()
le_result = LabelEncoder()

labeled["caseType_enc"] = le_type.fit_transform(labeled["caseType"].fillna(""))
labeled["court_enc"] = le_court.fit_transform(labeled["court"].fillna(""))
labeled["result_enc"] = le_result.fit_transform(labeled["caseResult"])

X = labeled[["caseType_enc", "court_enc"]]
y = labeled["result_enc"]

# Fixed seed so the split is reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(f"Train: {len(X_train):,} | Test: {len(X_test):,}")
print(f"Classes: {len(le_result.classes_):,} unique outcomes")
```

### 4. Legal AI Assistant / Fine-Tuning

Create training data for fine-tuning an LLM on Azerbaijani legal text:

```python
import pandas as pd

acts = pd.read_csv("data/court_acts.csv", encoding="utf-8-sig")

# Create instruction-following pairs from structured data
training_examples = []
for _, row in acts.dropna(subset=["caseResult"]).iterrows():
    training_examples.append({
        "instruction": f"What was the outcome of {row['caseType']} case {row['caseNo']} "
                       f"at {row['court']}?",
        "output": f"The case was decided by Judge {row['judge']} on {row['decisionDate']}. "
                  f"Decision type: {row['decisionType']}. "
                  f"Result: {row['caseResult']}."
    })

# For richer training data, combine with extracted PDF text
# to create longer-form question-answer pairs
```

### 5. Document AI & Legal OCR

Use 1.54M court decision PDFs to train or evaluate document understanding models:

```python
# Extract structured information from PDFs
# Compare against CSV ground truth for evaluation
import fitz
import pandas as pd

acts = pd.read_csv("data/court_acts.csv", encoding="utf-8-sig")

# For each PDF, you have ground truth labels:
# - caseNo (should appear in document header)
# - court name (should appear in letterhead)
# - judge name (should appear in signature)
# - decisionDate (should appear in document)
# - caseResult (should appear in verdict section)

# This makes the dataset well suited to training information extraction models
# or evaluating OCR accuracy on legal documents
```

### 6. Court Analytics Dashboard

```python
import pandas as pd

acts = pd.read_csv("data/court_acts.csv", encoding="utf-8-sig")
courts = pd.read_csv("data/courts.csv", encoding="utf-8-sig")

# Decisions per court per year
acts["year"] = pd.to_datetime(acts["decisionDate"], errors="coerce").dt.year
volume = acts.groupby(["court", "year"]).size().reset_index(name="decisions")

# Judge workload
judge_load = acts.groupby("judge").agg(
    decisions=("decisionId", "count"),
    courts=("court", "nunique"),
    case_types=("caseType", "nunique")
).sort_values("decisions", ascending=False)

print(judge_load.head(10))
```

### 7. Lawyer Matching Platform

```python
import pandas as pd

lawyers = pd.read_csv("data/lawyers.csv", encoding="utf-8-sig")

# Parse practice areas (semicolon-separated)
lawyers["area_list"] = lawyers["areas"].fillna("").str.split(";")
# expand=False returns a Series, which is what a single-column assignment expects
lawyers["experience_years"] = lawyers["duration"].str.extract(r"(\d+)", expand=False).astype(float)

# Find lawyers specializing in criminal law with 10+ years experience
criminal_lawyers = lawyers[
    lawyers["area_list"].apply(lambda x: any("cinayət" in a.lower() for a in x)) &
    (lawyers["experience_years"] >= 10)
]
print(f"Experienced criminal lawyers: {len(criminal_lawyers)}")
```

---

## Dataset Statistics

| Metric | Value |
|--------|-------|
| Total structured records | 1,642,214 |
| Total PDFs | 1,541,218 |
| PDF coverage | 99.995% (71 decisions had no PDF) |
| Unique courts | 124 (in court_acts) |
| Registered judges | 709 |
| Unique judges in acts | 1,040 |
| Judge registry overlap | 636 (90% of registry matched to acts) |
| Multi-court judges | 192 (18.5%) |
| Court acts date range | 2016–2026 (reliable from 2019+) |
| Case types | Civil (46.9%), Admin Offenses (24.3%), Criminal (11.7%), Admin Disputes (9.8%), Commercial (3.6%) |
| Top decision type | Qətnamə / Judgment (39.7%) |
| Top outcome | İddia təmin edildi / Claim granted (20.6%) |
| Busiest court | Bakı Şəhəri Xətai Rayon Məhkəməsi (64,933 decisions) |
| Busiest judge | Mehdiyev Nəriman Hüseynqulu (9,543 decisions) |
| Baku region share | 47.5% of all decisions |
| Cumulative growth 2019→2025 | ~4,400% (7,725 → 347,971 decisions per year) |

---

## Data Quality Notes

| Issue | Details | Handling |
|-------|---------|----------|
| **Encoding** | UTF-8 with BOM (`utf-8-sig`) | Use `encoding="utf-8-sig"` when reading CSVs |
| **Court name variation** | Names differ across datasets (ASCII vs Unicode Azerbaijani) | Normalize with diacritic stripping before joins (see Court Name Normalization) |
| **Empty columns** | `court_meetings.parties` (100%), `court_meetings.caseCodes` (100%), `court_acts.caseCodes` (99.9%), `lawyers.services/description/achievement` (97–99%) | Ignore these columns |
| **Temporal context** | `court_cases` = live snapshot (no closed cases), `court_meetings` = future schedule, `court_acts` = historical archive | Do not mix temporal semantics |
| **Early years sparse** | 2016: 1 record, 2017: 104, 2018: 355 | Start trend analysis from 2019 |
| **PDF coverage** | 1,541,218 of 1,541,289 decisions have PDFs (99.995%) | 71 decisions had no PDF attachment |
| **Lawyer experience format** | `duration` is a string like `"16 il"` (16 years) | Extract the number with regex `(\d+)` |
| **Lawyer practice areas** | Only 43.1% of lawyers have `areas` populated | Analyze the available subset only |
| **Meeting dates** | Some have a `.1` suffix (e.g., `2026-05-19T09:20:00.1`) | Strip with regex before parsing |
| **Scraping completeness** | 1 page of 108,589 failed (HTTP 400), ~15 records | 99.999% coverage, negligible impact |

---

## Source

All data was scraped from the public API of [courts.gov.az](https://courts.gov.az), the official website of the Azerbaijan Court System. The data is publicly available and released under CC-BY-4.0.

---

## Citation

```bibtex
@dataset{samadov2026azerbaijan_court_data,
  title={Azerbaijan Court System Dataset},
  author={Samadov, Ismat},
  year={2026},
  url={https://huggingface.co/datasets/IsmatS/azerbaijan-court-data},
  note={1.64M structured records + 1.54M court decision PDFs from Azerbaijan court system}
}
```

---

## License

CC-BY-4.0: free to use for commercial and non-commercial purposes with attribution.
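
---

## Cleanup Helpers (Sketch)

The Join Keys and Data Quality Notes sections describe two cleanup steps in prose only: stripping the patronymic suffix (`oğlu`/`qızı`) from judge names before matching, and stripping the stray `.1` suffix from some meeting dates before parsing. The following is a minimal sketch of both, not part of the dataset. The function names `normalize_judge_name` and `parse_meeting_date` are hypothetical, and the sketch assumes that the patronymic suffix is always the last token of the name and that meeting dates are otherwise plain ISO 8601 strings.

```python
import re
import unicodedata
from datetime import datetime

# Same character map as the court-name normalization:
# ə and ı have no Unicode decomposition, so they are mapped explicitly.
AZ_MAP = str.maketrans("əıöüşçğƏIÖÜŞÇĞ", "eiouscgEIOUSCG")

def normalize_judge_name(name) -> str:
    """Hypothetical helper: ASCII-fold, lowercase, and drop a trailing oğlu/qızı."""
    if not isinstance(name, str):
        return ""  # covers NaN / missing values from pandas
    name = name.translate(AZ_MAP)
    name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    name = re.sub(r"\s+", " ", name).strip().lower()
    # After folding, the suffix is always spelled "oglu" or "qizi".
    # Assumption: it only ever appears as the final token.
    return re.sub(r" (oglu|qizi)$", "", name)

def parse_meeting_date(value):
    """Hypothetical helper: drop a trailing fractional fragment such as '.1', then parse."""
    if not isinstance(value, str):
        return None
    cleaned = re.sub(r"\.\d+$", "", value.strip())
    try:
        return datetime.fromisoformat(cleaned)
    except ValueError:
        return None  # leave unparseable values for manual inspection

print(normalize_judge_name("Abasov Qürur Bəybala oğlu"))  # → abasov qurur beybala
print(parse_meeting_date("2026-05-19T09:20:00.1"))        # → 2026-05-19 09:20:00
```

With pandas, both helpers can be applied column-wise, for example `judges["full_name"].map(normalize_judge_name)`, to build a join key that is compared against the same function applied to `court_acts["judge"]`.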
Provider:
IsmatS