modernlegal/spembeddings
Source: Hugging Face. Updated: 2025-04-03. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/modernlegal/spembeddings
Resource summary:
The dataset contains the following fields: source database (source_db), document ID (document_id), document title (document_title), document date (document_date), document URL (document_url), document type (kind), document text (text), extra information (extra), and text embedding (embedding). It provides a training split (train) with 3,306 examples, totaling 44,222 KB. The dataset is intended for text classification and summarization tasks, its language is Slovenian (sl), and its topic tag is legal. Its size category is 1K to 10K.
Provider:
modernlegal
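
The field list in the resource summary can be checked against loaded records, and the embedding field can be compared across documents. The following is a minimal sketch, not the provider's own tooling. It assumes the repository ID `modernlegal/spembeddings` (taken from the download link), that the Hugging Face `datasets` package is installed, and that each `embedding` value is a flat list of numbers; the listing does not state the embedding dimension or the model that produced it. The helper names `missing_fields`, `cosine_similarity`, and `load_train_split` are hypothetical.

```python
import math

# Field names exactly as listed in the resource summary.
EXPECTED_FIELDS = [
    "source_db", "document_id", "document_title", "document_date",
    "document_url", "kind", "text", "extra", "embedding",
]


def missing_fields(record):
    """Return the expected field names that are absent from a record (a dict)."""
    return [name for name in EXPECTED_FIELDS if name not in record]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric sequences.

    Returns 0.0 if either vector has zero norm.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def load_train_split():
    """Load the train split from the Hugging Face Hub.

    Assumption: needs network access and the `datasets` package; the
    repository may also require authentication, which is not stated
    in the listing.
    """
    from datasets import load_dataset  # imported lazily so the helpers work offline
    return load_dataset("modernlegal/spembeddings", split="train")


if __name__ == "__main__":
    train = load_train_split()
    first, second = train[0], train[1]
    print("missing fields:", missing_fields(first))
    print("similarity:", cosine_similarity(first["embedding"], second["embedding"]))
```

Keeping the download inside `load_train_split` means the two helpers can be used on any dict-shaped records without touching the network.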



