WiktorS/polish-news
Source: Hugging Face. Updated 2023-06-05. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/WiktorS/polish-news
Resource description:
---
license: apache-2.0
task_categories:
- text-classification
- summarization
- text-generation
language:
- pl
size_categories:
- 100K<n<1M
---
This dataset contains more than 250k articles obtained from the Polish news site `tvp.info.pl`.
The main purpose of collecting the data was to create a transformer-based model for text summarization.
Columns:
* `link` - link to article
* `title` - original title of the article
* `headline` - lead/headline of the article: the first paragraph, visible directly on the page
* `content` - full textual contents of the article
Link to original repo: https://github.com/WiktorSob/scraper-tvp
Download the data:
```python
from datasets import load_dataset
dataset = load_dataset("WiktorS/polish-news")
```
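Since the stated goal is text summarization, one plausible way to use the columns is to treat `content` as the model input and `headline` as the target summary. The source does not say which column served as the target, so this pairing is an assumption, and `to_summarization_pair` is a hypothetical helper name. The sketch below runs on two made-up rows with the documented column names, so it needs no download:

```python
def to_summarization_pair(row):
    """Map one row to a source/target pair, or None if either field is empty.

    Assumption: `content` is the summarization input and `headline` the target.
    """
    content = (row.get("content") or "").strip()
    headline = (row.get("headline") or "").strip()
    if not content or not headline:
        return None  # skip rows that cannot form a usable training pair
    return {"source": content, "target": headline}


# Made-up rows that only mimic the documented columns; not real dataset entries.
sample_rows = [
    {"link": "https://example.invalid/a", "title": "T1",
     "headline": "Lead one.", "content": "Full text one."},
    {"link": "https://example.invalid/b", "title": "T2",
     "headline": None, "content": "Full text two."},
]

pairs = [p for p in map(to_summarization_pair, sample_rows) if p is not None]
print(pairs)
```

On the loaded dataset, the same helper could be applied row by row to whichever split `load_dataset` returns (the split name is not given in the source, so check `dataset.keys()` first).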
Provider:
WiktorS
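Summary of original information
Dataset overview
Basic information
- License: Apache 2.0
- Task categories: text classification, summarization, text generation
- Language: Polish
- Size: 100K<n<1M
Data content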
- Source: more than 250k articles obtained from the Polish news site tvp.info.pl
- Purpose: creating a transformer-based text summarization model
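Data structure
- Columns:
  - link: link to the article
  - title: title of the article
  - headline: lead/headline of the article (its first paragraph)
  - content: full text of the article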



