
argilla/news-summary

Source: Hugging Face · Updated: 2023-03-16 · Indexed: 2024-03-04
Download link:
https://hf-mirror.com/datasets/argilla/news-summary
Resource description:
---
language:
- en
license:
- cc-by-nc-4.0
size_categories:
- 10K<n<100K
source_datasets:
- original
task_categories:
- summarization
task_ids:
- news-articles-summarization
dataset_info:
  features:
  - name: text
    dtype: string
  - name: prediction
    list:
    - name: score
      dtype: float64
    - name: text
      dtype: string
  - name: prediction_agent
    dtype: string
  - name: annotation
    dtype: 'null'
  - name: annotation_agent
    dtype: 'null'
  - name: id
    dtype: string
  - name: metadata
    dtype: 'null'
  - name: status
    dtype: string
  - name: event_timestamp
    dtype: timestamp[us]
  - name: metrics
    struct:
    - name: text_length
      dtype: int64
  splits:
  - name: train
    num_bytes: 2563132.0446374374
    num_examples: 1000
  - name: test
    num_bytes: 52331466.955362566
    num_examples: 20417
  download_size: 33207109
  dataset_size: 54894599.0
---

# Dataset Card for "news-summary"

## Dataset Description

- **Homepage:** Kaggle Challenge
- **Repository:** https://www.kaggle.com/datasets/clmentbisaillon/fake-and-real-news-dataset?select=True.csv
- **Paper:** N.A.
- **Leaderboard:** N.A.
- **Point of Contact:** N.A.

### Dataset Summary

The source data was originally intended for classification. This dataset asks whether it can also be used to summarize news articles.

### Languages

English

### Citation Information

Acknowledgements

- Ahmed H, Traore I, Saad S. “Detecting opinion spams and fake news using text classification”, Journal of Security and Privacy, Volume 1, Issue 1, Wiley, January/February 2018.
- Ahmed H, Traore I, Saad S. (2017) “Detection of Online Fake News Using N-Gram Analysis and Machine Learning Techniques”. In: Traore I., Woungang I., Awad A. (eds) Intelligent, Secure, and Dependable Systems in Distributed and Cloud Environments. ISDDC 2017. Lecture Notes in Computer Science, vol 10618. Springer, Cham (pp. 127-138).

### Contributions

Thanks to [@davidberenstein1957](https://github.com/davidberenstein1957) for adding this dataset.
Provider:
argilla
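Summary of original information

Dataset overview

Basic information

  • Language: English
  • License: CC-BY-NC-4.0
  • Size category: 10K<n<100K
  • Source datasets: original
  • Task category: summarization
  • Task ID: news-articles-summarization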

Dataset features

  • text: string
  • prediction: list, each item containing
    • score: float (float64)
    • text: string
  • prediction_agent: string
  • annotation: null
  • annotation_agent: null
  • id: string
  • metadata: null
  • status: string
  • event_timestamp: timestamp (microsecond precision)
  • metrics: struct containing
    • text_length: integer (int64)
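
To make the feature list above concrete, here is a minimal Python sketch of what a single record might look like, together with a small type checker. It assumes the record has been loaded into plain Python objects (a `dict` with a `datetime` timestamp). The names `TOP_LEVEL_TYPES`, `validate_record`, and `example_record` are hypothetical, and every value in the example is a placeholder rather than real data from the dataset.

```python
from datetime import datetime

# Expected Python type for each top-level field, following the feature list.
# Fields declared with a null type are expected to be None.
TOP_LEVEL_TYPES = {
    "text": str,
    "prediction": list,
    "prediction_agent": str,
    "annotation": type(None),
    "annotation_agent": type(None),
    "id": str,
    "metadata": type(None),
    "status": str,
    "event_timestamp": datetime,
    "metrics": dict,
}


def validate_record(record):
    """Return a list of schema problems for one record (empty if none)."""
    problems = []
    for name, expected in TOP_LEVEL_TYPES.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            problems.append(f"{name}: expected {expected.__name__}")

    # Each prediction entry should carry a float score and a string text.
    predictions = record.get("prediction")
    if isinstance(predictions, list):
        for i, pred in enumerate(predictions):
            if not isinstance(pred, dict):
                problems.append(f"prediction[{i}]: expected dict")
                continue
            if not isinstance(pred.get("score"), float):
                problems.append(f"prediction[{i}].score: expected float")
            if not isinstance(pred.get("text"), str):
                problems.append(f"prediction[{i}].text: expected str")

    # The metrics struct holds a single integer field.
    metrics = record.get("metrics")
    if isinstance(metrics, dict) and not isinstance(metrics.get("text_length"), int):
        problems.append("metrics.text_length: expected int")
    return problems


# Hypothetical record: all values are placeholders, not real dataset content.
example_record = {
    "text": "Placeholder news article body.",
    "prediction": [{"score": 1.0, "text": "Placeholder summary."}],
    "prediction_agent": "placeholder-agent",
    "annotation": None,
    "annotation_agent": None,
    "id": "placeholder-id",
    "metadata": None,
    "status": "placeholder-status",
    "event_timestamp": datetime(2023, 1, 1, 0, 0, 0),
    "metrics": {"text_length": 30},
}

print(validate_record(example_record))
```

The three null-typed fields (`annotation`, `annotation_agent`, `metadata`) suggest that the records carry model predictions but no human annotations, which is why the checker expects `None` for them.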

Dataset splits

  • Train split:
    • Size: 2563132.0446374374 bytes
    • Examples: 1000
  • Test split:
    • Size: 52331466.955362566 bytes
    • Examples: 20417

Download and dataset size

  • Download size: 33207109 bytes
  • Dataset size: 54894599.0 bytes
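
The fractional byte counts in the split sizes look odd at first. The short Python check below uses only the figures listed above and shows that they are consistent with the total dataset size having been divided between the two splits in proportion to their example counts. That the sizes were actually produced this way is an assumption, not something the card states. The constant and variable names are illustrative.

```python
import math

# Figures copied from the split and size listings.
TRAIN_EXAMPLES = 1000
TEST_EXAMPLES = 20417
TRAIN_BYTES = 2563132.0446374374
TEST_BYTES = 52331466.955362566
DATASET_SIZE = 54894599.0
DOWNLOAD_SIZE = 33207109

total_examples = TRAIN_EXAMPLES + TEST_EXAMPLES

# Check 1: the two split sizes add up to the reported dataset size.
sizes_add_up = math.isclose(TRAIN_BYTES + TEST_BYTES, DATASET_SIZE, rel_tol=1e-9)

# Check 2: each split size equals the total size prorated by example count.
prorated_train = DATASET_SIZE * TRAIN_EXAMPLES / total_examples
prorated_test = DATASET_SIZE * TEST_EXAMPLES / total_examples
train_is_prorated = math.isclose(prorated_train, TRAIN_BYTES, rel_tol=1e-9)
test_is_prorated = math.isclose(prorated_test, TEST_BYTES, rel_tol=1e-9)

# Share of examples in the train split and download-to-dataset size ratio.
train_share = TRAIN_EXAMPLES / total_examples
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(f"total examples: {total_examples}")
print(f"split sizes sum to dataset size: {sizes_add_up}")
print(f"train size is prorated: {train_is_prorated}")
print(f"test size is prorated: {test_is_prorated}")
print(f"train share of examples: {train_share:.2%}")
print(f"download / dataset size: {download_ratio:.2%}")
```

Note that the train split is the smaller one here: 1000 of the 21417 examples, or roughly 4.7%, with the remaining 20417 examples in the test split.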