
mattismegevand/pitchfork

Hugging Face | Updated 2023-08-13 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/mattismegevand/pitchfork
Resource description:
---
license: mit
language:
- en
task_categories:
- summarization
- text-generation
- question-answering
tags:
- music
size_categories:
- 10K<n<100K
---

# Pitchfork Music Reviews Dataset

This repository contains the code and dataset for scraping music reviews from Pitchfork.

## Dataset Overview

The Pitchfork Music Reviews dataset is a collection of music album reviews from the Pitchfork website. Each entry in the dataset represents a single review and includes the following attributes:

- `artist`: The artist of the album.
- `album`: The name of the album.
- `year_released`: The year the album was released.
- `rating`: The rating given to the album by the reviewer.
- `small_text`: A short snippet from the review.
- `review`: The full text of the review.
- `reviewer`: The name of the reviewer.
- `genre`: The genre(s) of the album.
- `label`: The record label that released the album.
- `release_date`: The release date of the review.
- `album_art_url`: The URL of the album art.

## Usage

This dataset is publicly available for research. The data is provided 'as is', and you assume full responsibility for any legal or ethical issues that may arise from the use of the data.

## Scraping Process

The dataset was generated by scraping the Pitchfork website. The Python script uses the `requests` and `BeautifulSoup` libraries to send HTTP requests to the website and parse the resulting HTML content. The script saves the data in an SQLite database and can also export the data to a CSV file. Duplicate entries are avoided by checking for existing entries with the same artist and album name before inserting new ones into the database.

## Potential Applications

This dataset can be used for a variety of research purposes, such as:

- Music information retrieval
- Text mining and sentiment analysis
- Music recommendation systems
- Music trend analysis

## Acknowledgments

The dataset is sourced from [Pitchfork](https://pitchfork.com/), a website that publishes daily reviews, features, and news stories about music.

## License

Please ensure you comply with Pitchfork's terms of service before using or distributing this data.
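
The duplicate check and CSV export described under Scraping Process could look roughly like the sketch below. It is not the repository's actual script: the table name `reviews`, the all-`TEXT` column types, and the function names `create_table`, `insert_if_new`, and `export_csv` are assumptions made for illustration, and the fetching and HTML parsing step with `requests` and `BeautifulSoup` is left out. Only the column names come from the attribute list above.

```python
import csv
import io
import sqlite3

# Column names follow the attribute list in the dataset card.
# The table name "reviews" and every function name here are hypothetical.
COLUMNS = [
    "artist", "album", "year_released", "rating", "small_text", "review",
    "reviewer", "genre", "label", "release_date", "album_art_url",
]


def create_table(conn):
    """Create the reviews table if it does not exist yet (all columns TEXT)."""
    cols = ", ".join(f"{c} TEXT" for c in COLUMNS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS reviews ({cols})")


def insert_if_new(conn, row):
    """Insert a review unless one with the same artist and album exists.

    Returns True if the row was inserted, False if it was skipped.
    """
    exists = conn.execute(
        "SELECT 1 FROM reviews WHERE artist = ? AND album = ? LIMIT 1",
        (row["artist"], row["album"]),
    ).fetchone()
    if exists:
        return False
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.execute(
        f"INSERT INTO reviews ({', '.join(COLUMNS)}) VALUES ({placeholders})",
        [row.get(c) for c in COLUMNS],  # missing attributes are stored as NULL
    )
    conn.commit()
    return True


def export_csv(conn, fileobj):
    """Write a header row followed by every stored review to a file object."""
    writer = csv.writer(fileobj)
    writer.writerow(COLUMNS)
    writer.writerows(conn.execute(f"SELECT {', '.join(COLUMNS)} FROM reviews"))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_table(conn)
    sample = {"artist": "Example Artist", "album": "Example Album"}  # made-up row
    insert_if_new(conn, sample)
    insert_if_new(conn, sample)  # skipped: same artist and album
    buffer = io.StringIO()
    export_csv(conn, buffer)
    print(buffer.getvalue())
```

Matching on artist and album alone means a reissue reviewed under the same title would be skipped. A `UNIQUE(artist, album)` constraint combined with `INSERT OR IGNORE` would be an alternative way to enforce the same rule inside SQLite.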
Provider:
mattismegevand
Summary of Original Information

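Pitchfork Music Reviews Dataset Overview

Basic Dataset Information

  • License: MIT
  • Language: English
  • Task categories:
    • Summarization
    • Text generation
    • Question answering
  • Tags: music
  • Size category: 10K<n<100K

Dataset Contents

  • Artist: the artist of the album
  • Album: the name of the album
  • Year released: the year the album was released
  • Rating: the rating given to the album
  • Small text: a short snippet from the review
  • Review: the full text of the review
  • Reviewer: the name of the reviewer
  • Genre: the genre(s) of the album
  • Label: the record label that released the album
  • Review release date: the date the review was published
  • Album art URL: the URL of the album art

Dataset Uses

  • Music information retrieval
  • Text mining and sentiment analysis
  • Music recommendation systems
  • Music trend analysis

Data Source

  • The data comes from the Pitchfork website, which publishes daily music reviews, features, and news stories.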