
Steam Games, Reviews, and Rankings.

Source: www.kaggle.com | Updated: 2024-09-23 | Indexed: 2025-01-09
Download link:
https://www.kaggle.com/mohamedtarek01234/steam-games-reviews-and-rankings
Resource description:
# **Steam Games, Reviews, and Rankings Dataset**

This dataset contains over 990,000 rows of data scraped from the Steam platform, focusing on game reviews, rankings, and game-related information across various genres. The data was collected from the top 40 games in sales, revenue, and reviews within six core genres on Steam. The dataset includes 242 games for player reviews and 290 games for genre rankings and descriptions, with some games excluded due to content restrictions (nudity).

## **Data Collection Process:**

Data was scraped from Steam's public pages, following Steam's robots.txt file at the time to ensure compliance with Steam's scraping guidelines. The dataset includes information from the top 40 games ranked in sales, revenue, and reviews across the following genres:

- Action
- Adventure
- Role-Playing
- Strategy
- Simulation
- Sports & Racing

A total of 990,000+ reviews were collected from 242 games, while the game description and genre rankings files contain 290 games. The difference in the number of games is due to the exclusion of certain games containing nudity, from which reviews were not collected.

## **Data Files:**

The dataset is divided into three files, each containing different types of information:

### **games_description.csv:**

This file contains detailed descriptions of the games.

**Columns:**

- name: Game title
- short_description: Brief description of the game
- long_description: Detailed description of the game
- genres: List of genres the game belongs to
- minimum_system_requirement: Minimum system requirements to run the game
- recommend_system_requirement: Recommended system requirements
- release_date: Game release date
- developer: Developer of the game
- publisher: Publisher of the game
- overall_player_rating: Overall player rating for the game
- number_of_reviews_from_purchased_people: Number of reviews from users who purchased the game
- number_of_english_reviews: Number of reviews written in English
- link: URL to the game page on Steam

### **steam_game_reviews.csv:**

This file contains player reviews for the games.

**Columns:**

- review: The content of the player's review
- hours_played: Total hours the player has spent on the game
- helpful: Number of users who found the review helpful
- funny: Number of users who found the review funny
- recommendation: Whether the player recommended or did not recommend the game
- date: Date of the review
- game_name: Name of the game being reviewed
- username: Username of the player who wrote the review

### **games_ranking.csv:**

This file contains the ranking of games by genre.

**Columns:**

- game_name: Name of the game
- genre: Genre of the game
- rank_type: Type of ranking (sales, revenue, or reviews)
- rank: Game's position within the ranking

## **Source Code:**

The source code used to scrape this data will be made available in a public GitHub repository. You can access the code and instructions for reproducing the dataset here: [https://github.com/mohamedtarek132/Steam-Game-Reviews-Scraper](https://github.com/mohamedtarek132/Steam-Game-Reviews-Scraper).
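
As a rough illustration of how the three files relate, the sketch below uses only the column names listed above to compute the share of recommending reviews per game and to list games that have a description but no reviews (the 290 versus 242 gap). It rests on assumptions not stated in the dataset description: that positive reviews carry the label `Recommended` in the `recommendation` column, and that `name` in games_description.csv matches `game_name` in steam_game_reviews.csv exactly. The function names and the sample rows are hypothetical and made up for the example.

```python
import csv
import io
from collections import defaultdict


def recommendation_share(reviews_file):
    """Return {game_name: fraction of its reviews marked as recommended}."""
    recommended = defaultdict(int)
    total = defaultdict(int)
    for row in csv.DictReader(reviews_file):
        game = row["game_name"]
        total[game] += 1
        # ASSUMPTION: positive reviews are labelled "Recommended".
        if row["recommendation"].strip().lower() == "recommended":
            recommended[game] += 1
    return {game: recommended[game] / total[game] for game in total}


def games_without_reviews(descriptions_file, reviews_file):
    """Return sorted names found in the descriptions but not in the reviews."""
    # ASSUMPTION: "name" and "game_name" use identical spellings across files.
    described = {row["name"] for row in csv.DictReader(descriptions_file)}
    reviewed = {row["game_name"] for row in csv.DictReader(reviews_file)}
    return sorted(described - reviewed)


# Tiny made-up samples that only mimic the documented column layout.
SAMPLE_REVIEWS = """review,hours_played,helpful,funny,recommendation,date,game_name,username
Great,10,1,0,Recommended,2024-01-01,Game A,u1
Meh,2,0,0,Not Recommended,2024-01-02,Game A,u2
Fun,5,3,1,Recommended,2024-01-03,Game B,u3
"""
SAMPLE_DESCRIPTIONS = """name,short_description
Game A,desc
Game B,desc
Game C,desc
"""

if __name__ == "__main__":
    print(recommendation_share(io.StringIO(SAMPLE_REVIEWS)))
    print(games_without_reviews(io.StringIO(SAMPLE_DESCRIPTIONS),
                                io.StringIO(SAMPLE_REVIEWS)))
```

With the real files, the same functions can be given handles opened with `open(path, newline="", encoding="utf-8")`; the encoding is also an assumption and may need adjusting.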

Provider:
Kaggle
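Summary
Background and Challenges
Background Overview
This dataset contains more than 990,000 rows of data scraped from the Steam platform. It covers six core game genres (Action, Adventure, and others), taking the top 40 games in each by sales, revenue, and review rankings. The data is split into three structured files (game descriptions, player reviews, and game rankings) that provide game details, review content, and ranking information. Data collection followed the platform's guidelines, and reviews were not collected for games containing nudity.
The above content was collected and summarized by 遇见数据集.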