pc-components-reviews
收藏数据集概述
数据集信息
- 数据集名称: pc-components-reviews
- 数据集大小: 657104 字节
- 下载大小: 115118 字节
- 样本数量: 200
- 配置名称: default
- 数据文件路径: data/train-*
数据集特征
- instruction: 字符串类型
- sentiment: 字符串类型
- gender: 字符串类型
- age_group: 字符串类型
- nationality: 字符串类型
- expertise: 字符串类型
- language: 字符串类型
- text: 字符串类型
- distilabel_metadata: 结构体类型
- raw_input_text_classification_0: 列表类型
- content: 字符串类型
- role: 字符串类型
- raw_output_text_classification_0: 字符串类型
- raw_input_text_classification_0: 列表类型
- model_name: 字符串类型
- labels: 字符串类型
数据集结构
- 配置: default
- 示例结构: json { "age_group": "senior", "distilabel_metadata": { "raw_input_text_classification_0": [ { "content": "You are an AI system specialized in generating labels to classify pieces of text. Your sole purpose is to analyze the given text and provide appropriate classification labels.", "role": "system" }, { "content": "# Instruction Please classify the user query by assigning the most appropriate labels. Do not explain your reasoning or provide any additional commentary. If the text is ambiguous or lacks sufficient information for classification, respond with "Unclassified". Provide the label that best describes the text. You are an AI system specialized in assigning sentiment to customer reviews.
Labeling the user input
Use the available labels to classify the user query: available_labels = [ "positive", "negative", "neutral", ]
User Query
I recently upgraded my system with the AMD Ryzen 9 5900X and the EVGA GeForce RTX 3080, and the performance boost is noticeable, especially in 4K gaming. The MSI MPG B550 Gaming Edge WiFi motherboard is stable, but I did have to tweak the BIOS settings a bit for optimal performance.
Output Format
Now, please give me the labels in JSON format, do not include any other text in your response:
{ "labels": "label" } ", "role": "user" } ], "raw_output_text_classification_0": "{ "labels": "positive" }" }, "expertise": "expert", "gender": "female", "instruction": "Create a highly realistic and detailed neutral customer review or comment for a PC components forum. The review should be in English. The reviewer has the following characteristics: - Gender: female - Age group: senior - Nationality: Japanese - Expertise level: expert Focus on: 1. Authenticity: Write as this specific user would, considering their background and expertise level. Use natural language and occasional typos. 2. Specificity: Mention specific PC components, brands, or technical details appropriate to their expertise level. 3. Diversity: Vary the topics, writing styles, and lengths (from short to medium). 4. Originality: Avoid repetitive phrases or common review clichés. 5. Emotion: Express neutral feelings clearly, but subtly. 6. Cultural nuances: Incorporate subtle hints of their nationality in language use or references, if appropriate. 7. Format: Write only the review content. Do not include titles, headers, usernames, ratings, or any structured elements. 8. Length: Aim for 1-3 sentences, like a forum comment or short review. 9. Start naturally: Begin the review as if youre in the middle of a conversation, without any introductory phrases or titles. 10. Language: Ensure the review is written in English, regardless of the reviewers nationality but add nuances when its not a native speaker. Example topics: CPUs, GPUs, motherboards, RAM, cooling systems, cases, power supplies, or overall PC builds. Remember: Start directly with the review content, keeping in mind the users background and expertise level. The review must be in English.", "labels": "positive", "language": "English", "model_name": "Qwen/Qwen2.5-72B-Instruct", "nationality": "Japanese", "sentiment": "neutral", "text": "I recently upgraded my system with the AMD Ryzen 9 5900X and the EVGA GeForce RTX 3080, and the performance boost is noticeable, especially in 4K gaming. The MSI MPG B550 Gaming Edge WiFi motherboard is stable, but I did have to tweak the BIOS settings a bit for optimal performance." }
数据集加载
-
加载方式: python from datasets import load_dataset
ds = load_dataset("argilla/pc-components-reviews")




