taide/TAIDE-14-tasks
Source: Hugging Face; updated 2023-10-26; indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/taide/TAIDE-14-tasks
Resource description:
---
license: cc-by-nc-4.0
task_categories:
- text-generation
- question-answering
- conversational
language:
- zh
- en
tags:
- gpt4
size_categories:
- n<1K
---
# Dataset Card for TAIDE-14-tasks
### Dataset Summary
The "TAIDE-14-tasks" dataset, derived from the TAIDE project, covers 14 common text generation tasks. It contains 140 prompts designed for assessing Traditional Chinese Large Language Models (LLMs). GPT-4 crafted these prompts from the task, domain, and keywords given in the instructions, and human experts then validated them. Each data entry contains the main content together with a positive and a negative reference response. These reference responses are generated by GPT-4 and then manually proofread to ensure accuracy and relevance. For evaluating LLMs with this dataset, we recommend the G-Eval methodology.
Topics Covered (50):
```
{'人類學和社會學', '心理學和心理健康', '心靈和身心健康', '生物學和生物技術', '地理和地球科學',
'老年人和長者議題', '汽車和交通', '宗教和信仰', '法律和法規', '社區和社會發展',
'社會和文化議題', '社群媒體和網路文化', '青少年和成年人生活', '品牌和行銷', '建築和設計',
'政治和國際關係', '科技和人工智慧', '科學和探索', '音樂和音樂創作', '飛行和航空業',
'家庭和家居裝潢', '家庭和親密關係', '旅遊和探險', '時尚和美容', '時尚和配件',
'神秘學和靈性', '財務和投資', '假日和節日', '動物和動物保護', '教育和學習',
'飲食和烹飪', '傳媒和新聞業', '愛情和浪漫', '節能和可持續發展', '經濟和商業',
'資料和統計學', '遊戲和遊戲設計', '電子商務和網路交易', '電影和電影產業', '慈善和志願服務',
'演講和公共演講', '網路安全和隱私', '歷史和文化遺產', '環境和氣候變化', '職業和就業市場',
'職業道德和商業道德', '醫學和健康', '寵物和動物福利', '藝術和文學', '體育和健身'}
```
Tasks Included (14):
```
{'開放式生成', '分類', '問答', '摘要任務', '寫作', '翻譯', '文本分析', '常識推理', '寫信', '抽取', '推薦', '情緒分析', '提供建議', '對話生成'}
```
### Language
* Predominantly in Traditional Chinese.
* Some portions in English.
## Dataset Structure
### Data Columns
* Topic
* Task
* Keywords
* Prompt
* Input
* Positive Response
* Negative Response
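The seven columns above can be mirrored in a small typed record, which makes a missing column fail early. The following is a minimal sketch, not part of the dataset's tooling: the class name `TaideRecord`, its attribute names, and the helper `record_from_row` are hypothetical, and it assumes each row arrives as a plain dict keyed by the column names listed above.

```python
from dataclasses import dataclass

# The seven column names listed in the dataset card, in card order.
COLUMNS = ("Topic", "Task", "Keywords", "Prompt", "Input",
           "Positive Response", "Negative Response")


@dataclass(frozen=True)
class TaideRecord:
    """One row of the dataset; attribute names are illustrative."""
    topic: str
    task: str
    keywords: str
    prompt: str
    input: str
    positive_response: str
    negative_response: str


def record_from_row(row: dict) -> TaideRecord:
    """Build a record from a dict keyed by the card's column names."""
    missing = [name for name in COLUMNS if name not in row]
    if missing:
        raise KeyError(f"row is missing columns: {missing}")
    # COLUMNS and the dataclass fields are declared in the same order.
    return TaideRecord(*(row[name] for name in COLUMNS))
```

Calling `record_from_row(row)` on a complete row returns an immutable record; a row lacking any of the seven columns raises `KeyError` naming the missing ones.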
### Data Instances
```
{
'Topic': '科技和人工智慧',
'Task': '寫信',
'Keywords': '如何?',
'Prompt': '假設你是一名人工智慧專家,一家公司希望了解如何將AI技術引入他們的產品。請寫一封回信,解釋如何將AI技術整合到他們的業務中。',
'Input': '<noinput>',
'Positive Response': '親愛的[公司名稱]:
非常感謝您的來信,我很高興能為您提供有關將AI技術整合到您的產品中的建議。在當今的科技世界中,人工智慧(AI)已經成為許多成功創新產品的核心驅動力之一。
首先,我建議您分析哪些部分的業務流程可以透過使用AI技術得到改進。這可以包括自動化的客戶服務,資料分析和預測,產品推薦等。確定這些範疇後,可以開始研究適用於每個特定問題的AI解決方案。
接下來,您需要建立一個專業的AI團隊,包括資料科學家、機器學習工程師、軟體開發人員等。此外,您可能需要提供相應的培訓,以確保團隊成員具備實現AI整合的技能。
在技術層面上,您可以選擇開源的機器學習框架,如TensorFlow或PyTorch,以編寫和設計自己的機器學習模型。此外,您還可以使用現有的API和AI平台,例如Google Cloud AI或Microsoft Azure AI,這些平台提供了大量預先設計的AI解決方案。
最後,將AI技術整合到產品中的過程應該是可持續且不斷優化的。確保持續收集資料,以便為機器學習模型提供更多資料,並不斷改進其性能。
希望我的建議對您有所幫助。如果您有任何進一步的疑問或需要專業建議,請隨時與我聯繫。
祝商祺!
[您的名字]',
'Negative Response': '親愛的[公司名稱]:
嗨,謝謝你的信。把AI加到你的業務裡就是找個電腦高手,讓他們給你做個程式,然後看看有沒有什麼好玩的功能可以加進去,就這麼簡單。
祝好運!
[您的名字]',
}
```
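In the instance above, the `Input` field holds the literal string `<noinput>`. A plausible reading is that this marks prompts with no accompanying input text, although the card does not say so explicitly. Under that assumption, the sketch below shows one way to assemble the text sent to a model; the function name `build_model_prompt` and the blank-line separator are illustrative choices, not something the dataset prescribes.

```python
NO_INPUT = "<noinput>"  # sentinel value seen in the example instance


def build_model_prompt(row: dict) -> str:
    """Join Prompt and Input, skipping Input when it is empty or the sentinel."""
    prompt = row["Prompt"].strip()
    extra = row.get("Input", "").strip()
    if not extra or extra == NO_INPUT:
        return prompt
    # Assumed layout: prompt, blank line, then the input text.
    return f"{prompt}\n\n{extra}"
```

For the letter-writing instance shown above, this returns the `Prompt` text unchanged.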
## Licensing Information
The dataset is available under the [Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/legalcode).
Topics Covered (50), in English translation:
"Anthropology and Sociology", "Psychology and Mental Health", "Spirituality and Physical & Mental Well-being", "Biology and Biotechnology", "Geography and Earth Sciences",
"Elderly and Senior Issues", "Automobiles and Transportation", "Religion and Beliefs", "Law and Regulations", "Community and Social Development",
"Social and Cultural Issues", "Social Media and Cyberculture", "Adolescent and Adult Life", "Branding and Marketing", "Architecture and Design",
"Politics and International Relations", "Technology and Artificial Intelligence", "Science and Exploration", "Music and Musical Composition", "Aviation and Aerospace Industry",
"Home and Home Decoration", "Family and Intimate Relationships", "Travel and Exploration", "Fashion and Beauty", "Fashion and Accessories",
"Occult and Spirituality", "Finance and Investment", "Holidays and Festivals", "Animals and Animal Protection", "Education and Learning",
"Food and Cooking", "Media and Journalism", "Love and Romance", "Energy Conservation and Sustainable Development", "Economy and Business",
"Data and Statistics", "Games and Game Design", "E-commerce and Online Transactions", "Film and Film Industry", "Charity and Volunteer Service",
"Public Speaking", "Cybersecurity and Privacy", "History and Cultural Heritage", "Environment and Climate Change", "Occupation and Employment Market",
"Professional Ethics and Business Ethics", "Medicine and Health", "Pets and Animal Welfare", "Art and Literature", "Sports and Fitness"
Tasks Included (14), in English translation:
"Open-ended Generation", "Classification", "Question Answering", "Summarization Task", "Writing", "Translation", "Text Analysis", "Common Sense Reasoning", "Letter Writing", "Extraction", "Recommendation", "Sentiment Analysis", "Providing Advice", "Conversational Generation"
### Data Instances (English translation)
{
'Topic': 'Technology and Artificial Intelligence',
'Task': 'Letter Writing',
'Keywords': 'How?',
'Prompt': 'Assume you are an AI expert. A company wants to learn how to integrate AI technology into their products. Please write a reply explaining how to incorporate AI technology into their business operations.',
'Input': '<noinput>',
'Positive Response': 'Dear [Company Name]:
Thank you very much for your letter. I am delighted to provide you with advice on integrating AI technology into your products. In today's technological landscape, artificial intelligence (AI) has become one of the core driving forces behind many successful innovative products.
First, I recommend that you analyze which parts of your business processes can be improved using AI technology. This may include automated customer service, data analysis and forecasting, product recommendations, and more. Once these areas are identified, you can begin researching AI solutions applicable to each specific problem.
Next, you will need to establish a professional AI team, including data scientists, machine learning engineers, software developers, and other relevant roles. Additionally, you may need to provide corresponding training to ensure team members possess the skills required to implement AI integration.
From a technical perspective, you can choose open-source machine learning frameworks such as TensorFlow or PyTorch to build and design your own machine learning models. You can also utilize existing AI APIs and platforms, such as Google Cloud AI or Microsoft Azure AI, which offer a wide range of pre-designed AI solutions.
Finally, the process of integrating AI technology into your products should be sustainable and continuously optimized. Ensure continuous data collection to provide more training data for your machine learning models and continually improve their performance.
I hope my suggestions are helpful to you. If you have any further questions or require professional advice, please do not hesitate to contact me.
Best regards,
[Your Name]',
'Negative Response': 'Dear [Company Name]:
Hi, thanks for your letter. Adding AI to your business is just finding a computer expert, having them make a program for you, and then seeing if there are any fun features to add, that's all there is to it.
Good luck!
[Your Name]',
}
Provider:
taide
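Summary of original information
TAIDE-14-tasks dataset overview
Dataset introduction
The "TAIDE-14-tasks" dataset comes from the TAIDE project and covers 14 common text generation tasks. It contains 140 prompts designed specifically for evaluating Traditional Chinese large language models (LLMs). The prompts were crafted by GPT-4 from a given task, domain, and keywords, and validated by human experts. Each entry contains the main content plus positive and negative reference responses, which were generated by GPT-4 and manually proofread to ensure accuracy and relevance.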
Language
- Mainly Traditional Chinese.
- Some content in English.
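Dataset structure
Data columns
- Topic: subject area
- Task: task type
- Keywords: keywords
- Prompt: the prompt
- Input: input text
- Positive Response: positive reference response
- Negative Response: negative reference response
Data instance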
{
Topic: 科技和人工智慧,
Task: 寫信,
Keywords: 如何?,
Prompt: 假設你是一名人工智慧專家,一家公司希望了解如何將AI技術引入他們的產品。請寫一封回信,解釋如何將AI技術整合到他們的業務中。,
Input: <noinput>,
Positive Response: 親愛的[公司名稱]:...,
Negative Response: 親愛的[公司名稱]:...
}
Licensing information
Collected summary
Dataset introduction

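Construction
Within large language model evaluation, TAIDE-14-tasks was built through a rigorous, engineered process. The dataset comes from the TAIDE project; GPT-4 generated the initial prompts from explicit task instructions, domain scope, and keywords. To ensure accuracy and relevance, human experts then checked and revised every generated prompt. Each instance contains a carefully designed prompt along with positive and negative reference responses generated by GPT-4 and manually proofread, yielding an evaluation benchmark that spans 14 mainstream text generation tasks and 50 broad topics.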
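Features
The dataset is mainly in Traditional Chinese, with some English, and is intended specifically for evaluating Traditional Chinese large language models. Its structure is clear: each instance includes a topic, task, keywords, prompt, input, and positive and negative reference responses, giving several anchors for analysing model performance. Its 14 tasks, ranging from open-ended generation, question answering, and translation to sentiment analysis and recommendation, span the core scenarios of text generation and understanding, while the 50 topics cover technology, the humanities, society, the economy, and other fields, supporting the breadth and ecological validity of the evaluation.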
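Usage
For researchers working on model evaluation, the dataset offers a standardized workflow. Users can select prompts and reference responses by task category or topic and use them to test a target model's generation ability. For scoring, an automated evaluation framework such as G-Eval is recommended: compare the model's output against the dataset's positive and negative reference responses along several dimensions to quantify content relevance, factual accuracy, fluency, and task adherence. The aim is objective, reproducible model evaluation.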
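The workflow described above can be sketched in two steps: assemble a judging prompt that shows the evaluator both reference responses, then turn the evaluator's score distribution into a single number. The code below is an illustration under stated assumptions, not G-Eval's reference implementation. The prompt wording, the 1-5 scale, and the function names `build_judge_prompt` and `weighted_score` are hypothetical; no model is called, and the probabilities passed to `weighted_score` are assumed to come from whatever evaluator you use. The probability-weighted sum mirrors the scoring idea G-Eval describes, with a normalisation step added here for robustness.

```python
def build_judge_prompt(row: dict, candidate: str) -> str:
    """Assemble an evaluation prompt; the wording is illustrative only."""
    return "\n".join([
        "You are grading a model response on a 1-5 scale.",
        f"Task: {row['Task']}",
        f"Prompt: {row['Prompt']}",
        f"Reference (good): {row['Positive Response']}",
        f"Reference (poor): {row['Negative Response']}",
        f"Candidate: {candidate}",
        "Score (1-5):",
    ])


def weighted_score(score_probs: dict[int, float]) -> float:
    """Expected score: sum of score * probability, after normalising."""
    total = sum(score_probs.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    return sum(s * p for s, p in score_probs.items()) / total
```

Showing both references gives the evaluator a high and a low anchor for the same prompt, which is the main reason to use a dataset that ships negative responses alongside positive ones.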
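Background and challenges
Background overview
The TAIDE-14-tasks dataset comes from the TAIDE project and focuses on evaluating the performance of Traditional Chinese large language models (LLMs). Its prompts were crafted by GPT-4 and validated by human experts. It covers 14 mainstream text generation tasks, including open-ended generation, question answering, and translation, across 50 topic areas such as anthropology, technology, and law. Its central research question is how to systematically evaluate the generation quality and logical consistency of LLMs in complex Chinese-language contexts. It offers a standardized evaluation benchmark for natural language processing and supports research on Chinese LLMs.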
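Current challenges
The dataset targets the challenge of multi-task generalization in evaluating Traditional Chinese LLMs: in complex scenarios such as open-ended generation and dialogue generation, a model must balance fluency, accurate domain knowledge, and adaptation to cultural context. In building the dataset, the main challenge was prompt engineering: the prompts generated by GPT-4 had to cover diverse topics and task types, and manual proofreading was needed to ensure the quality of the positive and negative reference responses and to counter the bias and noise that automatically generated data can contain.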
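Common scenarios
Typical use cases
In LLM evaluation, TAIDE-14-tasks offers researchers a standardized test bed built on 14 text generation tasks. It covers varied tasks such as open-ended generation, question answering, translation, and summarization in a Traditional Chinese context, which makes it particularly suited to assessing a model's overall performance in a complex language environment. Using the positive and negative reference responses generated by GPT-4 and checked by humans, researchers can systematically examine content accuracy, logical coherence, and cultural fit, and use the results as a reference for model improvement.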
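Practical applications
In industry, the dataset can serve as a tool for companies selecting AI solutions suited to Traditional Chinese scenarios. Educational institutions can draw on its classification tasks when designing course assessment, and content platforms can use its text analysis tasks to refine information filtering. Government bodies drafting language technology policy can also use the dataset to pre-test models for ethical compliance, aligning technology deployment with social value.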
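Derived and related work
The dataset has given rise to follow-up research, such as evaluation frameworks for generation quality based on contrasting its positive and negative responses, and explorations of cross-task transfer learning. Some researchers have used its multi-domain labels to build fine-grained diagnostic tools for model capabilities. Other work combines it with the G-Eval methodology to build automated evaluation pipelines for Traditional Chinese generation tasks, improving reproducibility and the consistency of comparison baselines.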
The content above was collected and summarized by 遇见数据集.



