Stack Overflow Data Licensing
Snowflake, updated 2025-06-02, indexed 2025-06-03
Download link:
https://app.snowflake.com/marketplace/listing/GZ1M6ZJR0BV
Resource description:
# **Stack Overflow Dataset**
Millions of the world's developers and technologists visit Stack Overflow to ask questions, learn, and share technical knowledge, making it the most complete, accurate source of human-verified technical knowledge on the internet:
- 16+ years of accurate, high-quality, and trusted technical knowledge.
- 60 million+ questions and answers to date.
- 69,000 technology tags used to organize content.
- 92% of developers visit Stack Overflow regularly.
Improve the performance of your chatbots, agents, and other AI solutions with Stack Overflow’s community-validated data:
- Strict moderation policies and rich feedback signals from Stack Overflow’s users and moderators provide a reliable source of truth.
- Top-class technical expertise and experience, expressed in natural language, is ideal for LLM training, improving RAG performance, and more.
- [145+ Stack Exchange sites](https://stackexchange.com/) across a range of topics—including software engineering, math, DIY, and more—support fine-tuning.
The entire Stack Overflow corpus or a tailored subset is available with options that suit your specific needs. Reach out to us for more information.
# Data Dictionary
## Tables
- **Comments** <br/>Comments on Questions and Answers on a given Stack Exchange Site<br/>
- **PostLinks**<br/>Link to Posts from Stack Exchange sites to facilitate attribution of content<br/>
- **Posts**<br/>Questions and Answers to Questions on a Stack Exchange Site<br/>
- **PostHistory**<br/>Content changes to Questions and Answers on a Stack Exchange Site<br/>
- **Tags**<br/>Tags on a Stack Exchange Site<br/>
- **Votes**<br/>Votes on Posts on a Stack Exchange Site
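
As a rough illustration of how these tables might fit together, the sketch below builds a tiny in-memory SQLite database and picks the highest-scored answer for each question. The listing names the tables but not their columns, so every column name and type code here (`PostTypeId`, `ParentId`, `VoteTypeId`, and the meanings of their values) is an assumption borrowed from the conventions of the public Stack Exchange data dumps. The actual Snowflake schema may differ, and the rows are invented demo data.

```python
import sqlite3

# Hypothetical, simplified schema: only the table names come from the listing.
SCHEMA = """
CREATE TABLE Posts (
    Id INTEGER PRIMARY KEY,
    PostTypeId INTEGER,   -- assumed: 1 = question, 2 = answer
    ParentId INTEGER,     -- assumed: question Id for answers, NULL for questions
    Title TEXT,
    Body TEXT
);
CREATE TABLE Votes (
    Id INTEGER PRIMARY KEY,
    PostId INTEGER REFERENCES Posts(Id),
    VoteTypeId INTEGER    -- assumed: 2 = upvote, 3 = downvote
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one question and two answers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO Posts VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, None, "Example question", "Question body"),
            (2, 2, 1, None, "First answer"),
            (3, 2, 1, None, "Second answer"),
        ],
    )
    # Answer 2 gets one upvote; answer 3 gets three upvotes and one downvote.
    conn.executemany(
        "INSERT INTO Votes VALUES (?, ?, ?)",
        [(1, 2, 2), (2, 3, 2), (3, 3, 2), (4, 3, 3), (5, 3, 2)],
    )
    return conn


def top_answers(conn: sqlite3.Connection) -> list:
    """Return (question title, answer body, net score) for the best answer per question."""
    query = """
    WITH scores AS (
        SELECT a.Id AS answer_id,
               a.ParentId AS question_id,
               a.Body AS body,
               COALESCE(SUM(CASE v.VoteTypeId
                                WHEN 2 THEN 1
                                WHEN 3 THEN -1
                                ELSE 0 END), 0) AS score
        FROM Posts a
        LEFT JOIN Votes v ON v.PostId = a.Id
        WHERE a.PostTypeId = 2
        GROUP BY a.Id
    )
    SELECT q.Title, s.body, s.score
    FROM Posts q
    JOIN scores s ON s.question_id = q.Id
    WHERE q.PostTypeId = 1
      AND s.score = (SELECT MAX(score) FROM scores WHERE question_id = q.Id)
    ORDER BY q.Id
    """
    return conn.execute(query).fetchall()


best = top_answers(build_demo_db())
```

Pairing each question with its best-voted answer is one plausible way to turn the vote signal into retrieval or fine-tuning pairs. Against the real listing, the same join would be written in Snowflake SQL using the actual column names.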
# How to get access to the Cortex Knowledge Extension (CKE)
- Click the "Request" button on this listing
- In your request description, please provide all details relevant to your interest in this listing
- Once the request is submitted, a team member from Stack Overflow will reach out to walk you through the next steps for getting access to the CKE.
## Sample Questions for CKE
- How do I find the nth weekday of the month using chrono?
- What's the point of deleted virtual functions?
- Why do I need another pair of curly braces when using a variable in a format specifier in Python f-strings?
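
For readers unfamiliar with the behavior the last sample question refers to, here is a minimal Python illustration of a nested replacement field in an f-string. It is not an answer drawn from the dataset, and the variable names are invented for this sketch.

```python
value = 3.14159
width = 8

# The outer braces delimit the replacement field. The inner braces around
# `width` form a nested field that is evaluated first to build the format spec.
formatted = f"{value:{width}.2f}"  # equivalent to f"{value:8.2f}"

# Without the inner braces, the text "width" is taken literally as part of the
# format spec, which is not a valid spec for a float.
try:
    format(value, "width.2f")
    literal_spec_ok = True
except ValueError:
    literal_spec_ok = False
```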
Provider:
Stack Overflow
Created:
2025-05-20
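Summary of original information
Stack Overflow Knowledge Solutions dataset overview
Dataset introduction
- More than 16 years of accurate, high-quality, trusted technical knowledge
- 60 million+ technical questions and answers
- 69,000 technology tags used to organize content
- 92% of developers visit Stack Overflow regularly
Key advantages
- Strict moderation policies and rich user feedback signals provide a reliable source of truth
- Top-class technical expertise expressed in natural language, suited to LLM training and improving RAG performance
- Covers 145+ Stack Exchange sites across topics including software engineering, math, and DIY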
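Data dictionary
Tables
| Table | Description |
|---|---|
| Comments | Comments on questions and answers on a Stack Exchange site |
| PostLinks | Content reference links between Stack Exchange sites |
| Posts | Questions and answers on a Stack Exchange site |
| PostHistory | History of content changes to questions and answers |
| Tags | Tags on a Stack Exchange site |
| Votes | Votes on posts |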
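Commercial use cases
- LLM fine-tuning and pre-training
- Retrieval-augmented generation (RAG)
- Agent system development
- Search and knowledge graph enhancement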
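Technical characteristics
- Update frequency: monthly
- Time coverage: September 15, 2008 onward
- Cloud region availability: multiple Azure regions (15+ regions, including Australia East, Canada Central, and Central India)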
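How to get access
- Click the "Request" button on the listing
- Provide the relevant details in the request description
- The Stack Overflow team will contact you to complete the remaining access steps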
Contact information
- Sales: oapi-managers@stackoverflow.com
- Support: api-support@stackoverflow.com
Category tags
- Cortex AI Ready
- Cortex Knowledge Extensions



