StaQC
Source: ModelScope community. Updated 2025-07-27; indexed 2024-08-31.
Download link:
https://modelscope.cn/datasets/OmniData/StaQC
Resource description:
displayName: StaQC
license:
- CC BY 4.0
paperUrl: https://arxiv.org/pdf/1803.09371v1.pdf
publishDate: "2018"
publishUrl: https://github.com/LittleYUYU/StackOverflow-Question-Code-Dataset
publisher:
- University of Washington
- The Ohio State University
- Fujitsu Laboratories Ltd.
tags:
- Code
taskTypes:
- Visual Question Answering
- Code Search
- Code Summarization
---
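# Dataset Introduction
## Introduction
StaQC (Stack Overflow Question-Code pairs) is the largest dataset of its kind to date, with roughly 148K Python and 120K SQL question-code pairs. The pairs were mined automatically from Stack Overflow using a Bi-View Hierarchical Neural Network.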
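To make the notion of a question-code pair concrete, here is a minimal sketch of how pairs could be assembled from two lookup tables. The toy data, the `(question id, snippet index)` key layout, and the names `qid_to_title`, `iid_to_code`, and `build_pairs` are all assumptions for illustration; the actual file layout of StaQC may differ, so check the publisher's repository before relying on it.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data; the real StaQC files may use a different layout.
qid_to_title: Dict[int, str] = {
    101: "How to reverse a list in Python?",
    102: "How do I read a file line by line?",
}

# Assumed key: (question id, index of the code snippet within the answer post).
iid_to_code: Dict[Tuple[int, int], str] = {
    (101, 0): "my_list[::-1]",
    (101, 1): "list(reversed(my_list))",
    (102, 0): "with open(path) as f:\n    for line in f:\n        print(line)",
}


def build_pairs(
    titles: Dict[int, str],
    snippets: Dict[Tuple[int, int], str],
) -> List[Tuple[str, str]]:
    """Join each code snippet with the title of the question it answers."""
    pairs = []
    for (qid, _idx), code in sorted(snippets.items()):
        title = titles.get(qid)
        if title is not None:  # skip snippets whose question is missing
            pairs.append((title, code))
    return pairs


pairs = build_pairs(qid_to_title, iid_to_code)
```

One question can yield several pairs when its answer contains more than one snippet, which is why the snippet table in this sketch is keyed by a tuple rather than by the question id alone.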
## Citation
```
@inproceedings{yao2018staqc,
title={Staqc: A systematically mined question-code dataset from stack overflow},
author={Yao, Ziyu and Weld, Daniel S and Chen, Wei-Peng and Sun, Huan},
booktitle={Proceedings of the 2018 World Wide Web Conference},
pages={1693--1703},
year={2018}
}
```
## Download dataset
:modelscope-code[]{type="git"}
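The directive above renders a git command on the hosting page. As a rough sketch of what fetching the dataset with git might look like, the snippet below derives a clone URL from the dataset page URL given at the top of this card. The `.git` suffix convention and the helper names `clone_url`, `clone_command`, and `clone` are assumptions, not something this card confirms; verify the exact URL on the dataset page.

```python
import subprocess
from typing import List

# Dataset page URL as listed on this card.
DATASET_PAGE = "https://modelscope.cn/datasets/OmniData/StaQC"


def clone_url(dataset_page_url: str) -> str:
    """Assumed convention: the git remote is the page URL plus '.git'."""
    return dataset_page_url.rstrip("/") + ".git"


def clone_command(dataset_page_url: str, dest: str) -> List[str]:
    """Build the argument list for `git clone` without running it."""
    return ["git", "clone", clone_url(dataset_page_url), dest]


def clone(dataset_page_url: str, dest: str) -> None:
    """Run the clone; requires git on PATH and network access."""
    subprocess.run(clone_command(dataset_page_url, dest), check=True)


cmd = clone_command(DATASET_PAGE, "StaQC")
```

Calling `clone(DATASET_PAGE, "StaQC")` would perform the download; the sketch only builds the command so that it can be inspected first.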
Provider: maas
Created: 2024-07-02



