hxue3/autotrain-data-code_summarization
Source: Hugging Face. Updated 2022-10-23. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/hxue3/autotrain-data-code_summarization
Overview:
This dataset was automatically processed by AutoTrain for the project `code_summarization`. It is intended primarily for conditional text generation, and its language is English. Each record pairs a code snippet with its summary in two fields: `text` (the code) and `target` (the summary). The dataset is split into a training set of 800 samples and a validation set of 200 samples.
Provided by:
hxue3
Summary of original information
AutoTrain Dataset for project: code_summarization
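Dataset Description
This dataset was automatically processed for the project "code_summarization". Its language, identified by BCP-47 code, is English (`en`).
Dataset Structure
Data Instances
Sample records from the dataset look as follows: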
    [
      {
        "text": "def read(self, table, columns, keyset, index=\"\", limit=0, partition=None): \"\"\"Perform a ``St[...]",
        "target": "Perform a ``StreamingRead`` API request for rows in a table.\n:type table: str\n:para[...]"
      },
      {
        "text": "def maf_somatic_variant_stats(variant, variant_metadata): \"\"\" Parse out the variant calling [...]",
        "target": "Parse out the variant calling statistics for a given variant from a MAF file\nAssumes the MAF fo[...]"
      }
    ]
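In both samples above, `target` appears to be the docstring of the function stored in `text`. The source does not say how the pairs were built, so the following is only a minimal sketch under that assumption, using Python's standard `ast` module. The names `make_pair` and `example_source` are hypothetical and do not come from the dataset or from AutoTrain.

```python
import ast


def make_pair(source: str) -> dict:
    """Build a {"text", "target"} record from the source of one function.

    Assumption: the summary is the function's docstring. This mirrors the
    samples shown in the card, not a documented AutoTrain procedure.
    """
    tree = ast.parse(source)
    func = tree.body[0]
    if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
        raise ValueError("expected a single function definition")
    # ast.get_docstring returns None when there is no docstring;
    # by default it also strips the docstring's indentation.
    summary = ast.get_docstring(func) or ""
    return {"text": source, "target": summary}


# Hypothetical input, not taken from the dataset.
example_source = '''def add(a, b):
    """Return the sum of a and b."""
    return a + b
'''

record = make_pair(example_source)
```

Here `record["text"]` holds the full function source and `record["target"]` holds the extracted docstring, matching the two-field layout of the samples.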
Dataset Fields
The dataset has the following fields:
    {
      "text": "Value(dtype=string, id=None)",
      "target": "Value(dtype=string, id=None)"
    }
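Since both fields are plain strings, a record can be checked against this schema with the standard library alone. The sketch below is illustrative only; `EXPECTED_FIELDS` and `validate_record` are hypothetical names, and the strictness of the check (exactly these two fields, nothing else) is an assumption rather than something the card specifies.

```python
# Field names and types as listed in the card: both are strings.
EXPECTED_FIELDS = {"text": str, "target": str}


def validate_record(record: dict) -> bool:
    """Return True if the record has exactly the expected string fields."""
    if set(record) != set(EXPECTED_FIELDS):
        return False
    return all(
        isinstance(record[name], kind)
        for name, kind in EXPECTED_FIELDS.items()
    )


# Hypothetical records, not taken from the dataset.
ok = validate_record({"text": "def f(): pass", "target": "Do nothing."})
bad = validate_record({"text": "def f(): pass"})  # missing "target"
```

A check like this can be useful before fine-tuning, because a missing or non-string `target` would otherwise surface only as a tokenization error later on.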
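Dataset Splits
The dataset is split into a training set and a validation set. The split sizes are as follows:
| Split name | Number of samples |
|---|---|
| Training | 800 |
| Validation | 200 |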
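The sizes above amount to an 80/20 split of 1,000 samples. The card does not describe how AutoTrain produced the split, so the sketch below only shows one common way to obtain the same sizes: a seeded shuffle followed by a cut. The function `split_records`, its parameters, and the placeholder records are all hypothetical.

```python
import random


def split_records(records, n_train=800, n_valid=200, seed=0):
    """Shuffle a copy of the records and cut it into train/validation parts.

    Assumption: a seeded random shuffle. The actual AutoTrain procedure
    is not stated in the card.
    """
    if len(records) != n_train + n_valid:
        raise ValueError("record count does not match the requested sizes")
    shuffled = list(records)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder records standing in for the real data.
records = [
    {"text": f"def f{i}(): pass", "target": f"Function {i}."}
    for i in range(1000)
]

train, valid = split_records(records)
train_fraction = len(train) / len(records)
```

Fixing the seed keeps the split reproducible, so validation scores from different runs are computed on the same 200 samples.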



