neil-code/autotrain-data-summarization

Source: Hugging Face. Updated: 2023-08-24. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/neil-code/autotrain-data-summarization
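Resource summary:
This dataset is intended for summarization tasks and was automatically processed by AutoTrain. The dataset language is English. Each instance contains four fields: feat_id (the instance's unique identifier), text (the input text), target (the target summary text), and feat_topic (the topic of the text). The dataset is divided into a training set with 1999 samples and a validation set with 499 samples.
Provider:
neil-code
Summary of original information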

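AutoTrain Dataset for Project: summarization

Dataset Description

This dataset was automatically processed by AutoTrain for the summarization project.

Languages

The dataset is in English; its BCP-47 language code is en.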

Dataset Structure

Data Instances

A sample from this dataset looks as follows:

[
  {
    "feat_id": "train_0",
    "text": "#Person1#: Hi, Mr. Smith. Im Doctor Hawkins. Why are you here today? #Person2#: I found it would be a good idea to get a check-up. #Person1#: Yes, well, you havent had one for 5 years. You should have one every year. #Person2#: I know. I figure as long as there is nothing wrong, why go see the doctor? #Person1#: Well, the best way to avoid serious illnesses is to find out about them early. So try to come at least once a year for your own good. #Person2#: Ok. #Person1#: Let me see here. Your eyes and ears look fine. Take a deep breath, please. Do you smoke, Mr. Smith? #Person2#: Yes. #Person1#: Smoking is the leading cause of lung cancer and heart disease, you know. You really should quit. #Person2#: Ive tried hundreds of times, but I just cant seem to kick the habit. #Person1#: Well, we have classes and some medications that might help. Ill give you more information before you leave. #Person2#: Ok, thanks doctor.",
    "target": "Mr. Smiths getting a check-up, and Doctor Hawkins advises him to have one every year. Hawkinsll give some information about their classes and medications to help Mr. Smith quit smoking.",
    "feat_topic": "get a check-up"
  },
  {
    "feat_id": "train_1",
    "text": "#Person1#: Hello Mrs. Parker, how have you been? #Person2#: Hello Dr. Peters. Just fine thank you. Ricky and I are here for his vaccines. #Person1#: Very well. Lets see, according to his vaccination record, Ricky has received his Polio, Tetanus and Hepatitis B shots. He is 14 months old, so he is due for Hepatitis A, Chickenpox and Measles shots. #Person2#: What about Rubella and Mumps? #Person1#: Well, I can only give him these for now, and after a couple of weeks I can administer the rest. #Person2#: OK, great. Doctor, I think I also may need a Tetanus booster. Last time I got it was maybe fifteen years ago! #Person1#: We will check our records and Ill have the nurse administer and the booster as well. Now, please hold Rickys arm tight, this may sting a little.",
    "target": "Mrs Parker takes Ricky for his vaccines. Dr. Peters checks the record and then gives Ricky a vaccine.",
    "feat_topic": "vaccines"
  }
]
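
The text field in the sample above packs a whole dialogue into one string, with speaker tags such as #Person1#: marking each turn. A minimal sketch of splitting such a string into turns follows. The tag pattern is an assumption inferred from the two samples shown here, and the name split_turns is hypothetical; neither comes from the dataset card.

```python
import re

# Assumed tag format, inferred from the sample records: "#Person<N>#:".
SPEAKER_TAG = re.compile(r"#(Person\d+)#:\s*")


def split_turns(text: str) -> list[tuple[str, str]]:
    """Split a dialogue string into (speaker, utterance) pairs."""
    # With one capture group, re.split returns
    # [prefix, speaker, utterance, speaker, utterance, ...].
    parts = SPEAKER_TAG.split(text)
    speakers = parts[1::2]
    utterances = [utterance.strip() for utterance in parts[2::2]]
    return list(zip(speakers, utterances))


# A short excerpt from the first sample record.
example = "#Person1#: Do you smoke, Mr. Smith? #Person2#: Yes."
turns = split_turns(example)
for speaker, utterance in turns:
    print(f"{speaker}: {utterance}")
```

Any text that appears before the first speaker tag is dropped by this sketch, which may or may not be appropriate for records that do not follow the pattern.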

Dataset Fields

The dataset has the following fields (also called "features"):

{
  "feat_id": "Value(dtype=string, id=None)",
  "text": "Value(dtype=string, id=None)",
  "target": "Value(dtype=string, id=None)",
  "feat_topic": "Value(dtype=string, id=None)"
}
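
Because all four fields are declared as strings, a record can be checked against this schema with plain Python before it is used. The following is a rough sketch using only the standard library; validate_record and EXPECTED_FIELDS are hypothetical names introduced for illustration, and the example record uses shortened placeholder values rather than real data.

```python
# Field names taken from the schema listed above; every field is a string.
EXPECTED_FIELDS = ("feat_id", "text", "target", "feat_topic")


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for name in EXPECTED_FIELDS:
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], str):
            problems.append(f"field {name} is not a string")
    # Report fields that the schema does not declare.
    extra = sorted(set(record) - set(EXPECTED_FIELDS))
    problems.extend(f"unexpected field: {name}" for name in extra)
    return problems


# Placeholder record with shortened values (not an actual dataset row).
record = {
    "feat_id": "train_0",
    "text": "#Person1#: Hi. #Person2#: Hello.",
    "target": "Two people greet each other.",
    "feat_topic": "greeting",
}
print(validate_record(record))
```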

Dataset Splits

This dataset is split into a training set and a validation set. The split sizes are as follows:

Split name    Number of samples
train         1999
valid         499
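
As a quick check on the split sizes above, the proportions can be worked out directly. This sketch uses only the two counts from the table; the variable names are illustrative.

```python
# Sample counts from the split table above.
SPLIT_SIZES = {"train": 1999, "valid": 499}

total = sum(SPLIT_SIZES.values())
fractions = {name: size / total for name, size in SPLIT_SIZES.items()}

for name, fraction in fractions.items():
    print(f"{name}: {SPLIT_SIZES[name]} samples ({fraction:.1%})")
# train: 1999 samples (80.0%)
# valid: 499 samples (20.0%)
```

The 2498 samples therefore divide roughly 80/20 between training and validation.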