neil-code/autotrain-data-summarization
AutoTrain Dataset for project: summarization
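Dataset Description
This dataset has been automatically processed by AutoTrain for the project summarization.
Languages
The dataset is in English; its BCP-47 code is en.
Dataset Structure
Data Instances
A sample from this dataset looks as follows: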
```json
[
  {
    "feat_id": "train_0",
    "text": "#Person1#: Hi, Mr. Smith. Im Doctor Hawkins. Why are you here today? #Person2#: I found it would be a good idea to get a check-up. #Person1#: Yes, well, you havent had one for 5 years. You should have one every year. #Person2#: I know. I figure as long as there is nothing wrong, why go see the doctor? #Person1#: Well, the best way to avoid serious illnesses is to find out about them early. So try to come at least once a year for your own good. #Person2#: Ok. #Person1#: Let me see here. Your eyes and ears look fine. Take a deep breath, please. Do you smoke, Mr. Smith? #Person2#: Yes. #Person1#: Smoking is the leading cause of lung cancer and heart disease, you know. You really should quit. #Person2#: Ive tried hundreds of times, but I just cant seem to kick the habit. #Person1#: Well, we have classes and some medications that might help. Ill give you more information before you leave. #Person2#: Ok, thanks doctor.",
    "target": "Mr. Smiths getting a check-up, and Doctor Hawkins advises him to have one every year. Hawkinsll give some information about their classes and medications to help Mr. Smith quit smoking.",
    "feat_topic": "get a check-up"
  },
  {
    "feat_id": "train_1",
    "text": "#Person1#: Hello Mrs. Parker, how have you been? #Person2#: Hello Dr. Peters. Just fine thank you. Ricky and I are here for his vaccines. #Person1#: Very well. Lets see, according to his vaccination record, Ricky has received his Polio, Tetanus and Hepatitis B shots. He is 14 months old, so he is due for Hepatitis A, Chickenpox and Measles shots. #Person2#: What about Rubella and Mumps? #Person1#: Well, I can only give him these for now, and after a couple of weeks I can administer the rest. #Person2#: OK, great. Doctor, I think I also may need a Tetanus booster. Last time I got it was maybe fifteen years ago! #Person1#: We will check our records and Ill have the nurse administer and the booster as well. Now, please hold Rickys arm tight, this may sting a little.",
    "target": "Mrs Parker takes Ricky for his vaccines. Dr. Peters checks the record and then gives Ricky a vaccine.",
    "feat_topic": "vaccines"
  }
]
```
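Each `text` value in the sample stores a whole dialogue as one string, with turns marked by speaker tags such as `#Person1#:`. As a rough illustration of how such a string could be broken back into turns, here is a minimal Python sketch. The `split_turns` helper is hypothetical, and it assumes every turn starts with a tag of the form `#PersonN#:`, which is only what the two samples show.

```python
import re

# Assumption: speaker tags always look like "#Person1#:", as in the sample.
SPEAKER_TAG = re.compile(r"#(Person\d+)#:\s*")


def split_turns(text: str) -> list[tuple[str, str]]:
    """Split a flattened dialogue string into (speaker, utterance) pairs."""
    # With one capture group, re.split returns
    # [prefix, speaker, utterance, speaker, utterance, ...].
    parts = SPEAKER_TAG.split(text)
    speakers = parts[1::2]
    utterances = (u.strip() for u in parts[2::2])
    return list(zip(speakers, utterances))


# Short excerpt taken from the first sample record.
sample = "#Person1#: Do you smoke, Mr. Smith? #Person2#: Yes."
turns = split_turns(sample)
for speaker, utterance in turns:
    print(f"{speaker}: {utterance}")
```

Any text that precedes the first tag is dropped by this sketch, which may or may not be the desired behavior for records outside the sample.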
Dataset Fields
The dataset has the following fields (also called "features"):
```json
{
  "feat_id": "Value(dtype=string, id=None)",
  "text": "Value(dtype=string, id=None)",
  "target": "Value(dtype=string, id=None)",
  "feat_topic": "Value(dtype=string, id=None)"
}
```
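Since all four fields are declared as strings, a record can be sanity-checked with a few lines of plain Python before it is used for training. The sketch below is one possible way to do that. The `check_record` helper is hypothetical and not part of AutoTrain or any library; only the field names and the string type come from the schema above.

```python
# Field names come from the schema above; every field is a string.
EXPECTED_FIELDS = ("feat_id", "text", "target", "feat_topic")


def check_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    for name in EXPECTED_FIELDS:
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], str):
            actual = type(record[name]).__name__
            problems.append(f"{name} should be a string, got {actual}")
    # Report anything the schema does not mention.
    extra = sorted(set(record) - set(EXPECTED_FIELDS))
    problems.extend(f"unexpected field: {name}" for name in extra)
    return problems


good = {"feat_id": "train_1", "text": "...", "target": "...", "feat_topic": "vaccines"}
print(check_record(good))
```

Returning a list of problems instead of raising on the first one makes it easier to report every issue in a record at once.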
Dataset Splits
This dataset is split into a train split and a validation split. The split sizes are as follows:
| Split name | Num samples |
|---|---|
| train | 1999 |
| valid | 499 |
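For reference, the share of each split can be worked out directly from the counts in the table. The short Python sketch below does only that arithmetic; the variable names are illustrative.

```python
# Sample counts copied from the table above.
split_sizes = {"train": 1999, "valid": 499}

total = sum(split_sizes.values())
fractions = {name: count / total for name, count in split_sizes.items()}

for name, fraction in fractions.items():
    print(f"{name}: {split_sizes[name]} samples ({fraction:.1%} of {total})")
```

The counts add up to 2498 samples, which works out to roughly an 80/20 division between train and valid.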



