legacy-datasets/banking77
Dataset Overview
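Dataset Description
Dataset Summary
The BANKING77 dataset consists of annotated online banking queries covering 77 fine-grained intent categories. It contains 13,083 customer service queries and targets fine-grained intent detection within a single domain.
Supported Tasks and Leaderboards
- Intent classification
- Intent detection
Languages
English
Dataset Structure
Data Instances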
An example from the training set looks as follows (label 11 corresponds to the "card_arrival" intent):

```json
{
  "label": 11,
  "text": "I am still waiting on my card?"
}
```
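As a minimal sketch of how such a record could be parsed and checked, the snippet below reads the JSON instance shown above and validates its two fields. The `Banking77Example` type and the `parse_example` helper are hypothetical names introduced only for illustration; they are not part of the dataset or of any library.

```python
import json
from dataclasses import dataclass

NUM_INTENTS = 77  # labels run from 0 to 76


@dataclass(frozen=True)
class Banking77Example:
    """Hypothetical container for one record (not part of the dataset)."""
    text: str
    label: int


def parse_example(raw: str) -> Banking77Example:
    """Parse one JSON record and check its fields against the card's schema."""
    record = json.loads(raw)
    text, label = record["text"], record["label"]
    if not isinstance(text, str):
        raise TypeError("text must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(label, bool) or not isinstance(label, int):
        raise TypeError("label must be an integer")
    if not 0 <= label < NUM_INTENTS:
        raise ValueError(f"label {label} is outside 0..{NUM_INTENTS - 1}")
    return Banking77Example(text=text, label=label)


example = parse_example('{"label": 11, "text": "I am still waiting on my card?"}')
print(example)
```

Rejecting out-of-range labels at parse time keeps a malformed record from silently reaching a classifier.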
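Data Fields
- text: a string feature.
- label: a class label (0-76), each value corresponding to a unique intent.
The mapping between labels and intent names is as follows: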
| label | intent (category) |
|---|---|
| 0 | activate_my_card |
| 1 | age_limit |
| 2 | apple_pay_or_google_pay |
| 3 | atm_support |
| 4 | automatic_top_up |
| 5 | balance_not_updated_after_bank_transfer |
| 6 | balance_not_updated_after_cheque_or_cash_deposit |
| 7 | beneficiary_not_allowed |
| 8 | cancel_transfer |
| 9 | card_about_to_expire |
| 10 | card_acceptance |
| 11 | card_arrival |
| 12 | card_delivery_estimate |
| 13 | card_linking |
| 14 | card_not_working |
| 15 | card_payment_fee_charged |
| 16 | card_payment_not_recognised |
| 17 | card_payment_wrong_exchange_rate |
| 18 | card_swallowed |
| 19 | cash_withdrawal_charge |
| 20 | cash_withdrawal_not_recognised |
| 21 | change_pin |
| 22 | compromised_card |
| 23 | contactless_not_working |
| 24 | country_support |
| 25 | declined_card_payment |
| 26 | declined_cash_withdrawal |
| 27 | declined_transfer |
| 28 | direct_debit_payment_not_recognised |
| 29 | disposable_card_limits |
| 30 | edit_personal_details |
| 31 | exchange_charge |
| 32 | exchange_rate |
| 33 | exchange_via_app |
| 34 | extra_charge_on_statement |
| 35 | failed_transfer |
| 36 | fiat_currency_support |
| 37 | get_disposable_virtual_card |
| 38 | get_physical_card |
| 39 | getting_spare_card |
| 40 | getting_virtual_card |
| 41 | lost_or_stolen_card |
| 42 | lost_or_stolen_phone |
| 43 | order_physical_card |
| 44 | passcode_forgotten |
| 45 | pending_card_payment |
| 46 | pending_cash_withdrawal |
| 47 | pending_top_up |
| 48 | pending_transfer |
| 49 | pin_blocked |
| 50 | receiving_money |
| 51 | Refund_not_showing_up |
| 52 | request_refund |
| 53 | reverted_card_payment? |
| 54 | supported_cards_and_currencies |
| 55 | terminate_account |
| 56 | top_up_by_bank_transfer_charge |
| 57 | top_up_by_card_charge |
| 58 | top_up_by_cash_or_cheque |
| 59 | top_up_failed |
| 60 | top_up_limits |
| 61 | top_up_reverted |
| 62 | topping_up_by_card |
| 63 | transaction_charged_twice |
| 64 | transfer_fee_charged |
| 65 | transfer_into_account |
| 66 | transfer_not_received_by_recipient |
| 67 | transfer_timing |
| 68 | unable_to_verify_identity |
| 69 | verify_my_identity |
| 70 | verify_source_of_funds |
| 71 | verify_top_up |
| 72 | virtual_card_not_working |
| 73 | visa_or_mastercard |
| 74 | why_verify_identity |
| 75 | wrong_amount_of_cash_received |
| 76 | wrong_exchange_rate_for_cash_withdrawal |
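The table above can be used as a two-way lookup between integer labels and intent names. The sketch below uses only a four-row excerpt of the table to stay short; a complete mapping would list all 77 rows. The names `ID2LABEL_EXCERPT`, `LABEL2ID_EXCERPT`, and `intent_name` are assumptions made for this example.

```python
# Excerpt of the label table above; a full mapping would contain all 77 rows.
ID2LABEL_EXCERPT = {
    0: "activate_my_card",
    11: "card_arrival",
    51: "Refund_not_showing_up",
    76: "wrong_exchange_rate_for_cash_withdrawal",
}

# Reverse lookup: intent name -> integer label.
LABEL2ID_EXCERPT = {name: idx for idx, name in ID2LABEL_EXCERPT.items()}


def intent_name(label: int) -> str:
    """Return the intent name for a label that is present in the excerpt."""
    try:
        return ID2LABEL_EXCERPT[label]
    except KeyError:
        raise KeyError(f"label {label} is not in this excerpt") from None


print(intent_name(11))
print(LABEL2ID_EXCERPT["card_arrival"])
```

If the data is loaded through the Hugging Face `datasets` library, the `label` feature is usually exposed as a `ClassLabel`, whose `int2str` and `str2int` methods serve the same purpose.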
Data Splits
| Dataset statistics | Train | Test |
|---|---|---|
| Number of examples | 10,003 | 3,080 |
| Average character length | 59.5 | 54.2 |
| Number of intents | 77 | 77 |
| Number of domains | 1 | 1 |
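The split sizes in the table can be cross-checked against the total of 13,083 queries given in the summary. The following sketch does that arithmetic and shows one way an average character length could be computed for a list of texts; `SPLIT_SIZES` and `average_char_length` are illustrative names, and the helper is an assumption about how that statistic is defined (mean number of characters per query).

```python
from statistics import mean

# Split sizes taken from the table above.
SPLIT_SIZES = {"train": 10_003, "test": 3_080}

total = sum(SPLIT_SIZES.values())
fractions = {name: size / total for name, size in SPLIT_SIZES.items()}


def average_char_length(texts):
    """Mean number of characters per query (assumed definition)."""
    return mean(len(t) for t in texts)


for name, size in SPLIT_SIZES.items():
    print(f"{name}: {size} examples ({fractions[name]:.1%} of {total})")
```

The two splits add up to 13,083 examples, which matches the figure in the dataset summary.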
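Dataset Creation
Curation Rationale
BANKING77 was created to fill a gap in existing intent detection datasets by providing a fine-grained dataset for a single domain (banking). Compared with multi-domain datasets, it better captures the complexity found within one domain.
Source Data
Initial Data Collection and Normalization
[More Information Needed]
Who are the source language producers?
[More Information Needed]
Annotations
Annotation process
The dataset does not contain any additional annotations.
Who are the annotators?
[N/A]
Personal and Sensitive Information
[N/A]
Considerations for Using the Data
Social Impact of Dataset
The dataset is intended to support the development of better intent detection systems. Any comprehensive intent detection evaluation should cover both coarse-grained multi-domain datasets and fine-grained single-domain datasets such as BANKING77.
Discussion of Biases
[More Information Needed]
Other Known Limitations
[More Information Needed]
Additional Information
Dataset Curators
Licensing Information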
Creative Commons Attribution 4.0 International
Citation Information
@inproceedings{Casanueva2020,
    author    = {I{\~{n}}igo Casanueva and Tadas Temcinas and Daniela Gerz and Matthew Henderson and Ivan Vulic},
    title     = {Efficient Intent Detection with Dual Sentence Encoders},
    year      = {2020},
    month     = {mar},
    note      = {Data available at https://github.com/PolyAI-LDN/task-specific-datasets},
    url       = {https://arxiv.org/abs/2003.04807},
    booktitle = {Proceedings of the 2nd Workshop on NLP for ConvAI - ACL 2020}
}
Contributions
Thanks to @dkajtoch for adding this dataset.




