
ke07han/Bitext-customer-support-llm-chatbot-training-dataset

Hugging Face · Updated 2026-04-16 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/ke07han/Bitext-customer-support-llm-chatbot-training-dataset
Resource description:
---
license: cdla-sharing-1.0
task_categories:
- question-answering
- table-question-answering
language:
- en
tags:
- question-answering
- llm
- chatbot
- customer-support
- conversional-ai
- generative-ai
- natural-language-understanding
- fine-tuning
- Retail
pretty_name: >-
  Bitext - Customer Service Tagged Training Dataset for LLM-based Virtual
  Assistants
size_categories:
- 10K<n<100K
---

# Bitext - Customer Service Tagged Training Dataset for LLM-based Virtual Assistants

## Overview

This hybrid synthetic dataset is designed for fine-tuning Large Language Models such as GPT, Mistral and OpenELM. It has been generated using our NLP/NLG technology and our automated Data Labeling (DAL) tools. The goal is to demonstrate how Verticalization/Domain Adaptation for the Customer Support sector can be achieved using our two-step approach to LLM Fine-Tuning. For example, if you are [ACME Company], you can create your own customized LLM by first fine-tuning a model on this dataset, and then further fine-tuning it with a small amount of your own data. An overview of this approach can be found at: [From General-Purpose LLMs to Verticalized Enterprise Models](https://www.bitext.com/blog/general-purpose-models-verticalized-enterprise-genai/)

The dataset has the following specs:

- Use Case: Intent Detection
- Vertical: Customer Service
- 27 intents assigned to 10 categories
- 26872 question/answer pairs, around 1000 per intent
- 30 entity/slot types
- 12 different types of language generation tags

The categories and intents have been selected from Bitext's collection of 20 vertical-specific datasets, covering the intents that are common across all 20 verticals. The verticals are:

- Automotive, Retail Banking, Education, Events & Ticketing, Field Services, Healthcare, Hospitality, Insurance, Legal Services, Manufacturing, Media Streaming, Mortgages & Loans, Moving & Storage, Real Estate/Construction, Restaurant & Bar Chains, Retail/E-commerce, Telecommunications, Travel, Utilities, Wealth Management

For a full list of verticals and their intents, see [https://www.bitext.com/chatbot-verticals/](https://www.bitext.com/chatbot-verticals/).

The question/answer pairs have been generated using a hybrid methodology that uses natural texts as source text, NLP technology to extract seeds from these texts, and NLG technology to expand the seed texts. All steps in the process are curated by computational linguists.

## Dataset Token Count

The 'instruction' and 'response' columns together contain a total of 3.57 million tokens after processing and tokenizing the dataset. This volume of text is intended for training LLMs for conversational AI, generative AI, and Question Answering (Q&A) models.
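A token total over the two text columns can be reproduced along the following lines. This is only a minimal sketch: the card does not say which tokenizer produced the 3.57 million figure, so whitespace splitting is used here as a stand-in, and `count_tokens` and the two sample rows are hypothetical names and data invented for illustration, not taken from the dataset.

```python
from typing import Callable, Iterable, List, Mapping, Sequence


def count_tokens(
    rows: Iterable[Mapping[str, str]],
    tokenize: Callable[[str], List[str]] = str.split,  # assumption: whitespace tokens
    columns: Sequence[str] = ("instruction", "response"),
) -> int:
    """Sum the number of tokens in the given text columns over all rows."""
    return sum(len(tokenize(row[col])) for row in rows for col in columns)


# Made-up rows that follow the card's field names; the responses are invented.
sample_rows = [
    {"instruction": "can u cancel my order",
     "response": "Sure, I can help you cancel your order."},
    {"instruction": "activate my SIM card",
     "response": "Of course, let me walk you through it."},
]

total = count_tokens(sample_rows)
instructions_only = count_tokens(sample_rows, columns=("instruction",))
print(total, instructions_only)
```

Passing a model-specific tokenizer function as `tokenize` would give counts comparable to that model's context budget; the whitespace default will generally undercount relative to subword tokenizers.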
## Fields of the Dataset

Each entry in the dataset contains the following fields:

- flags: tags (explained below in the Language Generation Tags section)
- instruction: a user request from the Customer Service domain
- category: the high-level semantic category for the intent
- intent: the intent corresponding to the user instruction
- response: an example expected response from the virtual assistant

## Categories and Intents

The categories and intents covered by the dataset are:

- ACCOUNT: create_account, delete_account, edit_account, switch_account
- CANCELLATION_FEE: check_cancellation_fee
- DELIVERY: delivery_options
- FEEDBACK: complaint, review
- INVOICE: check_invoice, get_invoice
- NEWSLETTER: newsletter_subscription
- ORDER: cancel_order, change_order, place_order
- PAYMENT: check_payment_methods, payment_issue
- REFUND: check_refund_policy, track_refund
- SHIPPING_ADDRESS: change_shipping_address, set_up_shipping_address

## Entities

The entities covered by the dataset are:

- {{Order Number}}, typically present in:
  - Intents: cancel_order, change_order, change_shipping_address, check_invoice, check_refund_policy, complaint, delivery_options, delivery_period, get_invoice, get_refund, place_order, track_order, track_refund
- {{Invoice Number}}, typically present in:
  - Intents: check_invoice, get_invoice
- {{Online Order Interaction}}, typically present in:
  - Intents: cancel_order, change_order, check_refund_policy, delivery_period, get_refund, review, track_order, track_refund
- {{Online Payment Interaction}}, typically present in:
  - Intents: cancel_order, check_payment_methods
- {{Online Navigation Step}}, typically present in:
  - Intents: complaint, delivery_options
- {{Online Customer Support Channel}}, typically present in:
  - Intents: check_refund_policy, complaint, contact_human_agent, delete_account, delivery_options, edit_account, get_refund, payment_issue, registration_problems, switch_account
- {{Profile}}, typically present in:
  - Intent: switch_account
- {{Profile Type}}, typically present in:
  - Intent: switch_account
- {{Settings}}, typically present in:
  - Intents: cancel_order, change_order, change_shipping_address, check_cancellation_fee, check_invoice, check_payment_methods, contact_human_agent, delete_account, delivery_options, edit_account, get_invoice, newsletter_subscription, payment_issue, place_order, recover_password, registration_problems, set_up_shipping_address, switch_account, track_order, track_refund
- {{Online Company Portal Info}}, typically present in:
  - Intents: cancel_order, edit_account
- {{Date}}, typically present in:
  - Intents: check_invoice, check_refund_policy, get_refund, track_order, track_refund
- {{Date Range}}, typically present in:
  - Intents: check_cancellation_fee, check_invoice, get_invoice
- {{Shipping Cut-off Time}}, typically present in:
  - Intent: delivery_options
- {{Delivery City}}, typically present in:
  - Intent: delivery_options
- {{Delivery Country}}, typically present in:
  - Intents: check_payment_methods, check_refund_policy, delivery_options, review, switch_account
- {{Salutation}}, typically present in:
  - Intents: cancel_order, check_payment_methods, check_refund_policy, create_account, delete_account, delivery_options, get_refund, recover_password, review, set_up_shipping_address, switch_account, track_refund
- {{Client First Name}}, typically present in:
  - Intents: check_invoice, get_invoice
- {{Client Last Name}}, typically present in:
  - Intents: check_invoice, create_account, get_invoice
- {{Customer Support Phone Number}}, typically present in:
  - Intents: change_shipping_address, contact_customer_service, contact_human_agent, payment_issue
- {{Customer Support Email}}, typically present in:
  - Intents: cancel_order, change_shipping_address, check_invoice, check_refund_policy, complaint, contact_customer_service, contact_human_agent, get_invoice, get_refund, newsletter_subscription, payment_issue, recover_password, registration_problems, review, set_up_shipping_address, switch_account
- {{Live Chat Support}}, typically present in:
  - Intents: check_refund_policy, complaint, contact_human_agent, delete_account, delivery_options, edit_account, get_refund, payment_issue, recover_password, registration_problems, review, set_up_shipping_address, switch_account, track_order
- {{Website URL}}, typically present in:
  - Intents: check_payment_methods, check_refund_policy, complaint, contact_customer_service, contact_human_agent, create_account, delete_account, delivery_options, get_refund, newsletter_subscription, payment_issue, place_order, recover_password, registration_problems, review, switch_account
- {{Upgrade Account}}, typically present in:
  - Intents: create_account, edit_account, switch_account
- {{Account Type}}, typically present in:
  - Intents: cancel_order, change_order, change_shipping_address, check_cancellation_fee, check_invoice, check_payment_methods, check_refund_policy, complaint, contact_customer_service, contact_human_agent, create_account, delete_account, delivery_options, delivery_period, edit_account, get_invoice, get_refund, newsletter_subscription, payment_issue, place_order, recover_password, registration_problems, review, set_up_shipping_address, switch_account, track_order, track_refund
- {{Account Category}}, typically present in:
  - Intents: cancel_order, change_order, change_shipping_address, check_cancellation_fee, check_invoice, check_payment_methods, check_refund_policy, complaint, contact_customer_service, contact_human_agent, create_account, delete_account, delivery_options, delivery_period, edit_account, get_invoice, get_refund, newsletter_subscription, payment_issue, place_order, recover_password, registration_problems, review, set_up_shipping_address, switch_account, track_order, track_refund
- {{Account Change}}, typically present in:
  - Intent: switch_account
- {{Program}}, typically present in:
  - Intent: place_order
- {{Refund Amount}}, typically present in:
  - Intent: track_refund
- {{Money Amount}}, typically present in:
  - Intents: check_refund_policy, complaint, get_refund, track_refund
- {{Store Location}}, typically present in:
  - Intents: complaint, delivery_options, place_order

## Language Generation Tags

The dataset contains tags that reflect how language varies across different linguistic phenomena, such as colloquial or offensive language. For example, if an utterance for intent "cancel_order" carries the "COLLOQUIAL" tag, the utterance expresses an informal language variation like "can u cancel my order".

These tags indicate the type of language variation that each entry expresses. Because they are attached to every entry, Conversational Designers can use them to customize training datasets for user profiles with different uses of language. Through these tags, many different datasets can be created to make the resulting assistant more accurate and robust. A bot that sells sneakers should mainly target a younger population that uses more colloquial language, while a classical retail banking bot should be able to handle more formal or polite language.

The dataset also reflects linguistic phenomena that commonly occur in real-life virtual assistants, such as spelling mistakes, run-on words and punctuation errors.

The dataset contains tagging for all relevant linguistic phenomena that can be used to customize the dataset for different user profiles.
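The customization described above can be sketched as a simple filter over the `flags` field, using the single-letter codes from this card's tag lists (Q for colloquial, P for polite, W for offensive, Z for typos). The sketch assumes that `flags` holds those letters concatenated into one string, which the card does not state explicitly; `select_by_flags` is a hypothetical helper, and the sample rows are made up for illustration (only some utterances are quoted from the card).

```python
from typing import Dict, Iterable, List


def select_by_flags(
    rows: Iterable[Dict[str, str]], required: str = "", excluded: str = ""
) -> List[Dict[str, str]]:
    """Keep rows whose flags contain every required letter and no excluded letter."""
    need, ban = set(required), set(excluded)
    return [
        row for row in rows
        if need <= set(row["flags"]) and not (ban & set(row["flags"]))
    ]


# Illustrative rows only; flag strings such as "BQZ" are an assumed encoding.
sample_rows = [
    {"flags": "BQ", "instruction": "can u cancel my order", "intent": "cancel_order"},
    {"flags": "BIP", "instruction": "could you help me cancel my order, please?",
     "intent": "cancel_order"},
    {"flags": "BQZ", "instruction": "can u cancle my order", "intent": "cancel_order"},
    {"flags": "BW", "instruction": "I want to talk to a f*&%*g agent",
     "intent": "contact_human_agent"},
]

colloquial = select_by_flags(sample_rows, required="Q")              # sneaker-bot style
colloquial_clean = select_by_flags(sample_rows, required="Q", excluded="Z")
formal = select_by_flags(sample_rows, required="P")                  # banking-bot style
no_offensive = select_by_flags(sample_rows, excluded="W")

print(len(colloquial), len(colloquial_clean), len(formal), len(no_offensive))
```

Treating the flags as a set of letters keeps the filter independent of letter order, so "BQZ" and "ZQB" would select the same rows.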
### Tags for Lexical variation

M - Morphological variation: inflectional and derivational
"is my SIM card active", "is my SIM card activated"

L - Semantic variations: synonyms, use of hyphens, compounding…
"what's my billing date", "what's my anniversary date"

### Tags for Syntactic structure variation

B - Basic syntactic structure:
"activate my SIM card", "I need to activate my SIM card"

I - Interrogative structure
"can you activate my SIM card?", "how do I activate my SIM card?"

C - Coordinated syntactic structure
"I have a new SIM card, what do I need to do to activate it?"

N - Negation
"I do not want this item, where to cancel my order?"

### Tags for language register variations

P - Politeness variation
"could you help me activate my SIM card, please?"

Q - Colloquial variation
"can u activ8 my SIM?"

W - Offensive language
"I want to talk to a f*&%*g agent"

### Tags for stylistic variations

K - Keyword mode
"activate SIM", "new SIM"

E - Use of abbreviations:
"I'm / I am interested in getting a new SIM"

Z - Errors and Typos: spelling issues, wrong punctuation…
"how can i activaet my card"

### Other tags not in use in this Dataset

D - Indirect speech
"ask my agent to activate my SIM card"

G - Regional variations
US English vs UK English: "truck" vs "lorry"
France French vs Canadian French: "tchatter" vs "clavarder"

R - Respect structures - Language-dependent variations
English: "may" vs "can…"
French: "tu" vs "vous..."
Spanish: "tú" vs "usted..."

Y - Code switching
"activer ma SIM card"

---

(c) Bitext Innovations, 2024

Provider: ke07han