FVnC and Sample Datasets
arXiv | Updated 2022-01-03 | Indexed 2024-06-21
Download link:
https://github.com/trieuntu/conversation_clustering
Resource description:
The FVnC and Sample datasets contain Vietnamese conversations collected from Facebook Messenger pages by a research team at the Faculty of Information Technology, Nha Trang University. FVnC contains 44,846 preprocessed text sentences; Sample is a random subset of 95 samples drawn from FVnC. The datasets are intended for chatbot training, specifically for extracting features with BERT models and clustering them to improve a chatbot's dialogue understanding and responses. Their main application areas are natural language processing (NLP) and AI chatbot development, where the analysis of real conversational data is meant to improve a chatbot's interaction and problem-solving abilities.
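The clustering step mentioned in the description can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes sentence embeddings have already been extracted (for example with a BERT model), uses plain k-means as a stand-in because the description does not name the clustering algorithm, and the `kmeans` helper and the toy `embeddings` values are hypothetical.

```python
import math
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means over equal-length float vectors.

    Returns one cluster label (0..k-1) per input vector.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical stand-in embeddings; real BERT sentence vectors
# would have hundreds of dimensions and come from the dataset text.
embeddings = [
    [0.9, 0.1, 0.0], [1.0, 0.0, 0.1], [0.8, 0.2, 0.1],
    [0.0, 0.9, 1.0], [0.1, 1.0, 0.9], [0.2, 0.8, 1.0],
]
labels = kmeans(embeddings, k=2)
print(labels)
```

In this toy input the first three vectors and the last three vectors form two well-separated groups, so each group should end up sharing one label. With real data, the number of clusters `k` would have to be chosen for the dataset at hand.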
Provided by:
Faculty of Information Technology, Nha Trang University
Created:
2021-12-31



