Nexdata/Traditional_Chinese_Oral_Message_Data
Source: Hugging Face. Updated 2024-04-17; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/Nexdata/Traditional_Chinese_Oral_Message_Data
Resource overview:
---
task_categories:
- conversational
language:
- zh
---
# Dataset Card for Nexdata/Traditional_Chinese_Oral_Message_Data
## Description
A Traditional Chinese SMS corpus of 10 million messages, consisting of real, colloquial Traditional Chinese text. The data set contains only the message text, stored in txt format, and can be used for natural language understanding and related tasks.
For more details, please refer to the link: https://www.nexdata.ai/datasets/182?source=Huggingface
# Specifications
## Data content
Traditional Chinese SMS corpus text data
## Data size
10 million messages
## Collecting period
2014
## Storage format
txt
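
Since the corpus is delivered as plain txt, reading it could look roughly like the sketch below. The card does not describe the file layout, so the one-message-per-line structure, the UTF-8 encoding, and the helper names `iter_messages` and `summarize` are all assumptions made for illustration, not part of the Nexdata delivery.

```python
from typing import Iterator, Tuple


def iter_messages(path: str, encoding: str = "utf-8") -> Iterator[str]:
    """Yield non-empty messages from a text file.

    Assumption: one message per line, UTF-8 encoded. The dataset card
    only states that the content is stored in txt format.
    """
    with open(path, encoding=encoding) as handle:
        for line in handle:
            message = line.strip()
            if message:  # skip blank lines
                yield message


def summarize(path: str) -> Tuple[int, int]:
    """Return (message count, total character count) for the file."""
    count = 0
    total_chars = 0
    for message in iter_messages(path):
        count += 1
        total_chars += len(message)
    return count, total_chars


# Hypothetical usage; "messages.txt" is a placeholder file name:
#   count, total_chars = summarize("messages.txt")
#   print(count, total_chars / max(count, 1))
```

Streaming the file line by line keeps memory use flat, which matters for a corpus of this size. If the delivered files use a different encoding or record separator, the `encoding` argument and the line-splitting logic would need to change accordingly.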
## Language
Chinese
# Licensing Information
Commercial License
Provider:
Nexdata



