Social Communication Database

Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/wf5d5b2j52
Description:
The dataset was generated from real-world text communication in group conversations. We therefore maintain the anonymity of the usernames and mobile numbers of the participants. This dataset can be used for education and research purposes only.

The main contribution of this research to text mining is a standard dataset for research on mining chat conversations. We have observed that this dataset is dense enough to be highly useful for research. Possible applications include semantic search, sentiment analysis, semantic clustering of conversations, topic extraction, and spam detection. We offer this dataset so that others can collaborate and research further possibilities.

We used our own algorithms to extract the textual information from WhatsApp logs and stored it in a SQLite database file named "social conversation.db". The dataset contains 16225 text messages from 839 distinct users, extracted from 17 WhatsApp groups.

Paper: Analysis of foul language usage in social media text conversation
Authors: Sumit Kawate and Kailas Patil
In the Proceedings of the Int. J. Social Media and Interactive Learning Environments (IJSMILE), Vol: 05, Issue: 03, Pages: 227-251, Inderscience, 201
DOI: https://doi.org/10.1504/IJSMILE.2017.087976

The data is stored in a .zip compressed archive. The uncompressed archive is 6,020 KB (5.87 MB). Extract it with any standard decompression software.

## The archive contains the following items: ##

DATABASE/
|
+ Executable/                           Directory containing executable files.
| |
| + Social Conversation.db              This file contains records of the database in .db format.
|
+ Source Code/                          Directory containing source code files.
| |
| + Social Conversation (csv).csv       This file contains records of the database in .csv format.
| + Social Conversation (db).db         This file contains records of the database in .db format.
| + Social Conversation (html).html     This file contains records of the database in .html format.
|
+ Read Me/                              Directory containing read me file.
| |
| + read me.txt                         This file contains detailed information about the dataset.

## The data format of the dataset is: ##

=Table Name= -> CONVERSATION

=Attributes=      =Meaning=
USER_ID           User id of the text message
TEXT_MSG          Actual text message
CONTACT_NUMBER    Contact number of the user (we have masked a few digits of the contact number)
DATE              Date of the text message
TIME              Time of the text message

=Attributes=      =Format=
USER_ID           User Id
TEXT_MSG          (text message in any format)
CONTACT_NUMBER    +contactnumber
DATE              dd/mm/yy
TIME              hh:mm AM or hh:mm PM

=Attributes=      =Sample Example=
USER_ID           User 514
TEXT_MSG          Any deal on formal shoes with prime
CONTACT_NUMBER    +919xxxxx927
DATE              04/12/17
TIME              1:35 AM
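The CONVERSATION table described above can be queried with Python's standard `sqlite3` module. The following is a minimal sketch, not the authors' own tooling: it builds an in-memory table holding only the sample row from the description so that it runs without the archive. The column types (all `TEXT`) and the helper names `build_demo_db`, `parse_timestamp`, and `messages_per_user` are assumptions made for illustration. To work with the real data, open the extracted .db file with `sqlite3.connect(...)` instead of calling `build_demo_db()`.

```python
import sqlite3
from collections import Counter
from datetime import datetime


def build_demo_db():
    """Create an in-memory stand-in for the dataset (hypothetical helper).

    Column names follow the README; the TEXT column types are an assumption.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE CONVERSATION ('
        'USER_ID TEXT, TEXT_MSG TEXT, CONTACT_NUMBER TEXT, '
        '"DATE" TEXT, "TIME" TEXT)'
    )
    # The sample row given in the dataset description.
    conn.execute(
        "INSERT INTO CONVERSATION VALUES (?, ?, ?, ?, ?)",
        ("User 514", "Any deal on formal shoes with prime",
         "+919xxxxx927", "04/12/17", "1:35 AM"),
    )
    return conn


def parse_timestamp(date_str, time_str):
    """Combine DATE (dd/mm/yy) and TIME (hh:mm AM/PM) into one datetime."""
    return datetime.strptime(f"{date_str} {time_str}", "%d/%m/%y %I:%M %p")


def messages_per_user(conn):
    """Count messages per USER_ID in the CONVERSATION table."""
    rows = conn.execute("SELECT USER_ID FROM CONVERSATION")
    return Counter(user_id for (user_id,) in rows)


if __name__ == "__main__":
    conn = build_demo_db()
    for user_id, text, date_str, time_str in conn.execute(
        'SELECT USER_ID, TEXT_MSG, "DATE", "TIME" FROM CONVERSATION'
    ):
        print(user_id, parse_timestamp(date_str, time_str).isoformat(), text)
    print(messages_per_user(conn).most_common(5))
```

Parsing DATE and TIME into a single `datetime` makes it straightforward to sort or bucket messages by time. The AM/PM parsing with `%p` assumes an English (or C) locale.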
Created:
2019-12-09