Social Communication Database
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/wf5d5b2j52
Description:
The dataset was generated from real-world text communication in group conversations. Please maintain the anonymity of the usernames and mobile numbers of those who participated. This dataset may be used for education and research purposes only.
The main contribution of this research in text mining is to establish a standard dataset for research
on mining chat conversations. We have observed that this dataset has considerable potential
for research use. In addition to our own applications based on it, the dataset can be used for semantic search,
sentiment analysis, semantic clustering of conversations, topic extraction, spam detection, etc. We offer
this dataset so that others can collaborate and research further possibilities.
We used our algorithms to extract the textual information from WhatsApp logs
and stored it in an SQLite database file named "social conversation.db".
This dataset contains 16,225 text messages and 839 distinct users.
We considered 17 WhatsApp groups when extracting the textual information.
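The message and user counts can be checked with a short query. The following is a minimal sketch, not the authors' code: it assumes the table name CONVERSATION and the USER_ID column listed in the data format section of this description, and it assumes all columns are stored as text. The demo rows are hypothetical placeholders apart from the documented sample record.

```python
import sqlite3


def summarize(conn):
    """Return (message_count, distinct_user_count) for the CONVERSATION table."""
    cur = conn.execute(
        "SELECT COUNT(*), COUNT(DISTINCT USER_ID) FROM CONVERSATION"
    )
    return cur.fetchone()


def build_demo_db():
    """Build an in-memory stand-in for the real file (column types are assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CONVERSATION ("
        "USER_ID TEXT, TEXT_MSG TEXT, CONTACT_NUMBER TEXT, DATE TEXT, TIME TEXT)"
    )
    rows = [
        # First row is the sample record from the description; the rest are made up.
        ("User 514", "Any deal on formal shoes with prime", "+919xxxxx927", "04/12/17", "1:35 AM"),
        ("User 514", "placeholder message", "+919xxxxx927", "04/12/17", "1:40 AM"),
        ("User 7", "placeholder reply", "+919xxxxx000", "04/12/17", "1:42 AM"),
    ]
    conn.executemany("INSERT INTO CONVERSATION VALUES (?, ?, ?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    messages, users = summarize(demo)
    print(f"{messages} messages from {users} distinct users")
```

To run the same query against the extracted database, pass `sqlite3.connect(path_to_db_file)` to `summarize` instead of the demo connection.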
Paper: Analysis of foul language usage in social media text conversation
Authors: Sumit Kawate and Kailas Patil
In the Proceedings of the Int. J. Social Media and Interactive Learning Environments (IJSMILE), Vol: 05, Issue: 03, Pages: 227-251, Inderscience, 201
DOI: https://doi.org/10.1504/IJSMILE.2017.087976
The data is stored in a .zip compressed archive. The uncompressed archive is
6,020 KB (5.87 MB). Extract it with any standard decompression software.
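For scripted workflows, the archive can also be unpacked with Python's standard library. This is only a sketch: the archive file name is not given in this description, so `zip_path` and `dest_dir` are placeholders supplied by the caller.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract every member of the archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

The returned list of member names can be compared against the archive contents listed below.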
## The archive contains the following items: ##
DATABASE/
|
+ Executable/                              Directory containing executable files.
|   |
|   + Social Conversation.db               This file contains records of the database in .db format.
|
+ Source Code/                             Directory containing source code files.
|   |
|   + Social Conversation (csv).csv        This file contains records of the database in .csv format.
|   + Social Conversation (db).db          This file contains records of the database in .db format.
|   + Social Conversation (html).html      This file contains records of the database in .html format.
|
+ Read Me/                                 Directory containing the read me file.
    |
    + read me.txt                          This file contains detailed information about the dataset.
## The data format of the dataset is: ##
=Table Name= -> CONVERSATION

=Attributes=      =Meaning=
USER_ID           User ID of the text message
TEXT_MSG          Actual text message
CONTACT_NUMBER    Contact number of the user (a few digits of the contact number are masked)
DATE              Date of the text message
TIME              Time of the text message

=Attributes=      =Format=
USER_ID           User Id
TEXT_MSG          (text message in any format)
CONTACT_NUMBER    +contactnumber
DATE              dd/mm/yy
TIME              hh:mm AM or hh:mm PM

=Attributes=      =Sample Example=
USER_ID           User 514
TEXT_MSG          Any deal on formal shoes with prime
CONTACT_NUMBER    +919xxxxx927
DATE              04/12/17
TIME              1:35 AM
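Because DATE and TIME are stored as separate strings, time-based analysis usually needs them combined into one timestamp. The sketch below assumes the dd/mm/yy and 12-hour AM/PM formats stated above; `parse_timestamp` is a hypothetical helper name, not part of the dataset's source code. Parsing of the AM/PM marker assumes an English or C locale.

```python
from datetime import datetime


def parse_timestamp(date_str, time_str):
    """Combine a dd/mm/yy date and an 'h:mm AM/PM' time into a datetime."""
    # %d/%m/%y reads day first; %I is the 12-hour clock, %p the AM/PM marker.
    return datetime.strptime(f"{date_str} {time_str}", "%d/%m/%y %I:%M %p")


if __name__ == "__main__":
    # Sample record from the description above.
    print(parse_timestamp("04/12/17", "1:35 AM").isoformat())
```

Under the stated day-first format, the sample record is read as 4 December 2017, not 12 April.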
--------------------------------------------------
Created:
2019-12-09



