SINAI/SA-Corpus
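Dataset overview
Name: SINAI-SA Corpus
Created: December 2008
Creator: SINAI group
Source: nearly 2,000 reviews (1,943 in total) of different cameras, extracted from Amazon
Structure: the dataset contains 5 directories, one per star rating (1 to 5 stars). Each directory holds plain-text files, one file per review.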
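The layout described above (one directory per star rating, one plain-text file per review) could be read with a short loader such as the sketch below. The directory names `"1"` to `"5"`, the UTF-8 encoding, and the function name `load_reviews` are assumptions made for illustration; they are not stated in the dataset description, so check them against the actual files before use.

```python
from pathlib import Path

# Hypothetical directory names: the dataset description only says there is
# one directory per star rating, not what the directories are called.
STAR_DIRS = {1: "1", 2: "2", 3: "3", 4: "4", 5: "5"}


def load_reviews(root, star_dirs=STAR_DIRS, encoding="utf-8"):
    """Return a list of (stars, text) pairs, one pair per review file."""
    reviews = []
    for stars, dirname in sorted(star_dirs.items()):
        folder = Path(root) / dirname
        if not folder.is_dir():
            continue  # skip rating directories that are not present
        for path in sorted(folder.iterdir()):
            if path.is_file():
                # errors="replace" keeps loading even if the assumed
                # encoding does not match a particular file
                text = path.read_text(encoding=encoding, errors="replace")
                reviews.append((stars, text))
    return reviews
```

If the directories turn out to be named differently, pass a different mapping as `star_dirs` rather than editing the function.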
Number of reviews by rating:
- 1 star: 78
- 2 stars: 67
- 3 stars: 97
- 4 stars: 411
- 5 stars: 1,290
- Total: 1,943
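The per-rating counts are heavily skewed toward 5-star reviews, which matters when splitting or evaluating a classifier on this corpus. A minimal sketch that recomputes the total and each rating's share from the figures listed above (the variable names are illustrative only):

```python
# Review counts per star rating, copied from the list above.
star_counts = {1: 78, 2: 67, 3: 97, 4: 411, 5: 1290}

total = sum(star_counts.values())
shares = {stars: n / total for stars, n in star_counts.items()}

for stars, n in star_counts.items():
    print(f"{stars} star: {n:>5} reviews ({shares[stars]:.1%})")
print(f"total : {total:>5} reviews")
```

About two thirds of the reviews carry 5 stars, so a stratified split is a reasonable default when sampling train and test sets.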
Camera models covered and number of reviews:
- CanonA590IS: 400
- CanonA630: 300
- CanonSD1100IS: 426
- KodakCx7430: 64
- KodakV1003: 95
- KodakZ740: 155
- Nikon5700: 119
- Olympus1030SW: 168
- PentaxK10D: 126
- PentaxK200D: 90
- Total: 1,943
License: Apache-2.0
Citation (BibTeX):
@article{RUSHDISALEH201114799,
  title = {Experiments with SVM to classify opinions in different domains},
  journal = {Expert Systems with Applications},
  volume = {38},
  number = {12},
  pages = {14799-14804},
  year = {2011},
  issn = {0957-4174},
  doi = {https://doi.org/10.1016/j.eswa.2011.05.070},
  url = {https://www.sciencedirect.com/science/article/pii/S0957417411008542},
  author = {M. {Rushdi Saleh} and M.T. Martín-Valdivia and A. Montejo-Ráez and L.A. Ureña-López},
  keywords = {Opinion mining, Machine learning, SVM, Corpora},
  abstract = {Recently, opinion mining is receiving more attention due to the abundance of forums, blogs, e-commerce web sites, news reports and additional web sources where people tend to express their opinions. Opinion mining is the task of identifying whether the opinion expressed in a document is positive or negative about a given topic. In this paper we explore this new research area applying Support Vector Machines (SVM) for testing different domains of data sets and using several weighting schemes. We have accomplished experiments with different features on three corpora. Two of them have already been used in several works. The last one has been built from Amazon.com specifically for this paper in order to prove the feasibility of the SVM for different domains.}
}



