
SINAI/SA-Corpus

Source: Hugging Face. Updated: 2024-03-22. Indexed: 2024-06-11.
Download link:
https://hf-mirror.com/datasets/SINAI/SA-Corpus
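Resource summary:
The SINAI-SA corpus was created by the SINAI group in December 2008 from reviews collected on the Amazon website. It contains nearly 2,000 reviews of different cameras. The reviews are split by star rating into five directories, each holding the review files for that rating. The counts are: 1 star, 78 reviews; 2 stars, 67; 3 stars, 97; 4 stars, 411; 5 stars, 1,290; 1,943 reviews in total. The camera brands covered are Canon, Kodak, Nikon, Olympus, and Pentax.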

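Provider:
SINAI
Summary of original information

Dataset overview

Name: SINAI-SA Corpus

Created: December 2008

Creator: SINAI group

Data source: nearly 2,000 reviews of different cameras, extracted from the Amazon website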

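Data structure: the dataset contains 5 directories, one per star rating (1 star to 5 stars). Each directory contains plain-text files, and each file holds a single review.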
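
The directory layout described above can be read with a short loader. This is a minimal sketch, not the maintainers' tooling: the function name `load_reviews`, the directory names "1" to "5", and the UTF-8 encoding are all assumptions, since the page does not state how the directories or files are actually named or encoded.

```python
from pathlib import Path


def load_reviews(root, star_dirs=("1", "2", "3", "4", "5")):
    """Return a list of (stars, text) pairs, one pair per review file.

    `star_dirs` lists the directory names for 1 to 5 stars in order.
    The default names are hypothetical; adjust them to the real layout.
    """
    reviews = []
    for stars, name in enumerate(star_dirs, start=1):
        folder = Path(root) / name
        if not folder.is_dir():
            continue  # tolerate a missing rating directory
        for path in sorted(folder.iterdir()):
            if path.is_file():
                # errors="replace" keeps the loader from failing on
                # files that turn out not to be valid UTF-8.
                text = path.read_text(encoding="utf-8", errors="replace")
                reviews.append((stars, text))
    return reviews
```

Calling `load_reviews("path/to/corpus")` on a local copy would yield one `(stars, text)` pair per file, which is a convenient shape for building a labeled training set.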

Review counts:

  • 1 star: 78
  • 2 stars: 67
  • 3 stars: 97
  • 4 stars: 411
  • 5 stars: 1,290

Total: 1,943

Camera models and review counts:

  • CanonA590IS: 400
  • CanonA630: 300
  • CanonSD1100IS: 426
  • KodakCx7430: 64
  • KodakV1003: 95
  • KodakZ740: 155
  • Nikon5700: 119
  • Olympus1030SW: 168
  • PentaxK10D: 126
  • PentaxK200D: 90

Total: 1,943
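
The per-model counts can be rolled up by brand and checked against the stated total. This is a small sketch: taking the brand to be the leading capitalized word of each model identifier is an assumption that happens to fit the ten identifiers listed, not a documented naming rule.

```python
import re
from collections import Counter

# Review counts per camera model, copied from the list above.
model_counts = {
    "CanonA590IS": 400,
    "CanonA630": 300,
    "CanonSD1100IS": 426,
    "KodakCx7430": 64,
    "KodakV1003": 95,
    "KodakZ740": 155,
    "Nikon5700": 119,
    "Olympus1030SW": 168,
    "PentaxK10D": 126,
    "PentaxK200D": 90,
}

brand_counts = Counter()
for model, n in model_counts.items():
    # Assumed convention: the brand is the leading capitalized word,
    # e.g. "Canon" in "CanonA590IS".
    brand = re.match(r"[A-Z][a-z]+", model).group()
    brand_counts[brand] += n

for brand, n in brand_counts.most_common():
    print(f"{brand}: {n}")

# The per-model counts should add up to the corpus total.
assert sum(model_counts.values()) == 1943
```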

License: Apache-2.0

Citation (BibTeX):

@article{RUSHDISALEH201114799,
  title = {Experiments with SVM to classify opinions in different domains},
  journal = {Expert Systems with Applications},
  volume = {38},
  number = {12},
  pages = {14799-14804},
  year = {2011},
  issn = {0957-4174},
  doi = {https://doi.org/10.1016/j.eswa.2011.05.070},
  url = {https://www.sciencedirect.com/science/article/pii/S0957417411008542},
  author = {M. {Rushdi Saleh} and M.T. Martín-Valdivia and A. Montejo-Ráez and L.A. Ureña-López},
  keywords = {Opinion mining, Machine learning, SVM, Corpora},
  abstract = {Recently, opinion mining is receiving more attention due to the abundance of forums, blogs, e-commerce web sites, news reports and additional web sources where people tend to express their opinions. Opinion mining is the task of identifying whether the opinion expressed in a document is positive or negative about a given topic. In this paper we explore this new research area applying Support Vector Machines (SVM) for testing different domains of data sets and using several weighting schemes. We have accomplished experiments with different features on three corpora. Two of them have already been used in several works. The last one has been built from Amazon.com specifically for this paper in order to prove the feasibility of the SVM for different domains.}
}
