
Corpus CSV

Source: Figshare. Updated 2021-10-15; indexed 2026-04-08.
Download link:
https://figshare.com/articles/dataset/Corpus_CSV/16745986/1
Description:
This file describes the corpus in CSV format, using the pipe character (|) as the separator. The file includes the following columns:
- en: the English words that compose the sentence;
- pt_br: the Portuguese words that compose the sentence;
- type: the type of the sentence (OBJ for objective, SUBJ for subjective);
- pol: the polarity of the sentence if it is subjective (-1, 0 or 1);
- en_path: the path in OpenSubtitles related to the English sentence;
- pt_br_path: the path in OpenSubtitles related to the Portuguese sentence.
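
The column layout described above can be read with Python's standard csv module. The following is a minimal sketch, not the dataset authors' own tooling. It assumes that the file starts with a header row using the six column names listed above, and that pol is an integer string that is only meaningful for SUBJ rows. The sample rows and their paths are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import Counter


def read_corpus(handle):
    """Yield one dict per sentence pair from a pipe-separated file object."""
    # Assumption: the first line is a header naming the columns
    # en, pt_br, type, pol, en_path, pt_br_path.
    reader = csv.DictReader(handle, delimiter="|")
    for row in reader:
        # Assumption: pol is only meaningful for SUBJ rows; use None otherwise.
        pol = (row.get("pol") or "").strip()
        row["pol"] = int(pol) if row["type"] == "SUBJ" and pol else None
        yield row


# Hypothetical sample rows for illustration only; not real dataset content.
SAMPLE = io.StringIO(
    "en|pt_br|type|pol|en_path|pt_br_path\n"
    "I love this place .|Eu amo este lugar .|SUBJ|1|en/a.xml|pt_br/a.xml\n"
    "The door is open .|A porta está aberta .|OBJ||en/b.xml|pt_br/b.xml\n"
    "This is awful .|Isso é horrível .|SUBJ|-1|en/c.xml|pt_br/c.xml\n"
)

rows = list(read_corpus(SAMPLE))

# Count sentences per type and collect polarities of subjective sentences.
type_counts = Counter(row["type"] for row in rows)
polarities = [row["pol"] for row in rows if row["type"] == "SUBJ"]

print(type_counts)
print(polarities)
```

To read the downloaded file instead of the sample, pass a handle opened with `open(path, encoding="utf-8", newline="")` to `read_corpus`, where `path` is wherever the file was saved locally. If the sentences contain unescaped double quotes, passing `quoting=csv.QUOTE_NONE` to `csv.DictReader` may be needed; whether that applies to this file is not stated in the description.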

Provided by:
Friedrich Kuwaki, Vinicius Takeo
Created:
2021-10-05