
datasets-CNRS/oral-arguments-scotus

Source: Hugging Face. Updated 2025-04-28; indexed 2025-11-01.
Download link:
https://hf-mirror.com/datasets/datasets-CNRS/oral-arguments-scotus
Description:
This corpus contains transcripts of 112 oral argument sessions. Each session corresponds to one case and generally consists of two oral arguments, sometimes followed by a rebuttal argument. The transcripts span from Stokeling v. United States (October 9, 2018) to June Medical Services L.L.C. v. Russo (March 4, 2020). The corpus totals 1,425,905 tokens. It is available in three forms: PDF (the source files); XML (segmented into files, sections, and turns of speech, with speaker annotation); and TXM and XML-TXM (segmented into words, part-of-speech tagged with TreeTagger, and divided into files, sections, and turns of speech, where one section corresponds to one oral argument).
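As a rough illustration of how the XML version's turn-of-speech and speaker annotation might be used, the sketch below counts whitespace-separated tokens per speaker. The element and attribute names (`text`, `section`, `turn`, `speaker`), the sample content, and the function name `tokens_per_speaker` are all hypothetical; the corpus's actual schema is not described here and should be checked against the files themselves. Whitespace splitting is also only an approximation of the TreeTagger tokenization used in the TXM version.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: the element names, attribute names, and speaker labels
# are assumptions made for illustration, not the corpus's real schema.
SAMPLE = """<text id="example-case">
  <section type="oral_argument">
    <turn speaker="JUSTICE A">We will hear argument this morning.</turn>
    <turn speaker="COUNSEL B">Thank you, and good morning.</turn>
    <turn speaker="JUSTICE A">Counsel, please continue.</turn>
  </section>
</text>"""


def tokens_per_speaker(xml_string):
    """Count whitespace-separated tokens in each speaker's turns."""
    root = ET.fromstring(xml_string)
    counts = Counter()
    for turn in root.iter("turn"):
        # Fall back to a placeholder if a turn carries no speaker attribute.
        speaker = turn.get("speaker", "UNKNOWN")
        # itertext() also collects text nested inside child elements.
        text = "".join(turn.itertext())
        counts[speaker] += len(text.split())
    return counts


counts = tokens_per_speaker(SAMPLE)
for speaker, n in counts.most_common():
    print(speaker, n)
```

For a real file, the same function could be applied to the file's contents once the tag names have been adjusted to match the actual markup.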
Provider:
datasets-CNRS