TaxVectors.zip

Source: DataCite Commons. Updated 2021-02-04; indexed 2025-04-16.
Download link:
https://archive.data.jhu.edu/file.xhtml?persistentId=doi:10.7281/T1/N1X6I4/SZF0HP
Description:
500-dimensional vector representations of tax-law terms and collocations (e.g. “tax year”, which is represented as “tax_year”), derived with the word2vec implementation of Mikolov (2013) using skip-gram with negative sampling. Words occurring fewer than 10 times were discarded; training made 5 iterations through the data; 15 negative samples were used per focus word; words with a unigram probability above 10^-3 were probabilistically discarded (subsampled); only static windows were used. The training data consisted entirely of tax-law documents: the curated tax corpus (PLRs and Tax Court unreported decisions) plus the tax-specific cases in the Federal case.law corpus.
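The settings above can be made concrete with a short Python sketch. It is an illustration only, not the authors' training code: the dictionary keys and both helper names are hypothetical, the discard rule is the subsampling formula from the Mikolov et al. (2013) paper (the released word2vec C code uses a slightly different variant), and joining on whitespace with underscores is an assumption generalized from the single “tax_year” example. The file layout inside TaxVectors.zip is not described, so no loading code is shown.

```python
import math

# Hyperparameters as stated in the description.
# The key names are illustrative, not taken from any particular library.
SETTINGS = {
    "dimensions": 500,          # vector size
    "architecture": "skip-gram",
    "negative_samples": 15,     # per focus word
    "min_count": 10,            # words seen fewer than 10 times are dropped
    "iterations": 5,            # passes over the training data
    "subsample_threshold": 1e-3,
    "static_window": True,      # window size is not randomly shrunk
}


def discard_probability(unigram_prob, threshold=SETTINGS["subsample_threshold"]):
    """Chance of dropping one occurrence of a word during subsampling.

    Uses the formula from the paper: P(discard) = 1 - sqrt(t / f),
    where f is the word's unigram probability and t is the threshold.
    Words at or below the threshold are never dropped.
    """
    if unigram_prob <= threshold:
        return 0.0
    return 1.0 - math.sqrt(threshold / unigram_prob)


def to_vocab_key(term):
    """Turn a collocation such as "tax year" into the key "tax_year".

    Assumption: collocations are joined with underscores exactly as in the
    description's example; casing is left untouched because the
    description does not say whether the vocabulary is lowercased.
    """
    return "_".join(term.split())


if __name__ == "__main__":
    print(to_vocab_key("tax year"))
    for f in (1e-4, 1e-3, 1e-2):
        print(f, round(discard_probability(f), 4))
```

Under this formula a word that makes up 1% of the corpus has roughly a two-in-three chance of being dropped at each occurrence, while any word at or below the 10^-3 threshold is always kept.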
Provider:
Johns Hopkins University Data Archive
Created:
2020-07-23