
Domain-specific Topic Modeling Configurations for Explainable Document-level Translation Evaluation

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/k65v3vxwhb
Resource description:
1. LSA_elbow_points.pdf: Contains per-domain elbow points derived from singular value decay curves (up to 30 components). Used to determine the optimal number of LSA topics for each domain.
2. LSA_full_singular_values.pdf: Provides the complete singular value distributions for all domains in both languages (Korean and English). Shows the variance contribution of each component and supports dimensionality analysis.
3. LSA_max_features.pdf: Shows max-feature sensitivity tests across domains, used to determine the optimal TF-IDF vocabulary size. Compares 500, 1,000, and 10,000 features and their impact on stability.
4. LDA_coherence.pdf: Shows per-domain LDA coherence (c_v) for 1 to 30 topics. A red dashed line marks the topic count k selected as optimal for each domain.
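The elbow-point selection described for LSA_elbow_points.pdf can be illustrated with a minimal sketch. The dataset description does not state which elbow criterion the authors used, so the chord-distance heuristic below is an assumption, as are the function name `elbow_index` and the synthetic values. It picks the component whose singular value lies farthest from the straight line joining the first and last points of the decay curve.

```python
from typing import Sequence


def elbow_index(singular_values: Sequence[float]) -> int:
    """Return the 1-based component count at the elbow of a decay curve.

    Assumed heuristic (not confirmed by the dataset): choose the point
    farthest from the chord joining the first and last points.
    """
    n = len(singular_values)
    if n < 3:
        raise ValueError("need at least three singular values")
    y0 = float(singular_values[0])
    dx = float(n - 1)                        # chord run along the component axis
    dy = float(singular_values[-1]) - y0     # chord rise (negative for a decaying curve)
    best_i, best_d = 0, -1.0
    for i, s in enumerate(singular_values):
        # Unnormalised perpendicular distance to the chord; the constant
        # normalisation factor does not change which point is farthest.
        d = abs(dy * i - dx * (float(s) - y0))
        if d > best_d:
            best_i, best_d = i, d
    return best_i + 1


# Synthetic decay curve for illustration only (not values from the dataset).
example = [10.0, 6.0, 3.0, 2.5, 2.2, 2.0, 1.9, 1.8]
print(elbow_index(example))  # prints 3
```

Because the distance is computed on raw axes, the result depends on the scale of the singular values relative to the component index; normalising both axes to [0, 1] first is a common variant.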
Created:
2026-01-05