Domain-specific Topic Modeling Configurations for Explainable Document-level Translation Evaluation
Source: NIAID Data Ecosystem. Indexed: 2026-05-10
Download link:
https://data.mendeley.com/datasets/k65v3vxwhb
Resource description:
1. LSA_elbow_points.pdf: Contains per-domain elbow points derived from singular value decay curves (up to 30 components). Used to determine the optimal number of LSA topics for each domain.
2. LSA_full_singular_values.pdf: Provides complete singular value distributions for all domains and both languages (Korean/English). Shows variance contribution across components and supports dimensionality analysis.
3. LSA_max_features.pdf: Shows max-feature sensitivity tests across domains to determine the optimal TF-IDF vocabulary size. Includes comparisons of 500/1000/10000 features and their impact on stability.
4. LDA_coherence.pdf: Plots per-domain LDA coherence (c_v) for 1–30 topics. A red dashed line marks the topic count k selected for each domain.
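
The elbow points in LSA_elbow_points.pdf are described as being derived from singular value decay curves, but the listing does not say which elbow rule was used. As an illustration only, the sketch below assumes a common heuristic: pick the point farthest from the straight line joining the first and last singular values. The function name `elbow_index` and the sample values are hypothetical and are not taken from the dataset. In practice the input could be the singular values of a TF-IDF matrix, for example the `singular_values_` attribute of scikit-learn's `TruncatedSVD`.

```python
import math


def elbow_index(values):
    """Return the 0-based index of the point farthest from the chord
    joining the first and last points of a decaying curve.

    This is an assumed heuristic, not necessarily the rule used to
    produce LSA_elbow_points.pdf.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three points to locate an elbow")

    # Chord from (0, values[0]) to (n - 1, values[-1]).
    x1, y1 = 0.0, float(values[0])
    dx = float(n - 1) - x1
    dy = float(values[-1]) - y1
    chord_len = math.hypot(dx, dy)

    best_i, best_d = 0, -1.0
    for i, y in enumerate(values):
        # Perpendicular distance from (i, y) to the chord.
        d = abs(dy * (i - x1) - dx * (y - y1)) / chord_len
        if d > best_d:
            best_i, best_d = i, d
    return best_i


# Hypothetical singular values for one domain (illustrative only).
singular_values = [10.0, 6.0, 3.0, 2.5, 2.2, 2.0, 1.9, 1.8]

# Components are counted from 1, so add 1 to the 0-based index.
k = elbow_index(singular_values) + 1
print(f"selected number of LSA components: {k}")
```

Because the distance is divided by a constant chord length, the selected index does not change if either axis is rescaled, so the singular values do not need to be normalized first.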
Created:
2026-01-05



