Zero-shot Bilingual App Reviews Mining with Large Language Models
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/records/11066414
Description:
Classification
The classification dataset contains 6000 English and 6000 French user reviews from three applications on Google Play (Garmin Connect, Huawei Health, Samsung Health), all labelled manually. We used three labels: problem report, feature request, and irrelevant.
Problem reports show the issues the users have experienced while using the app.
Feature requests reflect users' demands for new functions, new content, a new interface, etc.
Irrelevant reviews are those that do not belong to either of the two aforementioned categories.
The following table gives the number of labelled user reviews per application and language. Each review belongs to one or more categories, so the category counts in a row can sum to more than the total.
| App | Language | Total | Feature request | Problem report | Irrelevant |
|---|---|---|---|---|---|
| Garmin Connect | en | 2000 | 223 | 579 | 1231 |
| Garmin Connect | fr | 2000 | 217 | 772 | 1051 |
| Huawei Health | en | 2000 | 415 | 876 | 764 |
| Huawei Health | fr | 2000 | 387 | 842 | 817 |
| Samsung Health | en | 2000 | 528 | 500 | 990 |
| Samsung Health | fr | 2000 | 496 | 492 | 1047 |
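Because a review may carry more than one label, the three category counts for each app and language add up to more than 2000. The short sketch below makes that arithmetic explicit. The numbers are copied from the table; the names `LABEL_COUNTS` and `extra_label_assignments` are illustrative and not part of the dataset, and the interpretation assumes every review carries at least one label.

```python
# Counts copied from the table above:
# (app, language) -> (total, feature request, problem report, irrelevant)
LABEL_COUNTS = {
    ("Garmin Connect", "en"): (2000, 223, 579, 1231),
    ("Garmin Connect", "fr"): (2000, 217, 772, 1051),
    ("Huawei Health", "en"): (2000, 415, 876, 764),
    ("Huawei Health", "fr"): (2000, 387, 842, 817),
    ("Samsung Health", "en"): (2000, 528, 500, 990),
    ("Samsung Health", "fr"): (2000, 496, 492, 1047),
}


def extra_label_assignments(total, feature_request, problem_report, irrelevant):
    """Return how many label assignments exceed one per review.

    Assumption: every review has at least one label, so any surplus
    comes from reviews that carry more than one label.
    """
    return feature_request + problem_report + irrelevant - total


# Surplus of label assignments for each (app, language) subset.
EXTRA = {key: extra_label_assignments(*row) for key, row in LABEL_COUNTS.items()}

for (app, lang), extra in EXTRA.items():
    print(f"{app} ({lang}): {extra} extra label assignments")
```

A positive surplus in every row is consistent with the statement that each review belongs to one or more categories.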
Clustering
The clustering dataset contains 1200 bilingual labelled user reviews for clustering evaluation. For each of the three applications and each of the two languages in the classification dataset, we randomly selected 100 problem reports and 100 feature requests. We then manually clustered each resulting collection of 200 bilingual reviews (100 English and 100 French), all from the same application and the same category.
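The sampling procedure described above can be sketched as a stratified random draw. This is only an illustration under stated assumptions, not the authors' script: the record layout (dicts with `app`, `lang`, and `labels` keys), the function name `sample_for_clustering`, and the fixed seed are all hypothetical.

```python
import random
from collections import defaultdict

# Only these two categories are clustered; "irrelevant" reviews are left out.
CATEGORIES = ("feature request", "problem report")


def sample_for_clustering(reviews, per_group=100, seed=0):
    """Draw `per_group` reviews per (app, category, language) and merge
    the two languages into one bilingual collection per (app, category).

    `reviews` is assumed to be an iterable of dicts with the hypothetical
    keys "app", "lang" and "labels" (a collection of category names).
    """
    rng = random.Random(seed)

    # Group candidate reviews by application, category and language.
    pools = defaultdict(list)
    for review in reviews:
        for category in CATEGORIES:
            if category in review["labels"]:
                pools[(review["app"], category, review["lang"])].append(review)

    # Sample from each pool and combine languages per (app, category).
    collections = defaultdict(list)
    for app, category, lang in sorted(pools):
        pool = pools[(app, category, lang)]
        if len(pool) < per_group:
            raise ValueError(f"not enough reviews for {app}/{category}/{lang}")
        collections[(app, category)].extend(rng.sample(pool, per_group))
    return dict(collections)
```

With three applications, two categories, and two languages, this yields six collections of 200 reviews each, 1200 in total. Note that in this sketch a review labelled with both categories could be drawn into both collections; whether the original sampling allowed that is not stated.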
| | Garmin Connect | Huawei Health | Samsung Health |
|---|---|---|---|
| #clusters in feature request | 89 | 74 | 69 |
| #clusters (size ≥ 5) in feature request | 7 | 9 | 11 |
| #clusters in problem report | 45 | 44 | 41 |
| #clusters (size ≥ 5) in problem report | 10 | 13 | 12 |
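The two statistics reported per collection, the number of clusters and the number of clusters with at least five reviews, can be computed from a list of cluster assignments as in the sketch below. The function name `cluster_stats` and the toy cluster names are hypothetical; the dataset's actual file format is not described here.

```python
from collections import Counter


def cluster_stats(cluster_ids, min_size=5):
    """Return (number of clusters, number of clusters with size >= min_size).

    `cluster_ids` holds one cluster identifier per review in a collection.
    """
    sizes = Counter(cluster_ids)
    n_large = sum(1 for size in sizes.values() if size >= min_size)
    return len(sizes), n_large


# Toy collection: four clusters of sizes 6, 5, 2 and 1.
toy_assignments = ["sync"] * 6 + ["sleep"] * 5 + ["login"] * 2 + ["widget"]
n_clusters, n_large = cluster_stats(toy_assignments)
print(n_clusters, n_large)  # 4 2
```

Applied to one of the 200-review collections, the first value corresponds to the "#clusters" rows and the second to the "#clusters (size ≥ 5)" rows.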
Created:
2024-05-23



