
MedSeek_userBehavior

Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/2hnjmzpxyd
Description:
The target dataset contains de-identified, high-resolution interaction information from MedSeek, a large-language-model (LLM) platform optimised for medical education. Built on the DeepSeek architecture and fine-tuned with >200 M clinically curated instruction–response pairs, MedSeek achieves state-of-the-art accuracy on multiple medical NLP benchmarks, including MedQA (78.6%), PubMedQA (83.9%), MedMCQA (67.4%), MedBullets (72.1%), MMLU (81.2%), MMLU-Pro (79.5%) and CARE-QA (74.8%).

This dataset captures usage patterns from a medical education large language model (LLM) platform, representing interaction behaviors during Q2 2025. It contains anonymized observational records of platform engagement across diverse medical education contexts.

#### Dataset Components:

1. **User Profiles** (`medical_llm_users.csv`)
   * 1,454 anonymized participant records
   * Role distribution: Educators (2.2%), Medical students (97.8%)
   * Discipline representation: Clinical Medicine (39%), Pharmacy (18%), Public Health (11%), Basic Medicine (26%), Nursing (4%), Medical Humanities (2%)
   * Engagement tiers: High-engagement (15%), Regular (25%), Low-frequency (40%), Dormant (20%)
2. **Session Records** (`medical_llm_sessions.csv`)
   * Platform access sessions with temporal metadata
   * Device access patterns (mobile/desktop/tablet)
   * Duration metrics and temporal distribution
   * Special annotation for examination period (May 10-24, 2025)
3. **Interaction Logs** (`medical_llm_interactions.csv`)
   * Question-Answer exchanges across medical domains
   * Six knowledge domains with topic classifications
   * Interaction types: Initial queries (35%), Follow-ups (25%), Answer review (20%), Clarifications (10%), Content saving (5%), Feedback (5%)
   * Complexity engagement metrics

#### Data Harness:

Data was harnessed through parameterized behavioral modeling based on established medical education frameworks. The process incorporates:

* Professionally validated medical education taxonomies
* Temporal usage distributions reflecting academic calendars
* Device access patterns aligned with mobility studies
* Knowledge domain representations mirroring standard curricula

#### Potential Research Applications:

* Medical education technology adoption studies
* Temporal analysis of learning behaviors
* Domain-specific knowledge retrieval patterns
* Adaptive learning system development
* Educational data mining methodology validation

#### Ethical Compliance:

All identifiers represent anonymized entities. Content follows medical education standards without including real patient information or personally identifiable data. Generated text reflects generalized medical education scenarios without specific case references.
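
#### Example: Flagging Examination-Period Sessions (illustrative)

The description above mentions a special annotation for the examination period (May 10-24, 2025) and reports categorical distributions such as device type. The sketch below shows one way such checks might be reproduced with the Python standard library. The description does not list the actual CSV schema, so the column names `session_id`, `start_time`, and `device`, the ISO 8601 timestamp format, and the sample rows are all assumptions made for illustration only.

```python
import csv
import io
from collections import Counter
from datetime import date, datetime

# Examination period as stated in the dataset description.
EXAM_START = date(2025, 5, 10)
EXAM_END = date(2025, 5, 24)


def in_exam_period(timestamp: str) -> bool:
    """Return True if an ISO 8601 timestamp falls on May 10-24, 2025 (inclusive)."""
    day = datetime.fromisoformat(timestamp).date()
    return EXAM_START <= day <= EXAM_END


def share_by(rows, column):
    """Return {value: percentage of rows} for one column, rounded to 1 decimal."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


# Made-up sample standing in for medical_llm_sessions.csv.
# Column names and values are hypothetical, not taken from the real file.
SAMPLE = """session_id,start_time,device
s1,2025-05-09T23:59:00,mobile
s2,2025-05-10T08:15:00,desktop
s3,2025-05-24T21:30:00,mobile
s4,2025-06-01T10:00:00,tablet
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
exam_rows = [r for r in rows if in_exam_period(r["start_time"])]
print(len(exam_rows), share_by(rows, "device"))
```

To apply this to the real files, replace `io.StringIO(SAMPLE)` with an opened `medical_llm_sessions.csv` and adjust the column names to match the published header row.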
Created:
2025-08-05