MedSeek_userBehavior
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/2hnjmzpxyd
Description:
This dataset contains de-identified, high-resolution interaction records from MedSeek, a large language model (LLM) platform optimised for medical education. Built on the DeepSeek architecture and fine-tuned with more than 200 million clinically curated instruction–response pairs, MedSeek achieves state-of-the-art accuracy on multiple medical NLP benchmarks: MedQA (78.6%), PubMedQA (83.9%), MedMCQA (67.4%), MedBullets (72.1%), MMLU (81.2%), MMLU-Pro (79.5%) and CARE-QA (74.8%). The records capture platform usage during Q2 2025 and consist of anonymized observational data on engagement across diverse medical education contexts.
#### Dataset Components:
1. **User Profiles** (`medical_llm_users.csv`)
* 1,454 anonymized participant records
* Role distribution: Educators (2.2%), Medical students (97.8%)
* Discipline representation: Clinical Medicine (39%), Pharmacy (18%), Public Health (11%), Basic Medicine (26%), Nursing (4%), Medical Humanities (2%)
* Engagement tiers: High-engagement (15%), Regular (25%), Low-frequency (40%), Dormant (20%)
2. **Session Records** (`medical_llm_sessions.csv`)
* Platform access sessions with temporal metadata
* Device access patterns (mobile/desktop/tablet)
* Duration metrics and temporal distribution
* Special annotation for examination period (May 10-24, 2025)
3. **Interaction Logs** (`medical_llm_interactions.csv`)
* Question-Answer exchanges across medical domains
* Six knowledge domains with topic classifications
* Interaction types: Initial queries (35%), Follow-ups (25%), Answer review (20%), Clarifications (10%), Content saving (5%), Feedback (5%)
* Complexity engagement metrics
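
As a quick sanity check on the figures above, the stated shares can be converted into approximate head counts. This is a minimal sketch that uses only the percentages and the 1,454-record total quoted in this description. The dictionary and function names are illustrative, and the rounded counts are estimates, not values read from `medical_llm_users.csv`.

```python
# Shares quoted in the dataset description (percent of 1,454 user records).
TOTAL_USERS = 1454

ROLE_SHARES = {"Educators": 2.2, "Medical students": 97.8}
DISCIPLINE_SHARES = {
    "Clinical Medicine": 39,
    "Pharmacy": 18,
    "Public Health": 11,
    "Basic Medicine": 26,
    "Nursing": 4,
    "Medical Humanities": 2,
}
ENGAGEMENT_SHARES = {
    "High-engagement": 15,
    "Regular": 25,
    "Low-frequency": 40,
    "Dormant": 20,
}


def shares_sum_to_100(shares, tol=1e-6):
    """Check that a set of percentage shares adds up to 100."""
    return abs(sum(shares.values()) - 100) <= tol


def approximate_counts(shares, total=TOTAL_USERS):
    """Convert percentage shares into rounded head counts (estimates only)."""
    return {label: round(total * pct / 100) for label, pct in shares.items()}


if __name__ == "__main__":
    for name, shares in [
        ("roles", ROLE_SHARES),
        ("disciplines", DISCIPLINE_SHARES),
        ("engagement", ENGAGEMENT_SHARES),
    ]:
        print(name, shares_sum_to_100(shares), approximate_counts(shares))
```

Because each count is rounded independently, the estimates within a group need not add up to exactly 1,454.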
#### Data Generation:
The data were produced through parameterized behavioral modeling based on established medical education frameworks. The process incorporates:
* Professionally validated medical education taxonomies
* Temporal usage distributions reflecting academic calendars
* Device access patterns aligned with mobility studies
* Knowledge domain representations mirroring standard curricula
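
The source does not publish its generator, so the following is only an illustration of what parameterized sampling of this kind could look like, not the authors' method. It draws interaction types using the shares quoted under Interaction Logs as weights; the label strings, function name, and seed are assumptions made for the example.

```python
import random
from collections import Counter

# Weights taken from the interaction-type shares quoted in the description.
INTERACTION_WEIGHTS = {
    "Initial query": 35,
    "Follow-up": 25,
    "Answer review": 20,
    "Clarification": 10,
    "Content saving": 5,
    "Feedback": 5,
}


def sample_interaction_types(n, seed=0):
    """Draw n interaction-type labels with the configured weights.

    A seeded random.Random instance keeps the draw reproducible.
    """
    rng = random.Random(seed)
    labels = list(INTERACTION_WEIGHTS)
    weights = list(INTERACTION_WEIGHTS.values())
    return rng.choices(labels, weights=weights, k=n)


if __name__ == "__main__":
    draws = sample_interaction_types(10_000)
    for label, count in Counter(draws).most_common():
        print(f"{label}: {count / len(draws):.1%}")
```

With a large enough sample, the empirical shares should land close to the configured weights, which is one simple way to check a generated file against the stated distribution.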
#### Potential Research Applications:
* Medical education technology adoption studies
* Temporal analysis of learning behaviors
* Domain-specific knowledge retrieval patterns
* Adaptive learning system development
* Educational data mining methodology validation
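
For temporal analyses, a natural first step is to separate sessions that fall inside the annotated examination period (May 10-24, 2025) from the rest. The column names of `medical_llm_sessions.csv` are not listed in this description, so the sketch below works on plain timestamps rather than on a named column. Treating both boundary dates as inclusive is also an assumption.

```python
from datetime import date, datetime

# Examination period as annotated in the dataset description.
EXAM_START = date(2025, 5, 10)
EXAM_END = date(2025, 5, 24)


def in_exam_period(ts):
    """Return True if ts falls within the examination period (inclusive).

    ts may be a datetime or an ISO-8601 string such as "2025-05-12T09:30:00".
    """
    if isinstance(ts, str):
        ts = datetime.fromisoformat(ts)
    return EXAM_START <= ts.date() <= EXAM_END


def split_by_exam_period(timestamps):
    """Partition timestamps into (exam_period, other) lists, preserving order."""
    exam, other = [], []
    for ts in timestamps:
        (exam if in_exam_period(ts) else other).append(ts)
    return exam, other


if __name__ == "__main__":
    sample = ["2025-04-20T14:00:00", "2025-05-12T09:30:00", "2025-06-01T08:15:00"]
    exam, other = split_by_exam_period(sample)
    print("exam period:", exam)
    print("other:", other)
```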
#### Ethical Compliance:
All identifiers represent anonymized entities. Content follows medical education standards without including real patient information or personally identifiable data. Generated text reflects generalized medical education scenarios without specific case references.
Created:
2025-08-05



