Paper

QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training

Clinical decision-making routinely demands reasoning over heterogeneous data, yet existing multimodal language models (MLLMs) remain largely vision-centric and fail to generalize across clinical specialties. To bridge this gap, we introduce QoQ-Med-7B/32B, the first open generalist clinical foundation model that jointly reasons across medical images, time-series signals, and text reports. QoQ-Med is trained with Domain-aware Relative Policy Optimization (DRPO), a novel reinforcement-learning objective that hierarchically scales normalized rewards according to domain rarity and modality difficulty, mitigating performance imbalance caused by skewed clinical data distributions. Trained on 2.61 million instruction tuning pairs spanning 9 clinical domains, we show that DRPO training boosts diagnostic performance by 43% in macro-F1 on average across all visual domains as compared to other critic-free training methods like GRPO. Furthermore, with QoQ-Med trained on intensive segmentation data, it is able to highlight salient regions related to the diagnosis, with an IoU 10x higher than open models while reaching the performance of OpenAI o4-mini.
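For pre-reading, it may help to recall what the GRPO baseline mentioned in the abstract does: it normalizes each sampled response's reward against the other responses drawn for the same prompt. The sketch below shows that group-relative normalization, followed by a purely illustrative domain-rarity weight. The weighting is a hypothetical inverse-frequency rule, not the paper's DRPO formula; it does not model modality difficulty, and the function names, domain labels, and counts are all assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style normalization within one group of responses to a single prompt.

    Uses the population standard deviation (an assumption; implementations differ).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def domain_scaled_advantages(rewards, domain, domain_counts, eps=1e-8):
    """Hypothetical rarity weighting, NOT the DRPO objective from the paper.

    Scales the normalized advantages by an inverse-frequency weight that equals
    1.0 when every domain has the same number of training examples.
    """
    total = sum(domain_counts.values())
    n_domains = len(domain_counts)
    weight = total / (n_domains * domain_counts[domain])
    return [weight * a for a in group_relative_advantages(rewards, eps)]


# Made-up counts: one common domain and one rare domain.
counts = {"chest_xray": 900, "ecg": 100}
rewards = [1.0, 0.0, 1.0, 0.0]  # e.g. correct / incorrect diagnoses in one group

plain = group_relative_advantages(rewards)
rare = domain_scaled_advantages(rewards, "ecg", counts)
common = domain_scaled_advantages(rewards, "chest_xray", counts)
```

With these made-up counts, the same group of rewards contributes a larger learning signal when it comes from the rare domain than from the common one, which is the general intuition behind rarity-aware scaling. How the paper actually scales rewards is a good question for the discussion.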

Download:

Paper accepted to NeurIPS 2025

Format

Paper shared 1 week ahead for pre‑reading
10–15 min summary by a volunteer (no slides — we’ll scroll through the PDF together)
Open discussion on methodology, strengths/weaknesses, clinical relevance, and future directions
Papers chosen collaboratively to balance foundational ML research and cutting‑edge medical AI
Who should attend? Researchers, clinicians, students — anyone curious about ML/AI in medicine.
How to join? Simply show up (no registration required). Looking forward to seeing you there!

Organizers

Questions or paper suggestions? Contact us via email.

ML/AI Journal Club
